Computer Science, 2026, Vol. 53, Issue (5): 276-285. doi: 10.11896/jsjkx.250400141

• Artificial Intelligence •

Boosting Generative Rule Extraction via Negative-aware Approach

JI Wendi1,2, WANG Yongquan1,2, SHEN Yicheng1   

  1 School of Law and Criminal Justice, East China University of Political Science and Law, Shanghai 201620, China
  2 Department of Intelligent Science and Information Law, East China University of Political Science and Law, Shanghai 201620, China
  • Received: 2025-04-28 Revised: 2025-07-27 Published: 2026-05-08
  • About author: JI Wendi, born in 1988, Ph.D., lecturer, is a member of CCF (No. D8438M). Her main research interests include natural language processing, information retrieval and computational law.
    WANG Yongquan, born in 1964, Ph.D., professor, Ph.D. supervisor. His main research interests include big data and artificial intelligence, cyberspace security and cybercrime, and digital forensics.
  • Supported by:
    National Key Research and Development Program of China (2023YFC3306100, 2023YFC3306103, 2023YFC3306105).

Abstract: Legal rules are behavioral norms formulated by competent authorities that carry binding legal force, and they are essential to maintaining social order. As a prerequisite for legal AI, numerous studies have attempted to convert natural-language legal texts into machine-readable rule sets, but the results remain unsatisfactory. To address the challenges of formally representing and extracting legal rules, this paper proposes a systematic legal rule extraction paradigm and introduces a negative-aware approach to boosting generative rule extraction with large language models (LLMs). The paradigm defines a legal rule schema that decomposes a legal rule into a tetrad of subject, object, conditions and consequences, thus clarifying the rule’s applicability, targets and effects. Building on this schema, the paper proposes a method for enhancing LLM-based generative legal rule extraction. The method incorporates the idea of “learning from errors” by constructing a negative-aware training framework that improves the model’s ability to recognize hard negative cases and mitigates hallucination in generative rule extraction. Experimental results show that the rule extraction model based on Mistral-Small-24B (a mid-size LLM) outperforms the general-purpose LLM DeepSeek-R1 by 18.23% and even surpasses human annotation performance by 1.5%, demonstrating that the negative-aware training framework significantly enhances the model’s rule extraction capability.

Key words: Rule extraction, Negative-aware training, Large language model, Corpus construction, Supervised fine-tuning

CLC Number: TP391
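
The tetrad schema named in the abstract (subject, object, conditions, consequences) can be pictured as a small record type. The following is only a minimal sketch under stated assumptions: the class name `LegalRule`, the field name `object_`, the list-of-strings representation, the JSON serialization and the sample rule are all hypothetical illustrations, not the schema format or data used by the authors.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class LegalRule:
    """Hypothetical container for one extracted rule (the four-part tetrad)."""

    subject: str   # who the rule applies to
    object_: str   # what the rule acts on; trailing underscore avoids the builtin name
    conditions: list[str] = field(default_factory=list)    # when the rule applies
    consequences: list[str] = field(default_factory=list)  # legal effects that follow

    def to_json(self) -> str:
        # Serialize to a machine-readable form; ensure_ascii=False keeps non-ASCII text intact.
        return json.dumps(asdict(self), ensure_ascii=False)


# Invented sample rule, for illustration only (not taken from the paper's corpus).
rule = LegalRule(
    subject="network operator",
    object_="personal information",
    conditions=["collects data without user consent"],
    consequences=["ordered to rectify", "fined"],
)
print(rule.to_json())
```

Keeping conditions and consequences as lists reflects the plural wording in the abstract; whether the authors allow several entries per field is an assumption here.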