Computer Science (计算机科学) ›› 2021, Vol. 48 ›› Issue (12): 337-342. doi: 10.11896/jsjkx.201100212

• Artificial Intelligence •

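A Cause-of-Action Prediction Method Based on Hierarchical Learning on Unbalanced Data

QU Hao1, CUI Chao-ran2, WANG Xiao-xiao2, SU Ya-xi2, HAN Xiao-hui3, YIN Yi-long1

  1 School of Software, Shandong University, Jinan 250101
  2 College of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250014
  3 Shandong Computing Center (National Supercomputing Jinan Center), Jinan 250014
  • Received: 2020-11-28 Revised: 2021-04-14 Online: 2021-12-15 Published: 2021-11-26
  • Corresponding author: YIN Yi-long (ylyin@sdu.edu.cn)
  • About author: quhao_mla@163.com
  • Supported by:
    National Key R&D Program of China (2018YFC0830100, 2018YFC0830102)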

Hierarchical Learning on Unbalanced Data for Predicting Cause of Action

QU Hao1, CUI Chao-ran2, WANG Xiao-xiao2, SU Ya-xi2, HAN Xiao-hui3, YIN Yi-long1   

  1 School of Software, Shandong University, Jinan 250101, China
  2 College of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250014, China
  3 Shandong Computing Center (National Supercomputing Jinan Center), Jinan 250014, China
  • Received: 2020-11-28 Revised: 2021-04-14 Online: 2021-12-15 Published: 2021-11-26
  • About author: QU Hao, born in 1997, postgraduate. His main research interests include machine learning and natural language processing.
    YIN Yi-long, born in 1972, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include machine learning, data mining, and biometrics.
  • Supported by:
    National Key R&D Program of China (2018YFC0830100, 2018YFC0830102).

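Abstract (Chinese version): The cause of action describes the nature of the legal relationships involved in a case. A scientific and well-developed system of causes of action helps ensure that the law is applied correctly, and it is an important means by which the people's courts manage cases by category. Cause of action prediction refers to having a computer automatically determine the category of a case from the textual description of its facts. In research on predicting case attributes, low-frequency categories have few samples and their features are hard to learn, so existing methods usually remove low-frequency samples during data processing. In cause of action prediction, however, the key challenge is precisely how to predict accurately for cases that belong to low-frequency causes. To this end, this paper proposes a cause of action prediction method based on hierarchical learning on unbalanced data. Following the hierarchy of causes, the causes are divided into first-level and second-level causes, and the many tail categories among the second-level causes are aggregated into upper-level classes with more samples. Second-level causes are then predicted through hierarchical learning, so that their prediction is supported by first-level cause information. Finally, a loss function that adjusts for data imbalance is introduced to carry out the prediction. Experimental results show that the proposed method outperforms the compared methods overall, improving average precision by 4.81% over existing methods, which indicates that hierarchical learning combined with a loss function for unbalanced data can address the cause of action prediction problem well.

Key words (Chinese version): Cause of action prediction, Hierarchical learning, Unbalanced data, Loss function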

Abstract: The cause of action represents the nature of the legal relationships involved in a case. A scientific and rational choice of the cause of action facilitates the correct application of laws and enables the courts to manage cases by category. Cause of action prediction aims to enable computers to automatically predict the cause category from the textual description of a case. Because low-frequency categories have few samples and their features are difficult to learn, previous methods usually filter out the samples of low-frequency categories during data preprocessing. However, in cause of action prediction, the key challenge is precisely how to make accurate predictions for cases with low-frequency cause categories. To solve this problem, we propose a novel hierarchical learning method based on unbalanced samples for predicting the cause of action. First, all causes are divided into first-level and second-level causes according to their inherent hierarchical structure. Then, the tail categories among the second-level causes are merged into a new first-level category with sufficient samples, and hierarchical learning is applied to predict the cause of action. Finally, we refine the loss function to alleviate the problem of data imbalance. Experimental results show that the proposed method is significantly superior to the baseline methods, improving accuracy by 4.81%. We also verify the benefits of introducing hierarchical learning and of refining the loss function for unbalanced data.

Key words: Cause of action prediction, Hierarchical learning, Loss function, Unbalanced data
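
The abstract does not give the model architecture or the exact loss function, so the following Python sketch only illustrates the general idea under stated assumptions. The toy hierarchy (`PARENT`), the sample counts (`COUNTS`), the cause names, and all function names are hypothetical. Inverse-frequency weighting stands in for the paper's imbalance-aware loss and is not claimed to be the authors' choice. The sketch shows how tail second-level causes pool into larger first-level classes, how a second-level probability can be factored through its first-level parent, and how a weighted loss penalizes errors on tail classes more heavily.

```python
import math
from collections import defaultdict

# Hypothetical toy hierarchy: second-level cause -> first-level cause.
PARENT = {
    "private lending": "contract",
    "sales contract": "contract",
    "franchise contract": "contract",  # tail class
    "divorce": "family",
    "adoption": "family",              # tail class
}
# Hypothetical long-tailed training counts per second-level cause.
COUNTS = {"private lending": 900, "sales contract": 600,
          "franchise contract": 5, "divorce": 480, "adoption": 15}


def first_level_counts(counts, parent):
    """Pool second-level sample counts into their first-level parents."""
    pooled = defaultdict(int)
    for cause, n in counts.items():
        pooled[parent[cause]] += n
    return dict(pooled)


def softmax(scores):
    """Numerically stable softmax over a dict of label -> score."""
    m = max(scores.values())
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    z = sum(exps.values())
    return {k: e / z for k, e in exps.items()}


def hierarchical_probs(first_scores, second_scores, parent):
    """P(second) = P(first) * P(second | first), softmax among siblings."""
    p_first = softmax(first_scores)
    probs = {}
    for top, p_top in p_first.items():
        siblings = {c: s for c, s in second_scores.items() if parent[c] == top}
        for cause, p in softmax(siblings).items():
            probs[cause] = p_top * p
    return probs


def balanced_nll(probs, target, counts):
    """Negative log-likelihood scaled by inverse class frequency (assumed loss)."""
    mean_count = sum(counts.values()) / len(counts)
    weight = mean_count / counts[target]
    return -weight * math.log(probs[target])
```

With these toy counts, the two tail causes have 5 and 15 samples, while their first-level parents pool 1505 and 495 samples, which is the sense in which the first level offers better-supported signal for the second-level prediction.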

CLC Number:

  • TP391