Computer Science, 2021, Vol. 48, Issue (12): 337-342. doi: 10.11896/jsjkx.201100212

• Artificial Intelligence •

Hierarchical Learning on Unbalanced Data for Predicting Cause of Action

QU Hao1, CUI Chao-ran2, WANG Xiao-xiao2, SU Ya-xi2, HAN Xiao-hui3, YIN Yi-long1

  1 School of Software, Shandong University, Jinan 250101, China
  2 College of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250014, China
  3 Shandong Computing Center (National Supercomputing Jinan Center), Jinan 250014, China
  • Received: 2020-11-28 Revised: 2021-04-14 Online: 2021-12-15 Published: 2021-11-26
  • Corresponding author: YIN Yi-long (ylyin@sdu.edu.cn)
  • About author: QU Hao (quhao_mla@163.com), born in 1997, postgraduate. His main research interests include machine learning and natural language processing.
    YIN Yi-long, born in 1972, Ph.D., professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include machine learning, data mining, and biometrics.
  • Supported by:
    National Key R&D Program of China (2018YFC0830100, 2018YFC0830102).

Abstract: The cause of action describes the nature of the legal relationships involved in a case. A scientific and well-designed system of causes of action facilitates the correct application of the law and is an important means by which the people's courts manage cases by category. Cause-of-action prediction refers to having a computer automatically assign a case to its category based on the textual description of the case facts. In research on predicting case attributes, low-frequency categories have few samples and their features are hard to learn, so existing methods usually filter out the samples of low-frequency categories during data preprocessing. In cause-of-action prediction, however, the key challenge is precisely how to make accurate predictions for cases that belong to low-frequency causes. To address this, this paper proposes a cause-of-action prediction method based on hierarchical learning on unbalanced data. First, causes are divided into first-level and second-level causes according to their inherent hierarchy, so that the many tail categories among the second-level causes are aggregated into first-level categories with more samples. Second-level causes are then predicted through hierarchical learning, so that each second-level prediction is supported by first-level information. Finally, a loss function that adjusts for data imbalance is introduced. Experimental results show that the proposed method outperforms the baseline methods overall, improving average precision by 4.81% over existing methods, which indicates that hierarchical learning combined with a loss function for unbalanced data handles cause-of-action prediction well.

Key words: Cause of action prediction, Unbalanced data, Hierarchical learning, Loss function

CLC Number: TP391
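
The abstract describes three steps: pooling tail second-level causes into larger first-level classes, predicting second-level causes with first-level support, and using a loss that compensates for class imbalance. The sketch below illustrates those ideas in plain Python under several assumptions. The cause names in `PARENT`, the function names, and the parameter `tau` are hypothetical. The abstract does not say which loss the paper uses, so the logit-adjusted cross-entropy shown here is a stand-in for one published way of handling long-tailed labels. The product rule in `hierarchical_probs` is likewise one simple way to let first-level information support a second-level prediction, not necessarily the paper's architecture.

```python
import math
from collections import Counter

# Hypothetical two-level hierarchy: second-level cause -> first-level cause.
PARENT = {
    "sales contract": "contract disputes",
    "lease contract": "contract disputes",
    "loan contract": "contract disputes",
    "divorce": "marriage and family disputes",
    "child support": "marriage and family disputes",
}


def aggregate_counts(second_counts, parent):
    """Pool second-level sample counts into their first-level classes."""
    first_counts = Counter()
    for cause, n in second_counts.items():
        first_counts[parent[cause]] += n
    return dict(first_counts)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_probs(p_first, p_second_given_first, parent):
    """Score each second-level cause as P(first) * P(second | first)."""
    return {
        cause: p_first[parent[cause]] * p_second_given_first[cause]
        for cause in parent
    }


def logit_adjusted_loss(logits, target, priors, tau=1.0):
    """Cross-entropy on logits shifted by tau * log(class prior).

    Assumed stand-in for the paper's imbalance-aware loss: with equal raw
    logits, a rare target class incurs a larger loss than a frequent one.
    """
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, priors)]
    return -math.log(softmax(adjusted)[target])
```

With made-up counts such as 900, 8, and 2 for the three contract causes, `aggregate_counts` returns 910 samples for the pooled first-level class, which shows how tail categories gain a better-populated parent. Calling `logit_adjusted_loss([0.0, 0.0], 1, [0.9, 0.1])` gives a larger value than the same call with target 0, so training pushes the model to separate the rare class with a wider margin.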