Computer Science, 2021, Vol. 48, Issue (12): 337-342. doi: 10.11896/jsjkx.201100212

• Artificial Intelligence •

Hierarchical Learning on Unbalanced Data for Predicting Cause of Action

QU Hao1, CUI Chao-ran2, WANG Xiao-xiao2, SU Ya-xi2, HAN Xiao-hui3, YIN Yi-long1   

  1 School of Software, Shandong University, Jinan 250101, China
  2 College of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250014, China
  3 Shandong Computing Center (National Supercomputing Jinan Center), Jinan 250014, China
  • Received: 2020-11-28 Revised: 2021-04-14 Online: 2021-12-15 Published: 2021-11-26
  • About author: QU Hao, born in 1997, postgraduate. His main research interests include machine learning and natural language processing.
    YIN Yi-long, born in 1972, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include machine learning, data mining, and biometrics.
  • Supported by:
    National Key R&D Plan Project (2018YFC0830100, 2018YFC0830102).

Abstract: The cause of action represents the nature of the legal relationships involved in a case. A scientific and rational choice of the cause of action facilitates the correct application of laws and enables courts to manage cases by category. Cause of action prediction aims to enable computers to automatically predict the cause category from the textual case description. Because low-frequency categories have few samples and effective features are hard to learn from them, previous methods usually filter out the samples of low-frequency categories during data preprocessing. However, the key challenge in predicting the cause of action is precisely how to make accurate predictions for cases with low-frequency cause categories. To solve this problem, we propose a novel hierarchical learning method based on unbalanced samples for predicting the cause of action. Firstly, all causes are divided into first-level and second-level causes according to their inherent hierarchical structure. Then, the tail (low-frequency) second-level causes are merged into a new first-level category with sufficient samples, and hierarchical learning is applied to predict the cause of action. Finally, we refine the loss function to alleviate the problem of data imbalance. Experimental results show that the proposed method significantly outperforms the baseline methods, leading to an improvement of 4.81% in terms of accuracy. We also verify the benefits of introducing hierarchical learning and of refining the loss function for unbalanced data.
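
The label restructuring described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold `min_count`, the category name `merged_tail`, the function names, and the toy cause labels are all assumptions introduced here for illustration. The sketch only shows how tail second-level causes might be regrouped under one new first-level category, and how a prediction could then be routed through the two levels.

```python
from collections import Counter

# Hypothetical name for the new first-level category that collects tail causes.
TAIL_PARENT = "merged_tail"


def build_first_level_map(labels, parent_of, min_count):
    """Map each second-level cause to the first-level category used for training.

    labels    -- second-level cause label of every training case
    parent_of -- dict: second-level cause -> its original first-level cause
    min_count -- assumed threshold; causes with fewer samples count as tail
    """
    counts = Counter(labels)
    first_level = {}
    for cause, parent in parent_of.items():
        # Tail causes are regrouped so that the merged category has enough samples.
        first_level[cause] = TAIL_PARENT if counts[cause] < min_count else parent
    return first_level


def predict(case, first_level_model, second_level_models):
    """Two-stage prediction: pick a first-level category, then a cause within it."""
    category = first_level_model(case)
    return second_level_models[category](case)


# Toy data (hypothetical causes and counts).
labels = ["loan"] * 5 + ["sale"] * 4 + ["lease"] + ["custody"]
parent_of = {"loan": "contract", "sale": "contract",
             "lease": "contract", "custody": "family"}
first_level = build_first_level_map(labels, parent_of, min_count=3)
```

With these toy counts, `lease` and `custody` fall below the assumed threshold and are routed to `merged_tail`, while `loan` and `sale` keep their original first-level cause. The classifiers passed to `predict` are placeholders for whatever text models are trained at each level.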

Key words: Cause of action prediction, Hierarchical learning, Loss function, Unbalanced data

CLC Number: TP391