Computer Science, 2024, Vol. 51, Issue (8): 45-55. doi: 10.11896/jsjkx.230900107

• Database & Big Data & Data Science •

Interpretable Credit Evaluation Model for Delayed Label Scenarios

XIN Bo, DING Zhijun   

  1. Key Laboratory of Embedded System and Service Computing of Ministry of Education (Tongji University), Shanghai 201804, China
    Shanghai Network Finance Security Collaborative Innovation Center (Tongji University), Shanghai 201804, China
  • Received: 2023-09-19  Revised: 2023-12-13  Online: 2024-08-15  Published: 2024-08-13
  • About author: XIN Bo, born in 2000, postgraduate. His main research interests include credit evaluation and machine learning.
    DING Zhijun, born in 1974, Ph.D., Professor, Ph.D. supervisor, is a senior member of CCF (No. 14797S). His main research interests include intelligent software engineering, cloud computing and services, big data credit reporting, and financial risk control.

Abstract: As the economy develops, credit business plays an increasingly important role in finance, and machine learning has become the mainstream approach to credit evaluation. Several problems remain open, however: delayed labels leave too little labeled data and cause the model to lag behind the data, and dynamic credit evaluation models lack interpretability. To address these problems, this paper proposes an interpretable credit evaluation model for delayed label scenarios. The model builds on the dynamic model tree and extends it with a weighting enhancement. It combines a delayed label update algorithm with a pseudo-label selection strategy based on adaptive thresholds, treating delayed label data as both feedback data and pseudo-label data, which mitigates the effects of insufficient labeled data and model lag. The resulting model also remains interpretable. Experiments on synthetic and real credit evaluation datasets show that it achieves a better balance between predictive performance and interpretability than other mainstream algorithms.

Key words: Credit evaluation, Delayed label, Interpretability, Dynamic model tree, Pseudo-label selection

CLC Number: TP3-05
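
The abstract mentions a pseudo-label selection strategy with adaptive thresholds but does not give its details. The following is a minimal Python sketch of the general idea only, not the authors' algorithm: a prediction is kept as a pseudo-label when its confidence reaches a threshold that tracks the mean confidence of recent predictions. The class name `AdaptiveThresholdSelector`, the parameters `floor`, `ceiling`, and `window`, and the running-mean rule are all assumptions introduced for illustration.

```python
from collections import deque


class AdaptiveThresholdSelector:
    """Hypothetical pseudo-label selector (illustrative, not the paper's method).

    The threshold is the mean confidence over a sliding window of recent
    predictions, clamped to the interval [floor, ceiling].
    """

    def __init__(self, floor=0.6, ceiling=0.9, window=100):
        self.floor = floor
        self.ceiling = ceiling
        self.recent = deque(maxlen=window)  # recent prediction confidences

    def threshold(self):
        # With no history yet, start from the strictest threshold.
        if not self.recent:
            return self.ceiling
        mean_conf = sum(self.recent) / len(self.recent)
        return min(self.ceiling, max(self.floor, mean_conf))

    def select(self, probs):
        """Return the predicted class index if it qualifies as a pseudo-label,
        otherwise None. `probs` is a sequence of class probabilities."""
        confidence = max(probs)
        label = list(probs).index(confidence)
        tau = self.threshold()          # threshold before seeing this sample
        self.recent.append(confidence)  # update history for later samples
        return label if confidence >= tau else None


if __name__ == "__main__":
    selector = AdaptiveThresholdSelector(window=3)
    for probs in ([0.95, 0.05], [0.3, 0.7], [0.2, 0.8], [0.15, 0.85]):
        print(probs, "->", selector.select(probs))
```

In a delayed label setting, samples accepted by such a selector could be used for training immediately and later revisited once the true label arrives; how the paper couples this with its delayed label update algorithm is not stated in the abstract.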