Computer Science (计算机科学), 2019, Vol. 46, Issue (10): 7-13. doi: 10.11896/jsjkx.181102216
YANG De-jie (杨德杰)1, ZHANG Ning (章宁)1, YUAN Ji (袁戟)2, BAI Lu (白璐)1
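Abstract: Personal credit has long been the most important factor banks use to gauge an individual's default risk. In recent years, as demand for lending in China has grown steadily, the traditional approach to personal credit assessment, which relies solely on credit card information, can no longer fully meet the needs of the banking industry. To build a richer credit profile of each user, this paper extracts credit risk assessment features from bank big data. To address the curse of dimensionality and the noise that come with financial big data, the paper takes the correlation among data features into account and improves the stacked denoising autoencoder neural network model by introducing a truncated Karhunen-Loève expansion as the injected noise term. A series of data experiments was carried out on the big data platform of a commercial bank. The results show that, compared with using credit card information alone, using bank big data raises the K-S value, a measure of the separation between positive and negative samples, by about 11%. The improved stacked denoising autoencoder method also assesses risk better, with accuracy about 3% higher than the original model, which confirms that credit risk assessment in a bank big data environment is effective.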
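The abstract names two quantities without spelling them out: the truncated Karhunen-Loève noise term and the K-S value. The following is a minimal sketch of both, not the authors' implementation. It assumes the noise is Gaussian and built from the leading eigenpairs of the empirical feature covariance, so that it inherits the correlation among features. The function names and the `n_terms` and `scale` parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def truncated_kl_noise(X, n_terms, scale=0.1, rng=None):
    """Sample zero-mean correlated noise with the same shape as X.

    Assumption: the noise covariance is approximated by the n_terms
    largest eigenpairs of the empirical feature covariance of X.
    """
    rng = np.random.default_rng(rng)
    cov = np.cov(X, rowvar=False)                 # (d, d) feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_terms]   # indices of the largest n_terms
    lam = np.clip(eigvals[order], 0.0, None)      # guard against tiny negatives
    phi = eigvecs[:, order]                       # (d, n_terms) eigenvectors
    xi = rng.standard_normal((X.shape[0], n_terms))  # independent N(0, 1) coefficients
    # Discrete truncated KL expansion: sum_i sqrt(lambda_i) * xi_i * phi_i
    return scale * (xi * np.sqrt(lam)) @ phi.T


def ks_statistic(scores, labels):
    """Largest gap between the empirical score CDFs of the two classes."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    thresholds = np.unique(scores)
    pos = np.sort(scores[labels])
    neg = np.sort(scores[~labels])
    cdf_pos = np.searchsorted(pos, thresholds, side="right") / pos.size
    cdf_neg = np.searchsorted(neg, thresholds, side="right") / neg.size
    return float(np.max(np.abs(cdf_pos - cdf_neg)))
```

In a denoising autoencoder, such noise would be added to the input (`X + truncated_kl_noise(X, n_terms=k)`) while the network is trained to reconstruct the clean `X`. How the paper chooses the truncation order and the noise scale is not stated in the abstract.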