Computer Science ›› 2016, Vol. 43 ›› Issue (6): 208-213. doi: 10.11896/j.issn.1002-137X.2016.06.042
杨晓峰,严建峰,刘晓升,杨璐
YANG Xiao-feng, YAN Jian-feng, LIU Xiao-sheng and YANG Lu
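Abstract: In the telecom operator industry, churn prediction models are the main tool decision makers use to identify potential churners, i.e., subscribers who are about to stop using the operator's services. Existing churn prediction models are built on shallow machine learning algorithms such as logistic regression, decision trees, neural networks, and random forests, but in the big-data setting these shallow algorithms struggle to deliver further gains in prediction accuracy. This paper therefore proposes a new deep-structured model, the deep random forest, which stacks conventional shallow random forests into a deep structure to obtain higher prediction accuracy. Extensive experiments on real operator data show that the deep random forest outperforms conventional shallow machine learning algorithms on churn prediction. Moreover, increasing the amount of training data further improves the deep random forest's predictive ability, which demonstrates the potential of deep models in big-data environments.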
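The abstract does not say how the shallow forests are stacked, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that each layer is a small random forest and that the layer's predicted churn probability is appended as an extra feature for the next layer. To stay short and dependency-free, each tree is a depth-one tree (a stump) fit on a bootstrap sample with one randomly chosen feature. All names (`fit_stump`, `fit_deep_forest`, `predict_proba`) and the toy data are hypothetical.

```python
import random

def fit_stump(X, y, rng):
    """Fit a depth-one tree on a bootstrap sample, splitting on one random feature."""
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample (with replacement)
    f = rng.randrange(len(X[0]))                 # random feature choice
    best = None
    for t in sorted({X[i][f] for i in idx}):     # candidate thresholds
        left = [y[i] for i in idx if X[i][f] <= t]
        right = [y[i] for i in idx if X[i][f] > t]
        p_left = sum(left) / len(left)           # left is never empty: t comes from the sample
        p_right = sum(right) / len(right) if right else p_left
        err = (sum((v - p_left) ** 2 for v in left)
               + sum((v - p_right) ** 2 for v in right))
        if best is None or err < best[0]:
            best = (err, f, t, p_left, p_right)
    return best[1:]                              # (feature, threshold, p_left, p_right)

def forest_proba(forest, x):
    """Average the leaf probabilities of all stumps in one forest."""
    return sum(pl if x[f] <= t else pr for f, t, pl, pr in forest) / len(forest)

def fit_deep_forest(X, y, n_layers=3, n_trees=15, seed=0):
    """Assumed stacking scheme: each layer sees the original features plus
    the probabilities predicted by all previous layers."""
    rng = random.Random(seed)
    layers = []
    feats = [list(row) for row in X]
    for _ in range(n_layers):
        forest = [fit_stump(feats, y, rng) for _ in range(n_trees)]
        layers.append(forest)
        feats = [row + [forest_proba(forest, row)] for row in feats]
    return layers

def predict_proba(layers, x):
    """Pass one sample through every layer; the last layer's output is the prediction."""
    row, p = list(x), 0.0
    for forest in layers:
        p = forest_proba(forest, row)
        row = row + [p]
    return p

# Toy data (hypothetical): label 1 ("churn") when the two features sum to more than 1.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(100)]
y = [1 if a + b > 1 else 0 for a, b in X]

layers = fit_deep_forest(X, y)
p_high = predict_proba(layers, [0.95, 0.95])
p_low = predict_proba(layers, [0.05, 0.05])
```

In this sketch each layer is trained on the previous layer's in-sample predictions, which tends to overfit; feeding out-of-fold predictions to the next layer is a common alternative.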