Computer Science, 2016, Vol. 43, Issue (Z11): 557-563. doi: 10.11896/j.issn.1002-137X.2016.11A.126

• Intelligent Systems and Applications •

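Random Forest Based Telco Out-calling Recommendation System

ZHU Yi-jian (朱奕健), ZHANG Zheng-qing (张正卿), HUANG Yi-qing (黄一清), BAI Rui-rui (白瑞瑞) and YAN Jian-feng (严建峰)

  1. Shanghai Branch, China United Network Communications Co., Ltd., Shanghai 200050; Shanghai Branch, China United Network Communications Co., Ltd., Shanghai 200050; School of Computer Science and Technology, Soochow University, Suzhou 215006; School of Computer Science and Technology, Soochow University, Suzhou 215006; School of Computer Science and Technology, Soochow University, Suzhou 215006
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the Key Project of the Jiangsu Provincial Science and Technology Support Program (BE2014005) and the National Natural Science Foundation of China (61572339).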

Abstract: Out-calling is an important channel through which telecommunication (telco) operators recommend products and services to customers. This paper presents an automatic out-calling recommendation system built on telco big data. The system mines customers' behavioral features and uses machine learning to predict the probability that a customer accepts a recommended product. Traditional recommendation systems typically rely on matrix factorization (MF), large-scale sparse feature classification, or neural networks. This work adopts random forest instead, because the algorithm parallelizes well, trains quickly, and produces decision trees whose rules are easy to interpret, which makes it well suited to recommendation on telco data. The system is implemented on top of Hadoop, Impala, and Spark, with a random forest classifier as the core algorithm that maps a customer's recent behavioral features to the probability of accepting the product recommended in an out-call. Online testing shows that the system raises the customer acceptance rate by about 41% compared with the currently deployed manual random out-calling. Based on the feature importance output by the model, the paper also gives a feature analysis of two classes of customers.

Key words: Out-calling, Recommendation system, Random forest, Telco operator
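The core idea in the abstract, scoring each customer with a random forest and reading off feature importance, can be illustrated with a minimal pure-Python sketch. It is not the authors' Spark implementation: the feature names (monthly data usage, voice minutes, tenure), the toy data, and every function name below are hypothetical, and importance is computed as accumulated Gini impurity decrease, which is one common definition assumed here for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels, features):
    """Best (feature, threshold, weighted_gini) among `features`, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t, score), score
    return best

def build_tree(rows, labels, rng, depth, n_sub, importance):
    """Grow one tree; a leaf stores the fraction of accepted (label 1) rows."""
    leaf = sum(labels) / len(labels)
    if depth == 0 or leaf in (0.0, 1.0):
        return leaf
    # Each node considers only a random subset of the features.
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_sub))
    if split is None:
        return leaf
    f, t, score = split
    importance[f] += len(labels) * (gini(labels) - score)  # impurity decrease
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    children = [build_tree([r for r, _ in part], [y for _, y in part],
                           rng, depth - 1, n_sub, importance)
                for part in (left, right)]
    return (f, t, children[0], children[1])

def train_forest(rows, labels, n_trees=25, depth=3, n_sub=2, seed=0):
    """Bagged trees with random feature subsets; returns (forest, importance)."""
    rng = random.Random(seed)
    importance = [0.0] * len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                 rng, depth, n_sub, importance))
    return forest, importance

def accept_probability(forest, row):
    """Average the leaf fractions reached by `row` across all trees."""
    total = 0.0
    for node in forest:
        while isinstance(node, tuple):
            f, t, left, right = node
            node = left if row[f] <= t else right
        total += node
    return total / len(forest)

# Hypothetical toy data: [monthly_data_gb, monthly_voice_min, tenure_months].
# Label 1 means the customer accepted the recommended product.
ROWS = [
    [1.0, 300, 6], [2.0, 120, 24], [3.0, 450, 12],
    [4.0, 200, 36], [1.5, 380, 18], [2.5, 150, 30],
    [6.0, 310, 8], [7.0, 130, 26], [8.0, 440, 14],
    [9.0, 210, 34], [6.5, 370, 20], [7.5, 160, 28],
]
LABELS = [0] * 6 + [1] * 6

forest, importance = train_forest(ROWS, LABELS)
p_heavy = accept_probability(forest, [9.5, 250, 20])  # heavy data user
p_light = accept_probability(forest, [0.5, 250, 20])  # light data user
print(f"heavy user: {p_heavy:.2f}, light user: {p_light:.2f}")
print("importance per feature:", [round(v, 2) for v in importance])
```

In this toy data only data usage separates the two classes, so the heavy user receives the higher score and the first feature dominates the importance vector. An out-calling list would then be built by ranking customers by the predicted probability. Because the trees are trained independently on bootstrap samples, the training loop is straightforward to parallelize, which is the property the abstract cites as a reason for choosing random forest.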

