Computer Science, 2017, Vol. 44, Issue (2): 283-289. doi: 10.11896/j.issn.1002-137X.2017.02.048

• Artificial Intelligence •

Study on Advertising Click-through Rate Prediction Based on User Similarity and Feature Differentiation

PAN Shu-min, YAN Na and XIE Jin-kui

  1. Department of Computer Science and Technology, East China Normal University, Shanghai 200241
  • Online: 2018-11-13 Published: 2018-11-13

Abstract: How to target Internet advertising precisely in a big-data environment has long been a central concern in computational advertising. As a key indicator of online advertising effectiveness, accurate prediction of click-through rate (CTR) bears on the interests of publishers, users and advertisers. Current mainstream methods extract features and build a single CTR prediction model; their weakness is that using a single weight to measure a feature's effect on CTR is too one-sided. Following the divide-and-conquer idea, this study proposes a hybrid model based on user similarity and feature differentiation. The model first evaluates user similarity with a Gaussian mixture distribution and partitions users into several groups. A sub-model is then built for each group and the sub-models are combined, so that the different effects of the same feature on different groups can be captured and ad click behavior predicted more accurately. Experiments on an advertising data set from a real Internet company, together with a detailed comparison against mainstream methods, verify the effectiveness of the approach.

Key words: Computational advertising, CTR prediction, User similarity, Feature differentiation, Hybrid model
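The pipeline described in the abstract (group users with a Gaussian mixture, fit one sub-model per group, then combine) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the single scalar user feature `xs`, the binary ad feature `fs`, the toy click labels `ys`, the choice of two groups, logistic regression as the sub-model, and the responsibility-weighted combination in `predict_ctr`. The abstract does not state the sub-model type or the combination rule.

```python
import math

def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component for user feature x."""
    logs = [math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the max before exponentiating to avoid underflow
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def fit_gmm_1d(xs, init_means, n_iter=30):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means."""
    k = len(init_means)
    weights, means, variances = [1.0 / k] * k, list(init_means), [1.0] * k
    for _ in range(n_iter):
        resp = [responsibilities(x, weights, means, variances) for x in xs]  # E-step
        for j in range(k):  # M-step
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-3)  # floor keeps the density finite
    return weights, means, variances

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(fs, ys, lr=0.5, n_iter=2000):
    """Logistic regression with one feature weight w and a bias b (gradient descent)."""
    w = b = 0.0
    n = len(fs)
    for _ in range(n_iter):
        errors = [sigmoid(w * f + b) - y for f, y in zip(fs, ys)]
        w -= lr * sum(e * f for e, f in zip(errors, fs)) / n
        b -= lr * sum(errors) / n
    return w, b

# Toy data (hypothetical): the ad feature raises clicks for users near x=1
# and lowers them for users near x=9, so a single pooled weight cancels out.
xs = [0.8, 1.0, 1.2, 0.9, 1.1, 1.0, 0.7, 1.3, 8.8, 9.0, 9.2, 8.9, 9.1, 9.0, 8.7, 9.3]
fs = [1, 1, 1, 1, 0, 0, 0, 0] * 2
ys = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0]

gmm = fit_gmm_1d(xs, init_means=[0.0, 10.0])

def hard_group(x):
    """Index of the most probable mixture component for user feature x."""
    r = responsibilities(x, *gmm)
    return r.index(max(r))

groups = [hard_group(x) for x in xs]

# One sub-model per user group, plus a single pooled model for comparison.
sub_models = [
    fit_logistic([f for f, g in zip(fs, groups) if g == j],
                 [y for y, g in zip(ys, groups) if g == j])
    for j in range(2)
]
pooled_w, pooled_b = fit_logistic(fs, ys)

def predict_ctr(x, f):
    """Combine sub-model predictions, weighted by group responsibilities (assumed rule)."""
    resp = responsibilities(x, *gmm)
    return sum(r * sigmoid(w * f + b) for r, (w, b) in zip(resp, sub_models))

print("per-group feature weights:", [round(w, 3) for w, _ in sub_models])
print("pooled feature weight:", round(pooled_w, 3))
print("CTR for user x=1.0, f=1:", round(predict_ctr(1.0, 1), 3))
print("CTR for user x=9.0, f=1:", round(predict_ctr(9.0, 1), 3))
```

On this toy data the two sub-models learn feature weights of opposite sign, while the pooled model's single weight stays at zero because the two groups cancel each other. This is the one-sidedness of a single weight that the abstract points to.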

