Computer Science, 2023, Vol. 50, Issue (5): 12-20. doi: 10.11896/jsjkx.221000032
YANG Bin (杨斌)1, LIANG Jing (梁婧)2, ZHOU Jiawei (周佳薇)2, ZHAO Mengci (赵梦赐)3
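Abstract: In recommender system development, click-through rate (CTR) prediction is a critical task, because improvements in CTR prediction accuracy directly affect the revenue of the whole recommender system. Studying the performance and interpretability of CTR models helps explain how the system reaches its decisions, and can also guide requirements optimization and system design. Current deep CTR prediction models are mostly designed around linear feature interaction and deep feature extraction. Because deep models are black boxes, such models are limited in interpretability, and prior work has rarely studied the interpretability of CTR prediction models. This paper therefore studies the interpretability of this class of models using the multi-head self-attention mechanism. Multi-head attention is used to enhance and explain the feature embedding, the linear feature interaction, and the deep component. For the deep component, two models are designed: an attention-enhanced deep neural network and an attention-stacked deep model. Each module is explained by computing its attention scores. Extensive experiments on several real-world datasets show that the proposed method effectively improves model performance and that the model itself provides a degree of interpretability.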
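The abstract does not give the model's equations, so the following is only a minimal sketch of generic scaled dot-product multi-head self-attention applied to feature-field embeddings, showing how per-head attention scores could be read out as an explanation signal. It is not the authors' implementation. The field names, dimensions, random projection matrices (standing in for learned weights), and the function `multi_head_attention_scores` are all assumptions made for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rows, cols, rng):
    """Small random projection matrix (a stand-in for learned weights)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def multi_head_attention_scores(embeddings, num_heads, head_dim, seed=0):
    """Return per-head attention maps and their head-averaged map.

    embeddings: one vector per feature field, shape (num_fields, embed_dim).
    Each attention map has shape (num_fields, num_fields); row i holds how
    strongly field i attends to every field, and sums to 1.
    """
    rng = random.Random(seed)
    embed_dim = len(embeddings[0])
    scale = math.sqrt(head_dim)
    heads = []
    for _ in range(num_heads):
        w_q = random_matrix(embed_dim, head_dim, rng)  # query projection
        w_k = random_matrix(embed_dim, head_dim, rng)  # key projection
        q = matmul(embeddings, w_q)
        k = matmul(embeddings, w_k)
        k_t = [list(col) for col in zip(*k)]           # transpose of K
        logits = matmul(q, k_t)
        heads.append([softmax([v / scale for v in row]) for row in logits])
    n = len(embeddings)
    avg = [[sum(h[i][j] for h in heads) / num_heads for j in range(n)]
           for i in range(n)]
    return heads, avg


# Hypothetical field names and random embeddings, for illustration only.
fields = ["user_id", "item_id", "hour", "device"]
rng = random.Random(42)
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in fields]
heads, avg_scores = multi_head_attention_scores(embeddings, num_heads=2, head_dim=4)

for name, row in zip(fields, avg_scores):
    print(name, [round(v, 3) for v in row])
```

In a trained model the projections would be learned, and a row of the averaged map could then be inspected to see which feature fields a given field relies on most. With the random weights used here, the printed scores carry no real meaning.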
CLC number: