Computer Science, 2023, Vol. 50, Issue (5): 12-20. doi: 10.11896/jsjkx.221000032

• Explainable AI •

Study on Interpretable Click-Through Rate Prediction Based on Attention Mechanism

YANG Bin1, LIANG Jing2, ZHOU Jiawei2, ZHAO Mengci3   

  1 China Unicom Research Institute, Beijing 100048, China
  2 School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
  3 School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2022-10-05 Revised: 2023-02-26 Online: 2023-05-15 Published: 2023-05-06
  • About author: YANG Bin, born in 1986, Ph.D., is a member of the China Computer Federation. His main research interests include recommendation algorithms and natural language processing.

Abstract: Click-through rate (CTR) prediction is critical to recommender systems, and improvements in CTR prediction directly affect a recommender system's revenue targets. The performance and interpretability of a CTR prediction algorithm help developers understand and evaluate a recommender system accurately, which also aids system design. Most existing approaches combine linear feature interaction with deep feature extraction, and their results are poorly interpretable. Moreover, few previous studies have addressed the interpretability of CTR prediction models. In this paper, we therefore propose a novel model that introduces a multi-head self-attention mechanism into the embedding layer, the linear feature interaction component, and the deep component in order to study model interpretability. We propose two variants of the deep component: one is a deep neural network (DNN) enhanced by multi-head self-attention, and the other computes high-order feature interactions by stacking multiple attention blocks. We then calculate attention scores for each component and use them to interpret the prediction results. Extensive experiments on three real-world benchmark datasets show that the proposed approach not only improves on the performance of DeepFM but also offers good model interpretability.
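
The role of attention scores in interpretation can be illustrated with a minimal sketch of multi-head self-attention over field embeddings. This is not the authors' implementation: the field names, dimensions, and randomly initialised (untrained) projection weights are all assumptions chosen for illustration, and the sketch uses only the Python standard library. It shows how each head yields a field-by-field score matrix that can be averaged and inspected.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an n x k matrix by a k x m matrix (lists of lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights: small uniform random values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
            for _ in range(rows)]


def multi_head_self_attention(embeddings, num_heads, head_dim, rng):
    """Apply scaled dot-product self-attention with several heads.

    embeddings: num_fields x embed_dim matrix, one row per feature field.
    Returns (outputs, scores) where outputs is
    num_fields x (num_heads * head_dim) and scores[h][i][j] is the
    attention that field i pays to field j in head h.
    """
    embed_dim = len(embeddings[0])
    outputs = [[] for _ in embeddings]
    scores = []
    for _ in range(num_heads):
        # Hypothetical per-head projections (untrained, for illustration).
        w_q = random_matrix(embed_dim, head_dim, rng)
        w_k = random_matrix(embed_dim, head_dim, rng)
        w_v = random_matrix(embed_dim, head_dim, rng)
        q = matmul(embeddings, w_q)
        k = matmul(embeddings, w_k)
        v = matmul(embeddings, w_v)
        k_t = [list(col) for col in zip(*k)]
        logits = matmul(q, k_t)
        attn = [softmax([x / math.sqrt(head_dim) for x in row])
                for row in logits]
        head_out = matmul(attn, v)
        for i, vec in enumerate(head_out):
            outputs[i].extend(vec)  # concatenate heads per field
        scores.append(attn)
    return outputs, scores


def mean_field_scores(scores):
    """Average the attention matrices over heads."""
    num_heads = len(scores)
    n = len(scores[0])
    return [[sum(scores[h][i][j] for h in range(num_heads)) / num_heads
             for j in range(n)] for i in range(n)]


rng = random.Random(0)
fields = ["user_id", "ad_id", "device", "hour"]  # hypothetical field names
embeddings = random_matrix(len(fields), 8, rng)  # assumed embed_dim = 8
outputs, scores = multi_head_self_attention(
    embeddings, num_heads=2, head_dim=4, rng=rng)
avg = mean_field_scores(scores)

for name, row in zip(fields, avg):
    print(name, [round(x, 3) for x in row])
```

Each printed row is a probability distribution over the fields. In a trained model, a large entry in row i, column j would suggest that field j contributes strongly to the representation of field i, which is the kind of signal an attention-based interpretation relies on.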

Key words: Recommender system, Click-Through Rate prediction, Multi-head self-attention mechanism, Feature interaction, Model interpretability

CLC Number: 

  • TP391