Computer Science ›› 2019, Vol. 46 ›› Issue (10): 32-38. doi: 10.11896/jsjkx.180901801

• Big Data and Data Science •

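Predicting User’s Dynamic Preference Based on Embedding Learning

WEN Wen1, LIN Ze-tian1, CAI Rui-chu1, HAO Zhi-feng1,2, WANG Li-juan1

  1. School of Computers, Guangdong University of Technology, Guangzhou 510000, China
  2. School of Mathematics and Big Data, Foshan University, Foshan, Guangdong 528000, China
  • Received: 2018-09-27 Revised: 2019-03-14 Online: 2019-10-15 Published: 2019-10-21
  • Corresponding author: LIN Ze-tian (born 1993), male, master's student. His main research interests include data mining, machine learning, and information retrieval. E-mail: 906268746@qq.com.
  • About the authors: WEN Wen (born 1981), female, Ph.D., associate professor, CCF member. Her main research interests include machine learning, pattern recognition, and information retrieval. CAI Rui-chu (born 1983), male, Ph.D., professor, CCF senior member. His main research interests include data mining, machine learning, and information retrieval. HAO Zhi-feng (born 1968), male, Ph.D., professor, CCF member. His main research interests include machine learning and artificial intelligence. WANG Li-juan (born 1978), female, Ph.D., associate professor. Her main research interests include machine learning and high-dimensional data clustering analysis.
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61472089) and the NSFC-Guangdong Joint Fund (U1501254).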

Abstract: Traditional methods for capturing user preferences focus mainly on users' long-term interests. In real-world applications, however, user interests change over time, and capturing users' dynamic preferences along the time axis remains a challenge. This paper proposes an embedding-based approach for predicting dynamic behavior. First, an improved word-embedding model learns a low-dimensional vector representation of each clicked item from users' click sequences. Then, based on the learned item vectors and a user's recent click behavior, the user's dynamic preference is inferred and used to predict the next click. Experiments on two real-world datasets compare the proposed method with baseline methods from recent years, and the results show that the proposed method has a clear advantage in prediction accuracy.

Key words: Behavior prediction, Embedding, Temporal behaviors, User preferences, Word2vec
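
The two stages summarized in the abstract can be illustrated with a small sketch. It is not the authors' model: the "improved" embedding method is not specified on this page, so plain skip-gram with uniformly sampled negatives stands in for it, and the recency-weighted average, the `decay` parameter, the exclusion of just-clicked items, and all function names and toy sessions are assumptions made for illustration only.

```python
import numpy as np

def train_item_embeddings(sessions, dim=16, window=2, negatives=5,
                          epochs=50, lr=0.05, seed=0):
    """Stand-in for stage 1: skip-gram with negative sampling over click sessions."""
    rng = np.random.default_rng(seed)
    items = sorted({item for session in sessions for item in session})
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    w_in = rng.normal(scale=0.1, size=(n, dim))  # item vectors (kept)
    w_out = np.zeros((n, dim))                   # context vectors (discarded)
    for _ in range(epochs):
        for session in sessions:
            ids = [index[item] for item in session]
            for pos, center in enumerate(ids):
                lo, hi = max(0, pos - window), min(len(ids), pos + window + 1)
                for ctx_pos in range(lo, hi):
                    if ctx_pos == pos:
                        continue
                    # One observed context item plus uniformly drawn negatives.
                    # Simplification: a negative may collide with the positive.
                    targets = np.concatenate(
                        ([ids[ctx_pos]], rng.integers(0, n, size=negatives)))
                    labels = np.zeros(len(targets))
                    labels[0] = 1.0
                    logits = w_out[targets] @ w_in[center]
                    probs = 1.0 / (1.0 + np.exp(-logits))
                    grad = (probs - labels)[:, None]  # gradient w.r.t. logits
                    grad_in = (grad * w_out[targets]).sum(axis=0)
                    # np.add.at accumulates correctly when targets repeat.
                    np.add.at(w_out, targets, -lr * grad * w_in[center])
                    w_in[center] -= lr * grad_in
    return items, index, w_in

def predict_next(recent_clicks, items, index, vectors, top_k=3, decay=0.5):
    """Stand-in for stage 2: rank items by cosine similarity to a
    recency-weighted average of the most recent clicks (assumed aggregation)."""
    ids = [index[item] for item in recent_clicks if item in index]
    if not ids:
        return []
    # The latest click gets weight 1, the one before it `decay`, and so on.
    weights = decay ** np.arange(len(ids) - 1, -1, -1)
    preference = weights @ vectors[ids] / weights.sum()
    norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(preference) + 1e-12
    scores = vectors @ preference / norms
    scores[ids] = -np.inf  # assumption: do not re-recommend recent clicks
    ranked = [i for i in np.argsort(-scores) if np.isfinite(scores[i])]
    return [items[i] for i in ranked[:top_k]]

# Hypothetical toy click sessions, not data from the paper.
sessions = [
    ["phone", "case", "charger"],
    ["phone", "charger", "case"],
    ["tent", "stove", "lantern"],
    ["stove", "tent", "lantern"],
]
items, index, vectors = train_item_embeddings(sessions)
print(predict_next(["phone", "case"], items, index, vectors, top_k=2))
```

Averaging only the last few clicks keeps the preference vector tied to short-term behavior, which is the contrast with long-term interest models that the abstract draws. A production setup would more likely train the vectors with an existing word2vec implementation than with this hand-written loop.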

CLC Number:

  • TP181