Computer Science ›› 2020, Vol. 47 ›› Issue (2): 65-71. doi: 10.11896/jsjkx.190200362

• Database & Big Data & Data Science •

Mining Deep Semantic Features of Reviews for Amazon Commodity Recommendation

LI Ke1, CHEN Guang-ping2

  1. Chongqing Research Academy of Education Sciences, Chongqing 400015, China
  2. College of Information Engineering, China Jiliang University, Hangzhou 310018, China
  • Received: 2019-02-26 Online: 2020-02-15 Published: 2020-03-18
  • Corresponding author: LI Ke (6790544@qq.com)
  • About author: LI Ke, born in 1977. His main research interests include information technology education and AI education.
  • Supported by:
    This work was supported by the research program of the Chongqing Education Science “13th Five-Year” Plan, 2016 key planning project (2016-00-011), and the Particular Research Program of Chongqing University of Education (KY2018TZ03).

Abstract: Review mining plays an increasingly important role in recommender systems (RS). However, conventional review-mining methods capture only the shallow semantics implied in reviews, so their semantic representations are weak. A major challenge in RS is therefore how to mine the deep semantics of reviews, strengthen their semantic representation, and make full use of reviews to improve recommendation quality. This paper uses Skip-Thought Vectors (STV) to learn latent semantic features of reviews. To enhance the semantic representation of reviews, it introduces the Long Short-Term Memory (LSTM) network into STV and, by combining a bi-directional information mining method, a user sentiment-preference mining method and a deep hierarchical model, proposes a deeply hierarchical bi-directional feature-extraction model. The model mines not only the deep semantic features of reviews but also the emotional preferences of the users who wrote them. The proposed model is then combined with a matrix factorization model (Singular Value Decomposition, SVD) to perform commodity recommendation. Experiments on two Amazon datasets show that the proposed model outperforms conventional review-mining models in deep semantic mining and improves recommendation quality over recommender systems built on those conventional models.

Key words: Commodity recommendation, Deep learning, Semantic mining, Singular value decomposition, Text representation

CLC Number:

  • TP391
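
The abstract states that review features are combined with an SVD-style matrix factorization model, but it does not give the exact formulation. The following is a minimal sketch of one common way to do this, assuming the predicted rating is `mu + b_u + b_i + p_u · (q_i + W f_i)`, where `f_i` is a fixed review feature vector for item `i`. Everything here is an illustrative assumption rather than the authors' method: the function names, the projection matrix `W`, the hyperparameters, and the random vectors standing in for the output of the STV/LSTM feature extractor.

```python
import numpy as np

def train_mf_with_review_features(ratings, item_feats, n_users, n_factors=8,
                                  lr=0.01, reg=0.05, epochs=100, seed=0):
    """SGD training for r_hat = mu + b_u + b_i + p_u . (q_i + W f_i).

    ratings    : list of (user, item, rating) triples
    item_feats : array of shape (n_items, d); a placeholder for review
                 embeddings (assumed to come from a separate text model)
    Returns the learned parameters and the per-epoch training RMSE.
    """
    rng = np.random.default_rng(seed)
    n_items, d = item_feats.shape
    mu = float(np.mean([r for _, _, r in ratings]))   # global mean rating
    b_u = np.zeros(n_users)                           # user biases
    b_i = np.zeros(n_items)                           # item biases
    P = rng.normal(0, 0.1, (n_users, n_factors))      # user factors
    Q = rng.normal(0, 0.1, (n_items, n_factors))      # free item factors
    W = rng.normal(0, 0.1, (n_factors, d))            # review-feature projection
    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for u, i, r in ratings:
            item_vec = Q[i] + W @ item_feats[i]
            err = r - (mu + b_u[u] + b_i[i] + P[u] @ item_vec)
            sq_err += err ** 2
            p_old = P[u].copy()  # use the pre-update user vector for Q and W
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            P[u] += lr * (err * item_vec - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
            W += lr * (err * np.outer(p_old, item_feats[i]) - reg * W)
        history.append(float(np.sqrt(sq_err / len(ratings))))
    params = dict(mu=mu, b_u=b_u, b_i=b_i, P=P, Q=Q, W=W)
    return params, history

def predict(params, item_feats, u, i):
    """Predicted rating of user u for item i."""
    item_vec = params["Q"][i] + params["W"] @ item_feats[i]
    return float(params["mu"] + params["b_u"][u] + params["b_i"][i]
                 + params["P"][u] @ item_vec)

# Toy data: 4 users, 4 items, random 5-dimensional "review embeddings".
toy_feats = np.random.default_rng(1).normal(size=(4, 5))
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0),
               (2, 1, 2.0), (2, 3, 5.0), (3, 2, 2.0), (3, 3, 4.0)]
params, history = train_mf_with_review_features(toy_ratings, toy_feats, n_users=4)
print("first / last training RMSE:", history[0], history[-1])
print("predicted rating for user 0, item 3:", predict(params, toy_feats, 0, 3))
```

Adding the projected review vector to the free item factors lets the text features inform items that have few ratings, while `Q` still absorbs what the text does not explain. Other couplings, such as concatenating the features or using them on the user side, are equally plausible readings of the abstract.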