Computer Science ›› 2021, Vol. 48 ›› Issue (1): 111-118. doi: 10.11896/jsjkx.200500101

Special topic: Big Data & Data Science (virtual special topic)

• Database & Big Data & Data Science •

Survey on Fake Review Recognition

YUAN Lu, ZHU Zheng-zhou, REN Ting-yu

  1. School of Software & Microelectronics, Peking University, Beijing 102600, China
  • Received: 2020-05-21 Revised: 2020-08-22 Online: 2021-01-15 Published: 2021-01-15
  • Corresponding author: ZHU Zheng-zhou (zhuzz@ss.pku.edu.cn)
  • About author (yuanlu96@pku.edu.cn): YUAN Lu, born in 1996, postgraduate, is a member of China Computer Federation. Her main research interests include machine learning and natural language processing.
    ZHU Zheng-zhou, born in 1979, Ph.D., associate professor, is a member of China Computer Federation. His main research interests include learning resource recommendation and knowledge graphs.
  • Supported by:
    National Key Research and Development Program of China (2017YFB1402400).

Abstract: In the Web 2.0 era, consumers increasingly rely on online reviews when shopping, learning, and seeking entertainment online. Fake reviews can mislead consumers' decisions and distort merchants' true reputations, so identifying them effectively is of great importance. This paper first delimits the scope of fake reviews and summarizes the research on them along four lines: fake review recognition, the motivations behind fake reviews, their influence on consumers, and governance strategies. It presents a research framework for fake reviews and the workflow of general recognition methods. It then surveys domestic and international research of the past decade from two perspectives, the textual content of reviews and the behavior of reviewers and reviewer groups, and introduces the datasets and evaluation metrics used to assess fake review recognition. It statistically analyzes the effective recognition methods implemented on public datasets and compares them in terms of feature selection, models, training datasets, and metric values. Finally, it discusses future research directions for fake review recognition, such as the limited scale of labeled corpora.

Key words: Behavior feature, Fake review, Fake review recognition, Fake reviewer, Textual feature

CLC Number:

  • TP391