Double Cycle Graph Based Fraud Review Detection Algorithm

doi:10.11896/j.issn.1002-137X.2019.09.034

Abstract

Abstract: Online reviews of stores give customers valuable information and strongly affect a store's reputation. Driven by profit, a large number of fake reviews have therefore emerged and disrupted market order: many stores and individuals deliberately promote or denigrate particular stores through fake reviews for gain, so an effective fake review detection method is essential. This paper builds a graph filter from the relationships among a large number of users, reviews and stores, and obtains reliability scores for users, reviews and stores through iterative computation in order to find fake reviews. Three key problems are addressed: obtaining reliable scores for users, reviews and stores; identifying genuine reviews effectively; and accurately detecting fake reviews and spammers. To make the scores more reliable, a cyclic iterative method is proposed for computing the reliability of users, reviews and stores. To detect fake reviews and spammers more effectively, a weighted graph filter is designed; combining it with the reliability scores obtained in this way yields a double cycle graph filtering detection algorithm. Experiments on the Yelp dataset show that the proposed algorithm detects fake reviews and spammers effectively and identifies genuine reviews.

Key words: Behavior characteristic, Double cycle graph, Graph-based filter, Spam detection, User influence
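The abstract describes scoring users, reviews and stores by iterating over their relationships, but it does not give the update formulas or the weights of the graph filter. The sketch below is therefore only a generic mutual-reinforcement iteration on a user-review-store graph, not the paper's double cycle algorithm. The function names `iterate_scores` and `flag_fake`, the 1 to 5 rating scale, the update rules and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def iterate_scores(reviews, n_iter=20):
    """Illustrative mutual-reinforcement iteration (assumed update rules).

    reviews: list of (user, store, rating) tuples, rating assumed in [1, 5].
    Returns (user_trust, store_score, honesty), where honesty[i] belongs
    to reviews[i] and all trust/honesty values lie in [0, 1].
    """
    # Assumption: every user starts fully trusted.
    user_trust = {u: 1.0 for u, _, _ in reviews}
    store_score = {}
    honesty = [1.0] * len(reviews)
    for _ in range(n_iter):
        # Store score: trust-weighted mean rating of the store's reviews.
        num, den = defaultdict(float), defaultdict(float)
        for u, s, r in reviews:
            num[s] += user_trust[u] * r
            den[s] += user_trust[u]
        # Fall back to the scale midpoint if all weights are zero.
        store_score = {s: num[s] / den[s] if den[s] > 0 else 3.0 for s in num}
        # Review honesty: 1 when the rating equals the store score,
        # 0 at the maximum possible distance of 4 on a 1-5 scale.
        honesty = [1.0 - abs(r - store_score[s]) / 4.0 for _, s, r in reviews]
        # User trust: mean honesty over the user's reviews.
        total, count = defaultdict(float), defaultdict(int)
        for (u, _, _), h in zip(reviews, honesty):
            total[u] += h
            count[u] += 1
        user_trust = {u: total[u] / count[u] for u in total}
    return user_trust, store_score, honesty


def flag_fake(honesty, threshold=0.5):
    """Return indices of reviews whose honesty is below the (assumed) threshold."""
    return [i for i, h in enumerate(honesty) if h < threshold]


# Toy data (invented): user "s1" rates against the consensus on both stores.
sample = [
    ("u1", "A", 4), ("u2", "A", 4), ("u3", "A", 5), ("s1", "A", 1),
    ("u1", "B", 2), ("u2", "B", 2), ("s1", "B", 5),
]
trust, scores, honesty = iterate_scores(sample)
print(sorted(trust, key=trust.get))  # users ordered from least to most trusted
print(flag_fake(honesty))            # indices of reviews flagged as suspicious
```

Each pass feeds the user scores into the store scores, the store scores into the review scores, and the review scores back into the user scores, which is the general shape of iterative graph-based scoring. A weighted filter in the sense of the abstract would presumably replace the plain averages with behaviour-dependent weights, which are not specified here.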

CLC Number:

TP393.1

CHEN Jin-yin, HUANG Guo-han, WU Yang-yang, JIA Cheng-yu. Double Cycle Graph Based Fraud Review Detection Algorithm[J]. Computer Science, 2019, 46(9): 229-236. https://doi.org/10.11896/j.issn.1002-137X.2019.09.034
