Computer Science, 2020, Vol. 47, Issue (10): 69-74. doi: 10.11896/jsjkx.190700034
冯进展, 蔡淑琴
FENG Jin-zhan, CAI Shu-qin
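Abstract: Because it cannot be known in advance whether the text of an online product review will be useful to readers, large numbers of unhelpful reviews raise potential consumers' information search costs and can even reduce the likelihood that they purchase the product. To raise the share of useful online reviews on e-commerce platforms and to give review writers a way to test their reviews, this paper builds a model for predicting the degree of usefulness of online reviews. Based on the textual characteristics of online reviews, the proposed model uses three features: the number of words in the review, the usefulness value of its words, and the number of product features it mentions. The usefulness value of a word is the information gain with which that word discriminates between degrees of review usefulness. The model parameters are then solved by gradient descent on a large volume of online review data. Experimental results show that review usefulness increases as the number of words, the word usefulness value, and the number of product features increase. The experiments divide online reviews into three degrees: "general", "useful", and "very useful". Prediction precision is 92.96% for "general" reviews, 94.83% for "useful" reviews, and 67.63% for "very useful" reviews. Performance testing yields an average precision of 85.05%, a recall of 82.81%, and an F1 score of 83.72%, which supports the feasibility of using the proposed model to predict the degree of usefulness of online reviews.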
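The abstract defines a word's usefulness value as its information gain with respect to the degree of review usefulness, but does not give the formula. The sketch below uses the standard textbook definition (label entropy minus the label entropy conditioned on whether the word occurs in the review), which may differ from the paper's exact formulation. The function names, the toy reviews, and the two-label setup are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(reviews, labels, word):
    """Reduction in label entropy from knowing whether `word` occurs.

    Assumption: each review is a set of tokens and only presence or
    absence of the word matters (not its frequency).
    """
    with_word = [y for tokens, y in zip(reviews, labels) if word in tokens]
    without_word = [y for tokens, y in zip(reviews, labels) if word not in tokens]
    total = len(labels)
    conditional = (
        (len(with_word) / total) * entropy(with_word)
        + (len(without_word) / total) * entropy(without_word)
    )
    return entropy(labels) - conditional


# Hypothetical toy data: four tokenized reviews with usefulness labels.
reviews = [
    {"battery", "lasts", "long"},
    {"battery", "screen", "sharp"},
    {"good"},
    {"good", "nice"},
]
labels = ["useful", "useful", "general", "general"]

ig_battery = information_gain(reviews, labels, "battery")
ig_screen = information_gain(reviews, labels, "screen")
ig_missing = information_gain(reviews, labels, "missing")
```

In this toy data "battery" separates the two labels perfectly and so receives the highest gain, "screen" appears in only one review and receives a smaller gain, and a word that never occurs receives a gain of zero.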