Computer Science ›› 2019, Vol. 46 ›› Issue (6A): 74-79.

• Intelligent Computing •

Movie Review Professionalism Classification Using LSTM and Features Fusion

WU Fan, LI Shou-shan, ZHOU Guo-dong

  1. Institute of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Online: 2019-06-14 Published: 2019-07-02
  • Corresponding author: LI Shou-shan (born 1980), male, professor; main research interest: natural language processing. E-mail: lishoushan@suda.edu.cn
  • About the authors: WU Fan (born 1994), female, master's degree, CCF member; main research interest: natural language processing. ZHOU Guo-dong (born 1967), male, professor; main research interest: natural language processing.
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61331011, 61672366).

Abstract: Movie reviews on social networks include both professional reviews written by critics and non-professional reviews written by ordinary viewers, and distinguishing the two is valuable for assessing film quality. Because movie reviews are short texts with non-standard wording and sparse features, traditional text feature selection methods and traditional classification models are not fully suited to classifying their professionalism. This paper therefore studies neural-network-based movie review professionalism classification, that is, judging whether a review is professional or non-professional. An LSTM model first learns representations of different features: a word-based representation, a part-of-speech-based representation, and a dependency-relation-based representation. These representations are then fused to learn and capture effective text features that help classify review professionalism. Experiments were conducted on a dataset from Rotten Tomatoes, a well-known American movie review website. The results show that the model fusing part-of-speech and dependency features reaches a classification accuracy of 88.30%, which is 3.66% higher than the baseline model that uses only word features. This indicates that introducing part-of-speech and dependency features into the model effectively improves review professionalism classification.

Key words: LSTM, Multi-feature fusion, Neural networks, Review professionalism classification, SVM
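The multi-feature idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the representations are fused, so concatenating the final hidden states is an assumption, as are the names `LSTMEncoder` and `FusionClassifier`, the toy vocabulary sizes, the dimensions, and the random (untrained) weights. Each of the three views of a review (word ids, part-of-speech tag ids, dependency relation ids) is encoded by its own LSTM, and the fused vector feeds a logistic output unit.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def affine(W, b, v):
    """Compute W @ v + b for list-based matrices and vectors."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


class LSTMEncoder:
    """Embeds a sequence of symbol ids and returns the final LSTM hidden state."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, vocab_size, embed_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        self.embed = rand_matrix(vocab_size, embed_dim, rng)
        # Each gate has its own weights over the concatenation [x_t; h_{t-1}].
        self.W = {g: rand_matrix(hidden_dim, embed_dim + hidden_dim, rng)
                  for g in self.GATES}
        self.b = {g: [0.0] * hidden_dim for g in self.GATES}

    def encode(self, ids):
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # cell state
        for idx in ids:
            xh = self.embed[idx] + h  # list concatenation: [x_t; h_{t-1}]
            pre = {g: affine(self.W[g], self.b[g], xh) for g in self.GATES}
            i = [sigmoid(z) for z in pre["input"]]
            f = [sigmoid(z) for z in pre["forget"]]
            o = [sigmoid(z) for z in pre["output"]]
            cand = [math.tanh(z) for z in pre["candidate"]]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


class FusionClassifier:
    """One LSTM per feature view; fusion by concatenation (an assumption)."""

    def __init__(self, vocab_sizes, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.encoders = [LSTMEncoder(v, embed_dim, hidden_dim, rng)
                         for v in vocab_sizes]
        self.w_out = [rng.uniform(-0.1, 0.1)
                      for _ in range(hidden_dim * len(vocab_sizes))]
        self.b_out = 0.0

    def fuse(self, sequences):
        fused = []
        for encoder, ids in zip(self.encoders, sequences):
            fused.extend(encoder.encode(ids))
        return fused

    def predict_proba(self, sequences):
        """Probability that the review is professional (untrained here)."""
        fused = self.fuse(sequences)
        return sigmoid(sum(w * x for w, x in zip(self.w_out, fused)) + self.b_out)


# Toy review of four tokens, already mapped to hypothetical ids per view.
words = [3, 17, 5, 9]
pos_tags = [1, 4, 2, 4]
dep_rels = [0, 6, 2, 3]

model = FusionClassifier(vocab_sizes=[20, 10, 8])
fused = model.fuse([words, pos_tags, dep_rels])
prob = model.predict_proba([words, pos_tags, dep_rels])
print(len(fused), round(prob, 4))
```

Passing only the word sequence and a single vocabulary size gives the analogue of the word-only baseline, which is the comparison the reported accuracy gain refers to. With untrained weights the printed probability carries no meaning; the sketch shows only the data flow.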

CLC Number: 

  • TP391