计算机科学 (Computer Science), 2018, Vol. 45, Issue (8): 213-217. doi: 10.11896/j.issn.1002-137X.2018.08.038

• Artificial Intelligence •

Deeply Hierarchical Bi-directional LSTM for Sentiment Classification

ZENG Zheng1, LI Li2, CHEN Jing3

  1. College of Journalism and Communication, Chongqing Normal University, Chongqing 401331, China
  2. College of Computer and Information Science, Southwest University, Chongqing 400715, China
  3. BIM Center, CSDI Engineering Co., Ltd., Chongqing 401122, China
  • Received: 2018-05-02 Online: 2018-08-29 Published: 2018-08-29
  • About the authors: ZENG Zheng (born 1972), female, master's degree, engineer; her research interests include big data and deep learning, intelligent education technology, computer networks and applications, and new media operations; E-mail: akk310@qq.com (corresponding author). LI Li (born 1967), female, Ph.D., professor and doctoral supervisor; her research interests include machine learning, data mining and analysis, service computing, and intelligent education; E-mail: lily@swu.edu.cn.
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61170192), a Key Project of the Chongqing Science and Technology Commission (cstc2017zdcy-zdyf0366), and a project of the Chongqing Municipal Education Commission (113143).

Abstract: Reviews of products, films and other items reflect how much people like them; they give prospective buyers a reference and help merchants adjust the goods they display to maximize profit. In recent years, the strong representation and learning ability of deep learning on text, and of the long short-term memory (LSTM) model in particular, has provided excellent support for understanding text semantics and capturing the sentiment a text conveys. A review is a form of sequential data that expresses semantic information through the forward ordering of its words. LSTM is itself a sequential model: it can read a review in the forward direction and encode it into a real-valued vector that implicitly carries the review's latent semantics and can be stored and processed by a computer. This paper uses two LSTM models to read a review in the forward and backward directions respectively, thereby obtaining bidirectional semantic information; it then stacks multiple bidirectional LSTM layers to extract deep features of the review; finally, the resulting model is embedded in a sentiment classification model to perform the sentiment classification task. Experiments show that the proposed model outperforms the baseline LSTM, which indicates that the deeply hierarchical bi-directional LSTM (DHBL) captures text information more accurately. In an experimental comparison with a convolutional neural network (CNN), the proposed model likewise achieves better results.

Key words: LSTM, Sentiment classification, Deep learning
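
The architecture outlined in the abstract (two LSTMs reading a review in opposite directions, several such bidirectional layers stacked, and a classifier on top) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the abstract does not give layer sizes, embeddings, the classifier head, or the training procedure, so the class names, the dimensions, the random untrained weights, the logistic output unit, and the choice of the final forward and backward states as the review vector are all assumptions made here for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """A single LSTM cell with random, untrained weights (illustrative only)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        cols = input_size + hidden_size
        # Rows are grouped by gate: input, forget, output, candidate.
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                  for _ in range(4 * hidden_size)]
        self.b = [0.0] * (4 * hidden_size)

    def step(self, x, h, c):
        z = x + h  # list concatenation of the input and previous hidden state
        pre = [sum(w * v for w, v in zip(row, z)) + b
               for row, b in zip(self.W, self.b)]
        n = self.hidden_size
        i = [sigmoid(p) for p in pre[0:n]]
        f = [sigmoid(p) for p in pre[n:2 * n]]
        o = [sigmoid(p) for p in pre[2 * n:3 * n]]
        g = [math.tanh(p) for p in pre[3 * n:4 * n]]
        c_new = [f[k] * c[k] + i[k] * g[k] for k in range(n)]
        h_new = [o[k] * math.tanh(c_new[k]) for k in range(n)]
        return h_new, c_new


def run_lstm(cell, sequence):
    """Run a cell over a sequence and return the hidden state at every step."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    outputs = []
    for x in sequence:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


class BiLSTMLayer:
    """One forward and one backward LSTM; outputs are concatenated per step."""

    def __init__(self, input_size, hidden_size, rng):
        self.fwd = LSTMCell(input_size, hidden_size, rng)
        self.bwd = LSTMCell(input_size, hidden_size, rng)

    def __call__(self, sequence):
        f_out = run_lstm(self.fwd, sequence)
        # Read the sequence in reverse, then re-align outputs to positions.
        b_out = run_lstm(self.bwd, sequence[::-1])[::-1]
        return [f + b for f, b in zip(f_out, b_out)]


class StackedBiLSTMClassifier:
    """Stacked bidirectional LSTM with a logistic output (assumed head)."""

    def __init__(self, input_size, hidden_size, num_layers, seed=0):
        rng = random.Random(seed)
        sizes = [input_size] + [2 * hidden_size] * (num_layers - 1)
        self.layers = [BiLSTMLayer(s, hidden_size, rng) for s in sizes]
        self.hidden_size = hidden_size
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden_size)]
        self.b_out = 0.0

    def encode(self, sequence):
        for layer in self.layers:
            sequence = layer(sequence)
        n = self.hidden_size
        # Final forward state (last position) + final backward state (first).
        return sequence[-1][:n] + sequence[0][n:]

    def predict_proba(self, sequence):
        v = self.encode(sequence)
        return sigmoid(sum(w * x for w, x in zip(self.w_out, v)) + self.b_out)


# Hypothetical input: a 5-word review with 4-dimensional word vectors.
data_rng = random.Random(1)
review = [[data_rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
model = StackedBiLSTMClassifier(input_size=4, hidden_size=3, num_layers=2)
p = model.predict_proba(review)  # untrained, so the value is not meaningful
```

Because the weights are untrained, `p` only demonstrates the data flow. In practice the same structure would be built with a deep learning framework and trained on labelled reviews.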

CLC Number:

  • TP181