Computer Science ›› 2019, Vol. 46 ›› Issue (11): 181-185. doi: 10.11896/jsjkx.181001941

• Artificial Intelligence •

Multi-modal Sentiment Analysis with Context-augmented LSTM

LIU Qi-yuan, ZHANG Dong, WU Liang-qing, LI Shou-shan   

  1. School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Received: 2018-10-18  Online: 2019-11-15  Published: 2019-11-14

Abstract: In recent years, multi-modal sentiment analysis has become an increasingly popular research area, extending traditional text-based sentiment analysis to a multi-modal level that combines text, images and sound. Multi-modal sentiment analysis usually requires acquiring both the independent information within each single modality and the interactive information between different modalities. To exploit the contextual information of language expression in each modality for obtaining these two kinds of information, a multi-modal sentiment analysis approach based on a context-augmented LSTM is proposed. Specifically, each modality is first encoded together with its context features by an LSTM, which captures the independent information within that single modality. Subsequently, the independent information of the modalities is merged, and another LSTM layer is used to obtain the interactive information between the different modalities, forming a multi-modal feature representation. Finally, a max-pooling strategy reduces the dimension of the multi-modal representation, which is then fed to the sentiment classifier. The method achieves 75.3% accuracy (ACC) and 74.9% F1 on the MOSI dataset. Compared with traditional machine learning methods such as SVM, ACC is 8.1 percentage points higher and F1 is 7.3 percentage points higher. Compared with current advanced deep learning methods, ACC is 0.9 percentage points higher and F1 is 1.3 percentage points higher, while the number of trainable parameters is reduced by about 20 times and the training speed is increased by 10 times. The experimental results demonstrate that the proposed approach significantly outperforms competitive multi-modal sentiment classification baselines.
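To make the three-stage pipeline concrete, the following is a minimal PyTorch sketch of the architecture described above. It is an illustration under stated assumptions, not the paper's exact implementation: the class and parameter names are invented, the feature dimensions (300-d text, 74-d audio, 35-d visual, 64-d context) are placeholders, and concatenating a shared context vector to each modality's features is one plausible reading of "encoded in combination with the context feature".

    import torch
    import torch.nn as nn

    class ContextAugmentedLSTM(nn.Module):
        """Sketch of the described pipeline (all dimensions are illustrative).

        Stage 1: one LSTM per modality encodes utterance features concatenated
                 with a context vector (independent intra-modality information).
        Stage 2: the per-modality encodings are concatenated step-wise and fed
                 to a second LSTM (interactive inter-modality information).
        Stage 3: max-pooling over time, then a linear sentiment classifier.
        """

        def __init__(self, text_dim=300, audio_dim=74, video_dim=35,
                     ctx_dim=64, hidden=128, num_classes=2):
            super().__init__()
            # Stage 1: independent, context-augmented encoders (one per modality).
            self.text_lstm = nn.LSTM(text_dim + ctx_dim, hidden, batch_first=True)
            self.audio_lstm = nn.LSTM(audio_dim + ctx_dim, hidden, batch_first=True)
            self.video_lstm = nn.LSTM(video_dim + ctx_dim, hidden, batch_first=True)
            # Stage 2: fusion LSTM over the merged modality encodings.
            self.fusion_lstm = nn.LSTM(3 * hidden, hidden, batch_first=True)
            # Stage 3: classifier on the max-pooled multi-modal representation.
            self.classifier = nn.Linear(hidden, num_classes)

        def forward(self, text, audio, video, ctx):
            # ctx: (batch, seq_len, ctx_dim) context features for each time step.
            t, _ = self.text_lstm(torch.cat([text, ctx], dim=-1))
            a, _ = self.audio_lstm(torch.cat([audio, ctx], dim=-1))
            v, _ = self.video_lstm(torch.cat([video, ctx], dim=-1))
            fused, _ = self.fusion_lstm(torch.cat([t, a, v], dim=-1))
            pooled, _ = fused.max(dim=1)   # max-pooling over the time axis
            return self.classifier(pooled)

    # Illustrative shapes only: a batch of 8 sequences of length 20.
    model = ContextAugmentedLSTM()
    logits = model(torch.randn(8, 20, 300), torch.randn(8, 20, 74),
                   torch.randn(8, 20, 35), torch.randn(8, 20, 64))
    print(logits.shape)  # torch.Size([8, 2])

Max-pooling over the time axis yields a fixed-size representation regardless of utterance length, which keeps the classifier small; this is one way to account for the abstract's claim of a much lower trainable-parameter count than competing deep models.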

Key words: Context enhancement, Multi-modal, Sentiment analysis

CLC Number: TP391