Computer Science ›› 2024, Vol. 51 ›› Issue (9): 242-249. doi: 10.11896/jsjkx.230600117

• Artificial Intelligence •

Text-Image Gated Fusion Mechanism for Multimodal Aspect-based Sentiment Analysis

ZHANG Tianzhi1, ZHOU Gang1,2, LIU Hongbo1, LIU Shuo1, CHEN Jing1   

  1 Information Engineering University,Zhengzhou 450001,China
    2 State Key Laboratory of Mathematical Engineering and Advanced Computing,Zhengzhou 450001,China
  • Received:2023-06-14 Revised:2024-02-23 Online:2024-09-15 Published:2024-09-10
  • About author:ZHANG Tianzhi,born in 1995,postgraduate.His main research interests include sentiment analysis and data mining.
    ZHOU Gang,born in 1974,Ph.D,professor.His main research interests include mass data processing and knowledge graph.
  • Supported by:
    Science and Technology Research Program of Henan Province(222102210081).

Abstract: Multimodal aspect-based sentiment analysis is an emerging task in the field of multimodal sentiment analysis. It aims to identify the sentiment of each given aspect from a text and its accompanying image. Although recent research on multimodal sentiment analysis has made breakthrough progress, most existing models fuse multimodal features by simple concatenation, without considering whether the image carries information that is semantically irrelevant to the text, which may introduce additional interference into the model. To address this problem, this paper proposes a text-image gated fusion mechanism (TIGFM) model for multimodal aspect-based sentiment analysis. While the text interacts with the image, the model introduces adjective-noun pairs (ANPs) extracted from the dataset images and treats the weighted adjectives as image auxiliary information. Multimodal feature fusion is then achieved by a gating mechanism that dynamically controls how much of the image and the image auxiliary information enters the fusion stage. Experimental results demonstrate that the TIGFM model achieves competitive results on two Twitter datasets, which validates the effectiveness of the proposed method.
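
A minimal PyTorch sketch of the gating idea described above follows. The module name TextImageGate, the tensor names h_text, h_img and h_anp, and the additive fusion form are illustrative assumptions chosen for exposition, not the authors' released TIGFM implementation.

import torch
import torch.nn as nn

class TextImageGate(nn.Module):
    """Hypothetical gate that weighs image features and ANP-based auxiliary
    features against the aspect-aware text representation before fusion."""
    def __init__(self, dim: int):
        super().__init__()
        self.img_gate = nn.Linear(2 * dim, dim)   # gate conditioned on text + image
        self.anp_gate = nn.Linear(2 * dim, dim)   # gate conditioned on text + ANP adjectives

    def forward(self, h_text, h_img, h_anp):
        # Sigmoid gates conditioned on the text decide how much of each
        # visual signal is allowed into the fused representation.
        g_img = torch.sigmoid(self.img_gate(torch.cat([h_text, h_img], dim=-1)))
        g_anp = torch.sigmoid(self.anp_gate(torch.cat([h_text, h_anp], dim=-1)))
        return h_text + g_img * h_img + g_anp * h_anp

# Usage with a batch of 8 and hidden size 768 (e.g., BERT-base encodings):
gate = TextImageGate(dim=768)
h_text = torch.randn(8, 768)   # aspect-aware text representation
h_img = torch.randn(8, 768)    # projected image features (e.g., from ResNet)
h_anp = torch.randn(8, 768)    # weighted ANP adjective embeddings
print(gate(h_text, h_img, h_anp).shape)   # torch.Size([8, 768])

Conditioning each gate on the text representation is one way to suppress image content that is semantically irrelevant to the text, which is precisely the interference from naive concatenation that the abstract identifies.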

Key words: Multimodal aspect-based sentiment analysis, Gated fusion mechanism, Adjective-noun pairs, Image auxiliary information, Semantic relevance

CLC Number: TP391