Computer Science (计算机科学) ›› 2026, Vol. 53 ›› Issue (1): 187-194. doi: 10.11896/jsjkx.241100029
卜韵阳, 齐彬廷, 卜凡亮
BU Yunyang, QI Binting, BU Fanliang
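Abstract: On social media, a user's comment usually describes a particular emotional region of the accompanying image, so the image and the text carry corresponding information. Most previous multimodal sentiment analysis methods explore the mutual influence of image and text from a single view only, capturing correspondences between image regions and text words, which leads to suboptimal results. In addition, social media data is strongly subjective, and the sentiment it expresses is multidimensional and complex, which produces samples whose image and text sentiment are only weakly consistent. To address these problems, a multimodal sentiment analysis model with dual-view interactive fusion under cross-modal inconsistency perception is proposed. On the one hand, cross-modal interaction between image and text features is performed from both a global and a local view, giving a more comprehensive and accurate sentiment analysis and improving model performance. On the other hand, an inconsistency score is computed from the image and text features to represent the degree of image-text inconsistency, and this score dynamically adjusts the weights of the unimodal and multimodal representations in the final sentiment feature, improving the robustness of the model. Extensive experiments on two public datasets, MVSA-Single and MVSA-Multiple, show that the proposed model improves F1 over existing baseline models by 0.59 and 0.39 percentage points, respectively, demonstrating its effectiveness and superiority.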
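The inconsistency-aware weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the score is computed or how the weights are applied, so everything below is an assumption made for illustration: the score is taken as a rescaled cosine distance between the image and text feature vectors, the unimodal representation is their element-wise mean, and a higher score shifts weight toward the unimodal representation. The function names `cosine_similarity`, `inconsistency_score`, and `fuse` are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def inconsistency_score(image_feat, text_feat):
    """Assumed score: map cosine similarity in [-1, 1] to inconsistency in [0, 1]."""
    return (1.0 - cosine_similarity(image_feat, text_feat)) / 2.0


def fuse(image_feat, text_feat, multimodal_feat):
    """Blend unimodal and multimodal representations using the score as a weight.

    Assumption: when image and text disagree (score near 1), the fused
    multimodal feature is trusted less and the unimodal feature more.
    """
    s = inconsistency_score(image_feat, text_feat)
    # Assumed unimodal representation: element-wise mean of the two modalities.
    unimodal = [(a + b) / 2.0 for a, b in zip(image_feat, text_feat)]
    return [s * u + (1.0 - s) * m for u, m in zip(unimodal, multimodal_feat)]
```

With identical image and text features the score is 0 and `fuse` returns the multimodal feature unchanged; with opposite features the score is 1 and only the unimodal representation contributes. A trained model would more plausibly learn the score and the weighting from data, so this shows only the general shape of the idea.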