Computer Science ›› 2025, Vol. 52 ›› Issue (7): 241-247. doi: 10.11896/jsjkx.240600126

• Artificial Intelligence •


Confidence-guided Prompt Learning for Multimodal Aspect-level Sentiment Analysis

LI Maolin, LIN Jiajie, YANG Zhenguo   

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2024-06-21  Revised: 2024-09-26  Published: 2025-07-17
  • Corresponding author: YANG Zhenguo (yzg@gdut.edu.cn)
  • About author: LI Maolin, born in 2001, postgraduate, is a member of CCF (No. V1444G). His main research interests include deep learning and sentiment analysis. E-mail: gdutlml@gmail.com.
    YANG Zhenguo, born in 1988, Ph.D, associate professor, is a member of CCF (No. 77775M). His main research interests include online event detection and domain adaptation.
  • Supported by:
    Guangdong Basic and Applied Basic Research Foundation (2024A1515010237).


Abstract: With the increasing volume of data from social media platforms, multimodal aspect-level sentiment analysis is crucial for understanding the underlying emotions of users. Existing research focuses primarily on fusing the image and text modalities for sentiment analysis, but such methods fail to effectively capture the implicit emotions in images and text. Furthermore, traditional approaches are constrained by the black-box nature of their models and thus lack interpretability. To address these issues, this paper proposes a confidence-guided prompt learning (CPL) model for multimodal aspect-level sentiment analysis, which consists of four components: a multimodal feature processing module (MF), a confidence-based gating module (CG), a prompt construction module (PC), and a multimodal classification module (MC). The multimodal feature processing module extracts features from the multimodal data. The confidence-based gating module evaluates the classification difficulty of each sample through the confidence of a self-attention network and processes samples adaptively according to their difficulty. The prompt construction module builds adaptive prompt templates for samples of different difficulty levels to guide the T5 large language model in generating auxiliary sentiment cues. Finally, the multimodal classification module produces the sentiment prediction. Experimental results on the public Twitter-2015 and Twitter-2017 datasets show that, compared with existing baseline methods, the proposed model achieves significant performance improvements, with accuracy gains of 0.48% and 1.06%, respectively.
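
To make the pipeline concrete, the following is a minimal PyTorch sketch of the confidence-guided gating (CG) and adaptive prompt construction (PC) steps described above. All module names, dimensions, prompt wordings, and the 0.8 confidence threshold are illustrative assumptions, not the authors' released implementation; the random tensors stand in for real fused image-text features from the MF module.

```python
# Sketch of confidence-guided gating + adaptive prompt construction.
# Assumptions: 768-d fused features, 3 sentiment classes, threshold 0.8.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConfidenceGate(nn.Module):
    """Scores each fused sample with a preliminary classifier and routes it
    as 'easy' or 'hard' based on its maximum softmax probability."""
    def __init__(self, dim: int, num_classes: int = 3, threshold: float = 0.8):
        super().__init__()
        # Single self-attention layer over the fused token sequence (assumed).
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.cls_head = nn.Linear(dim, num_classes)
        self.threshold = threshold  # assumed easy/hard cut-off

    def forward(self, fused: torch.Tensor):
        # fused: (batch, seq_len, dim) image-text features from the MF module.
        attended, _ = self.attn(fused, fused, fused)
        pooled = attended.mean(dim=1)               # (batch, dim)
        logits = self.cls_head(pooled)              # preliminary prediction
        confidence = F.softmax(logits, dim=-1).max(dim=-1).values
        is_easy = confidence >= self.threshold      # boolean routing mask
        return logits, confidence, is_easy

def build_prompt(sentence: str, aspect: str, easy: bool) -> str:
    """Adaptive prompt construction: hard samples get a more elaborate
    template that asks the LLM for explicit auxiliary sentiment cues."""
    if easy:
        return f"Sentence: {sentence} What is the sentiment toward {aspect}?"
    return (f"Sentence: {sentence} The sentiment toward {aspect} is subtle. "
            f"List the clues in the text and the image caption that reveal it, "
            f"then state the sentiment (positive, neutral, or negative).")

# Toy usage with random features standing in for real image-text fusion.
gate = ConfidenceGate(dim=768)
fused = torch.randn(2, 16, 768)
logits, conf, easy = gate(fused)
print(build_prompt("The new phone camera is stunning", "camera", easy[0].item()))
```

In a full system, the returned prompt would be fed to a T5 model (e.g., via Hugging Face's T5ForConditionalGeneration.generate) to produce the auxiliary sentiment cues that the MC module consumes alongside the fused features when making the final prediction.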

Key words: Multimodal data, Large language models, Sentiment classification, Prompt learning, Classification confidence

CLC number: TP311