Computer Science ›› 2025, Vol. 52 ›› Issue (7): 241-247. doi: 10.11896/jsjkx.240600126

• Artificial Intelligence •

Confidence-guided Prompt Learning for Multimodal Aspect-level Sentiment Analysis

LI Maolin, LIN Jiajie, YANG Zhenguo   

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2024-06-21  Revised: 2024-09-26  Published: 2025-07-17
  • About author: LI Maolin, born in 2001, postgraduate, is a member of CCF (No. V1444G). His main research interests include deep learning and sentiment analysis.
    YANG Zhenguo, born in 1988, Ph.D, associate professor, is a member of CCF (No. 77775M). His main research interests include online event detection and domain adaptation.
  • Supported by:
    Guangdong Basic and Applied Basic Research Foundation (2024A1515010237).

Abstract: With the increasing volume of data from social media platforms, multimodal aspect-level sentiment analysis is crucial for understanding the underlying emotions of users. Existing research primarily focuses on sentiment analysis by fusing the image and text modalities, but these methods fail to effectively capture the implicit emotions in both images and text. Furthermore, traditional approaches are often constrained by the black-box nature of the models, which lack interpretability. To address these issues, this paper proposes a confidence-guided prompt learning (CPL) based multimodal aspect-level sentiment analysis model, which consists of four key components: a multimodal feature processing module (MF), a confidence-guided gating module (CG), a prompt construction module (PC), and a multimodal classification module (MC). The multimodal feature processing module extracts features from the multimodal data. The confidence-guided gating module evaluates the classification difficulty of samples through confidence assessment with a self-attention network and adaptively processes samples according to their difficulty. The prompt construction module generates adaptive prompt templates for samples of different difficulty levels to guide the T5 large language model in generating auxiliary sentiment cues. The multimodal classification module performs the final sentiment prediction. Experimental results on the public datasets Twitter-2015 and Twitter-2017 show that, compared with existing baseline methods, the proposed multimodal aspect-level sentiment classification model achieves significant performance improvements, with accuracy increases of 0.48% and 1.06%, respectively.
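To make the confidence-guided gating and adaptive prompt construction described in the abstract more concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: the module names, dimensions, pooling scheme, the 0.7 confidence threshold, and the two prompt templates are all illustrative assumptions.

# Hypothetical sketch of confidence-guided gating: a self-attention block scores
# fused image-text features, a small classifier head produces a preliminary
# prediction, and its softmax confidence decides whether a sample is "hard".
import torch
import torch.nn as nn

class ConfidenceGate(nn.Module):
    def __init__(self, dim: int = 768, num_classes: int = 3, threshold: float = 0.7):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.classifier = nn.Linear(dim, num_classes)
        self.threshold = threshold

    def forward(self, fused: torch.Tensor):
        # fused: (batch, seq_len, dim) multimodal features from the MF module
        attn_out, _ = self.self_attn(fused, fused, fused)
        pooled = attn_out.mean(dim=1)                       # (batch, dim)
        probs = torch.softmax(self.classifier(pooled), -1)  # preliminary prediction
        confidence, _ = probs.max(dim=-1)                   # (batch,)
        is_hard = confidence < self.threshold               # low confidence => hard sample
        return confidence, is_hard

def build_prompt(aspect: str, sentence: str, hard: bool) -> str:
    # Adaptive prompt template: hard samples get a richer template asking the
    # LLM (e.g. T5) for auxiliary sentiment cues; easy samples get a short one.
    if hard:
        return (f"Sentence: {sentence} Aspect: {aspect}. "
                "Describe the implicit emotion toward the aspect, then "
                "answer with positive, neutral or negative.")
    return f"Sentence: {sentence} Aspect: {aspect}. The sentiment is <extra_id_0>."

# Example usage: route a batch of fused features, then build one prompt per sample.
gate = ConfidenceGate()
conf, hard = gate(torch.randn(2, 16, 768))
prompts = [build_prompt("food", "The ramen was cold but the staff smiled.", h.item()) for h in hard]

In this sketch the threshold is fixed; in practice it could be tuned on a validation set so that only genuinely ambiguous samples incur the extra cost of querying the language model.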

Key words: Multimodal data, Large language models, Sentiment classification, Prompt learning, Classification confidence

CLC Number: TP311