Computer Science ›› 2023, Vol. 50 ›› Issue (3): 298-306. doi: 10.11896/jsjkx.220100156
陈真1, 普园媛1,2, 赵征鹏1, 徐丹1, 钱文华1
CHEN Zhen1, PU Yuanyuan1,2, ZHAO Zhengpeng1, XU Dan1, QIAN Wenhua1
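Abstract: Multimodal sentiment analysis aims to achieve reliable and robust sentiment analysis by exploiting the complementary information provided by multiple modalities. In recent years, extracting deep semantic features with neural networks has produced notable results on multimodal sentiment analysis tasks. How features at different levels of the multimodal information are fused is also an important factor in sentiment analysis performance. This paper therefore proposes a multimodal sentiment analysis model based on adaptive gated information fusion (AGIF). First, a gated information fusion network fuses the visual and color features extracted at different levels by Swin Transformer and ResNet, weighting them according to their contribution to sentiment analysis. Second, because sentiment is abstract and complex, the sentiment of an image is often conveyed by several subtle local regions, and iterative attention can precisely locate these sentiment-discriminative regions based on past information. To address the inability of Word2Vec and GloVe to handle polysemy, the recent ERNIE pre-trained model is adopted for the text modality. Finally, an auto-fusion network "dynamically" fuses the features of each modality, which resolves the information redundancy introduced when the multimodal joint representation is built with deterministic operations (concatenation or TFN). Extensive experiments on three public real-world datasets demonstrate the effectiveness of the model.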
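The idea of gating two feature vectors by their contribution can be illustrated with a minimal sketch. This is not the AGIF implementation: the function name `gated_fusion`, the single linear gate, the feature dimension, and the random weights are all assumptions made for illustration. The abstract does not specify the gate's form. The sketch assumes a sigmoid gate computed from the concatenated inputs, which then mixes the two vectors element by element.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function mapping a real score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(a, b, weights, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed (hypothetical) formulation:
        gate_i  = sigmoid(weights[i] . concat(a, b) + bias[i])
        fused_i = gate_i * a_i + (1 - gate_i) * b_i
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    concat = list(a) + list(b)
    fused = []
    for i in range(len(a)):
        # Linear score over both inputs decides how much of a_i to keep.
        score = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        gate = sigmoid(score)
        fused.append(gate * a[i] + (1.0 - gate) * b[i])
    return fused


# Toy usage with made-up "visual" and "color" features (illustrative only).
rng = random.Random(0)
dim = 4
visual = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
color = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
weights = [[rng.gauss(0.0, 0.1) for _ in range(2 * dim)] for _ in range(dim)]
bias = [0.0] * dim
fused = gated_fusion(visual, color, weights, bias)
print(fused)
```

Because each gate lies strictly between 0 and 1, every fused component falls between the corresponding components of the two inputs. In a trained model the gate weights would be learned, so the mix would adapt to each input rather than being fixed as in concatenation.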