Computer Science, 2020, Vol. 47, Issue (11): 250-254. doi: 10.11896/jsjkx.190800154

• Artificial Intelligence •

Visual Sentiment Prediction with Visual Semantic Embedding and Attention Mechanism

LAN Yi-lun, MENG Min, WU Ji-gang   

  1. Department of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2019-08-29 Revised: 2019-11-22 Online: 2020-11-15 Published: 2020-11-05
  • About author: LAN Yi-lun, born in 1995, postgraduate. His main research interests include visual sentiment prediction and image classification.
    MENG Min, born in 1985, Ph.D., associate professor, postgraduate supervisor, is a member of China Computer Federation. Her main research interests include image processing and machine learning.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61702114) and Guangdong Key R&D Project of China (2019B010121001).

Abstract: To bridge the semantic gap between visual features and sentiments and to reduce the impact of sentiment-irrelevant image regions, this paper presents a novel visual sentiment prediction method that integrates visual semantic embedding with an attention mechanism. First, the method employs an auto-encoder to learn a joint embedding of image features and semantic features, narrowing the gap between low-level visual features and high-level semantic features. Second, a set of salient region features is extracted as input to the attention model, which relates the salient regions to the joint embedding features in order to discover sentiment-relevant regions. Finally, a sentiment classifier is built on top of these regions for visual sentiment prediction. Experimental results show that the proposed method significantly improves classification performance on test samples and outperforms state-of-the-art algorithms for visual sentiment analysis.
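
The attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product scoring, the function names `softmax` and `attend`, and the toy 2-D features are all assumptions made for illustration, since the abstract does not specify the scoring function or feature dimensions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_feats, joint_embedding):
    """Weight each region by its similarity to the joint embedding.

    Dot-product scoring is an assumption; the abstract does not state
    which scoring function the attention model uses.
    """
    scores = [sum(r * q for r, q in zip(region, joint_embedding))
              for region in region_feats]
    weights = softmax(scores)
    dim = len(joint_embedding)
    # Attention-weighted sum of the region features.
    pooled = [sum(w * region[d] for w, region in zip(weights, region_feats))
              for d in range(dim)]
    return weights, pooled


# Toy data (hypothetical): three salient regions with 2-D features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
embedding = [2.0, 0.0]
weights, pooled = attend(regions, embedding)
```

In this toy setup the first and third regions align with the embedding and receive equal, larger weights than the second. In a full pipeline, a vector like `pooled` would be passed to the sentiment classifier.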

Key words: Visual sentiment prediction, Visual semantic embedding, Attention mechanism, Salient region detection

CLC Number: 

  • TP391.41