Computer Science ›› 2021, Vol. 48 ›› Issue (6A): 122-126. doi: 10.11896/jsjkx.201100026

• Image Processing & Multimedia Technology •

Cross Media Retrieval Method Based on Residual Attention Network

FENG Jiao, LU Chang-yu   

  1. College of Electronic and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
  • Online: 2021-06-10 Published: 2021-06-17
  • About author: FENG Jiao, Ph.D, associate professor. Her main research interests include signal processing and deep learning.
  • Supported by:
    National Natural Science Foundation of China (61501244).

Abstract: With the rapid development of multimedia technology, cross-media retrieval has gradually replaced traditional single-media retrieval as the mainstream mode of information retrieval. Existing cross-media retrieval methods are highly complex and fail to fully mine the fine-grained characteristics of the data, which introduces deviations into the mapping process and makes it difficult to learn accurate cross-media associations. To address these problems, this paper proposes a cross-media retrieval method based on a residual attention network (CR-RAN). First, in order to better extract the key features of different media data and to simplify the cross-media retrieval model, a residual neural network incorporating the attention mechanism is proposed. Then, a joint loss function for cross-media retrieval is proposed, which constrains the mapping process of the network, thereby enhancing its semantic discrimination ability and improving retrieval accuracy. Experimental results show that, compared with several existing methods, the proposed method better learns the associations between different media data and effectively improves the accuracy of cross-media retrieval.
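To make the two ideas in the abstract concrete, the sketch below (PyTorch-style Python, not the authors' implementation) illustrates a residual block whose output is re-weighted by a squeeze-and-excitation style channel attention, and a joint objective that combines a semantic classification term with a ranking term pulling paired image/text embeddings together. The class names, the margin, and the weight alpha are illustrative assumptions rather than details taken from the paper.

# Minimal sketch of a residual attention block and a joint loss; hypothetical names,
# only loosely based on the method described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (illustrative)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (N, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))          # global average pool -> (N, C)
        return x * w.view(x.size(0), -1, 1, 1)   # re-weight channels

class ResidualAttentionBlock(nn.Module):
    """Residual block whose convolutional branch is gated by channel attention."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.attn = ChannelAttention(channels)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = self.attn(self.conv2(out))
        return F.relu(out + x)                   # identity shortcut

def joint_loss(img_emb, txt_emb, logits_img, logits_txt, labels,
               margin=0.2, alpha=1.0):
    """Illustrative joint objective: semantic classification terms plus a
    ranking term that pulls matched image/text embeddings together."""
    cls = F.cross_entropy(logits_img, labels) + F.cross_entropy(logits_txt, labels)
    sim = F.cosine_similarity(img_emb, txt_emb)             # matched pairs
    neg = F.cosine_similarity(img_emb, txt_emb.roll(1, 0))  # mismatched pairs
    rank = F.relu(margin - sim + neg).mean()
    return cls + alpha * rank

In this sketch the classification terms supply the semantic discrimination described in the abstract, while the margin-based ranking term constrains the mapping so that matched image/text pairs lie closer in the shared space than mismatched ones.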

Key words: Attention mechanism, Cross-media retrieval, Joint loss function, Residual neural network

CLC Number: TP391