Computer Science, 2021, Vol. 48, Issue (6A): 122-126. doi: 10.11896/jsjkx.201100026
FENG Jiao, LU Chang-yu
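Abstract: With the rapid development of multimedia technology, cross-media retrieval is gradually replacing traditional single-media retrieval as the mainstream form of information retrieval. Existing cross-media retrieval methods are highly complex and cannot fully mine the fine-grained features of the data. They also introduce offsets during the mapping process, which makes it difficult to learn accurate correlations between data. To address these problems, a cross-media retrieval method based on a residual attention network is proposed. First, to better extract the key features of different media data while simplifying the cross-media retrieval model, a residual neural network that incorporates an attention mechanism is proposed. Then, a joint loss function for cross-media retrieval is proposed, which constrains the mapping process of the network to strengthen its semantic discrimination ability and improve retrieval accuracy. Experimental results show that, compared with several existing methods, the proposed residual-attention-based method learns the correlations between different media data well and effectively improves the accuracy of cross-media retrieval.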
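The abstract does not specify the layers of the residual attention network, so the following is only a minimal sketch of the general idea of a residual block whose branch is reweighted by an attention gate, y = x + a * F(x). The function names, the choice of ReLU and sigmoid, and the weight matrices `w_feat` and `w_attn` are all assumptions made for illustration and are not taken from the paper.

```python
import math


def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def sigmoid(v):
    """Element-wise logistic function; fine for the small values used here."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]


def residual_attention_block(x, w_feat, w_attn):
    """Hypothetical block computing y = x + a * F(x).

    F(x) = relu(w_feat @ x) is the residual branch, and
    a = sigmoid(w_attn @ F(x)) is a per-feature attention gate in (0, 1).
    Both weight matrices are assumed square so the shortcut can be added.
    """
    f = relu(matvec(w_feat, x))      # residual branch F(x)
    a = sigmoid(matvec(w_attn, f))   # attention weights computed from F(x)
    return [x_i + a_i * f_i for x_i, a_i, f_i in zip(x, a, f)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # With zero attention weights the gate is 0.5 for every feature.
    print(residual_attention_block([1.0, -2.0], identity, zeros))
```

With a zero `w_feat` the branch vanishes and the block reduces to the identity shortcut, which is the property that makes residual networks easy to stack. In a cross-media setting, one such stack per media type would map its input into a shared space, but how the paper arranges this is not stated in the abstract.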