Computer Science, 2022, Vol. 49, Issue (7): 106-112. doi: 10.11896/jsjkx.210500224
• Computer Graphics & Multimedia •
ZENG Zhi-xian, CAO Jian-jun, WENG Nian-feng, JIANG Guo-quan, XU Bin