Computer Science, 2022, Vol. 49, Issue (10): 191-197. doi: 10.11896/jsjkx.220600009
• Computer Graphics & Multimedia •
WANG Ming-zhan, JI Jun-zhong, JIA Ao-zhe, ZHANG Xiao-dan
[1] WU Zi-yi, LI Shao-mei, JIANG Meng-han, ZHANG Jian-peng. Ontology Alignment Method Based on Self-attention [J]. Computer Science, 2022, 49(9): 215-220.
[2] FANG Yi-qiu, ZHANG Zhen-kun, GE Jun-wei. Cross-domain Recommendation Algorithm Based on Self-attention Mechanism and Transfer Learning [J]. Computer Science, 2022, 49(8): 70-77.
[3] CHEN Kun-feng, PAN Zhi-song, WANG Jia-bao, SHI Lei, ZHANG Jin. Moderate Clothes-Changing Person Re-identification Based on Bionics of Binocular Summation [J]. Computer Science, 2022, 49(8): 165-171.
[4] JIN Fang-yan, WANG Xiu-li. Implicit Causality Extraction of Financial Events Integrating RACNN and BiLSTM [J]. Computer Science, 2022, 49(7): 179-186.
[5] ZHANG Jia-hao, LIU Feng, QI Jia-yin. Lightweight Micro-expression Recognition Architecture Based on Bottleneck Transformer [J]. Computer Science, 2022, 49(6A): 370-377.
[6] CHEN Zhang-hui, XIONG Yun. Stylized Image Captioning Model Based on Disentangle-Retrieve-Generate [J]. Computer Science, 2022, 49(6): 180-186.
[7] ZHAO Dan-dan, HUANG De-gen, MENG Jia-na, DONG Yu, ZHANG Pan. Chinese Entity Relations Classification Based on BERT-GRU-ATT [J]. Computer Science, 2022, 49(6): 319-325.
[8] HAN Jie, CHEN Jun-fen, LI Yan, ZHAN Ze-cong. Self-supervised Deep Clustering Algorithm Based on Self-attention [J]. Computer Science, 2022, 49(3): 134-143.
[9] FANG Zhong-jun, ZHANG Jing, LI Dong-dong. Spatial Encoding and Multi-layer Joint Encoding Enhanced Transformer for Image Captioning [J]. Computer Science, 2022, 49(10): 151-158.
[10] HU Yan-li, TONG Tan-qian, ZHANG Xiao-yu, PENG Juan. Self-attention-based BGRU and CNN for Sentiment Analysis [J]. Computer Science, 2022, 49(1): 252-258.
[11] HU De-feng, ZHANG Chen-xi, WANG Shi-tao, ZHAO Qin-pei, LI Jiang-feng. Intelligent Prediction Model of Tool Wear Based on Deep Signal Processing and Stacked-ResGRU [J]. Computer Science, 2021, 48(6): 175-183.
[12] WANG Xi, ZHANG Kai, LI Jun-hui, KONG Fang, ZHANG Yi-tian. Generation of Image Caption of Joint Self-attention and Recurrent Neural Network [J]. Computer Science, 2021, 48(4): 157-163.
[13] ZHOU Xiao-shi, ZHANG Zi-wei, WEN Juan. Natural Language Steganography Based on Neural Machine Translation [J]. Computer Science, 2021, 48(11A): 557-564.
[14] ZHANG Shi-hao, DU Sheng-dong, JIA Zhen, LI Tian-rui. Medical Entity Relation Extraction Based on Deep Neural Network and Self-attention Mechanism [J]. Computer Science, 2021, 48(10): 77-84.
[15] YU Wen-jia, DING Shi-fei. Conditional Generative Adversarial Network Based on Self-attention Mechanism [J]. Computer Science, 2021, 48(1): 241-246.