Computer Science, 2021, Vol. 48, Issue (4): 157-163. doi: 10.11896/jsjkx.200300146
• Computer Graphics & Multimedia •
WANG Xi1, ZHANG Kai1, LI Jun-hui1, KONG Fang1, ZHANG Yi-tian2