Computer Science, 2025, Vol. 52, Issue (9): 269-275. doi: 10.11896/jsjkx.240700136
• Computer Graphics & Multimedia •
WANG Yuanlong, ZHANG Ningqian, ZHANG Hu