Computer Science, 2023, Vol. 50, Issue (2): 123-129. doi: 10.11896/jsjkx.211200303
• Database & Big Data & Data Science •
ZOU Yunzhu, DU Shengdong, TENG Fei, LI Tianrui