Computer Science, 2023, Vol. 50, Issue (1): 243-252. doi: 10.11896/jsjkx.220700112
RONG Huan (荣欢)1, QIAN Minfeng (钱敏峰)2, MA Tinghuai (马廷淮)2, SUN Shengjie (孙圣杰)2
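Abstract: Object detection is one of the most active research directions in computer vision and is widely used in important fields such as military and medical applications. However, most object detection models can only recognize visible objects. Images from daily life often contain occluded (invisible) target objects, and existing object detection models struggle to reach satisfactory detection performance on them. To address this, this paper proposes IMG-KGR-MAC, a multi-agent collaborative model that infers the category of occluded objects in an image using a prior knowledge graph built from an image gallery. Specifically: 1) IMG-KGR-MAC builds a global prior knowledge graph from the visible objects in all images of a given gallery and the positional relations among them; it also builds a separate image knowledge graph for each image from the objects that image contains and their positional relations. Information about occluded objects in any image is excluded from both the global prior knowledge graph and the image's own knowledge graph. 2) Following the idea of DDPG (Deep Deterministic Policy Gradient) deep reinforcement learning, two cooperating agents are constructed. Agent 1 uses the semantic information of the current image to select, from the global prior knowledge graph, the "category label" that best fits the occluded object, and adds it as a new entity node to the image's own knowledge graph. Agent 2 takes the entity newly added by agent 1 and further selects 〈entity, relation〉 pairs from the global prior knowledge graph to expand the graph structure associated with the new entity node. 3) Agent 1 and agent 2 share the task environment and communicate through their reward values. They cooperate to perform forward and backward reasoning, following the principles "occluded object (entity) → associated graph structure" and "associated graph structure → occluded object (entity)", and thereby estimate the most probable category label of the occluded object in a given image. Experimental results show that, compared with existing related methods, the proposed IMG-KGR-MAC model learns the semantic relations between the occluded object of a given image and the global prior knowledge graph, overcomes the difficulty existing models have in detecting occluded objects, shows good reasoning ability for occluded objects, and improves several metrics, including MR (Mean Rank) and mAP (Mean Average Precision), by more than 20%.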
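Step 1 of the abstract (building the per-image graphs and the global prior graph while leaving out occluded objects) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The annotation format, the names `gallery`, `build_image_graph` and `build_graphs`, and the choice to store a count per triple in the global graph are all assumptions made here for illustration.

```python
from collections import defaultdict

# Hypothetical annotation format (an assumption, not taken from the paper):
# each image lists (subject, relation, object) triples and the set of
# object labels that are occluded in that image.
gallery = {
    "img1": {
        "triples": [("person", "on", "horse"), ("horse", "near", "fence")],
        "occluded": {"fence"},
    },
    "img2": {
        "triples": [("person", "on", "horse"), ("person", "holding", "rope")],
        "occluded": set(),
    },
}


def build_image_graph(triples, occluded):
    """Keep only triples whose two endpoints are both visible objects."""
    return {
        (s, r, o)
        for (s, r, o) in triples
        if s not in occluded and o not in occluded
    }


def build_graphs(gallery):
    """Return the per-image graphs and a global prior graph.

    The global graph maps each visible triple to the number of images
    it appears in (the counting is an illustrative assumption).
    """
    image_graphs = {}
    global_graph = defaultdict(int)
    for image_id, annotation in gallery.items():
        graph = build_image_graph(annotation["triples"], annotation["occluded"])
        image_graphs[image_id] = graph
        for triple in graph:
            global_graph[triple] += 1
    return image_graphs, dict(global_graph)


image_graphs, global_graph = build_graphs(gallery)
```

In this sketch the triple involving the occluded `fence` is dropped from both the graph of `img1` and the global graph, which mirrors the exclusion rule stated in the abstract. The two agents would then operate on `image_graphs` and `global_graph`; their DDPG policies are outside the scope of this example.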