Computer Science, 2023, Vol. 50, Issue (5): 170-176. doi: 10.11896/jsjkx.220400085
贾天豪, 彭力
JIA Tianhao, PENG Li
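Abstract: The shallow features generated in the feature pyramid of Single-Shot Detection (SSD) carry insufficient semantic information, which leads to poor detection performance on small objects. To address this problem, an SSD object detection algorithm based on residual learning and cyclic attention is proposed. First, the backbone network adopts ResNet101, which has stronger learning capacity, to extract effective feature information. Then, a lightweight unidirectional feature fusion block is built to fuse the deep and shallow feature layers of the original feature pyramid and generate a new feature pyramid, enriching the semantic information of the effective feature layers used for prediction. Finally, a new spatial pooling strategy is proposed and combined with the skip connections of the residual network to form a cyclic attention module, which introduces global context information and establishes global associations for local features. To address the imbalance between the numbers of hard and easy samples, Focal loss is used as the regression loss function. Experimental results show that on the PASCAL VOC public dataset the mean average precision (mAP) of the algorithm is 79.7%, 2.5 percentage points higher than SSD, and on the MS COCO public dataset the mAP is 30.0%, 4.9 percentage points higher than SSD.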
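The abstract names Focal loss as the remedy for the imbalance between hard and easy samples but gives no formula or hyperparameters. As a minimal sketch, the standard binary classification form of focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), can be written as below. The function names and the values gamma = 2.0 and alpha = 0.25 are assumptions chosen for illustration (they are commonly used defaults), not settings taken from this paper, and the paper's own formulation may differ.

```python
import math


def cross_entropy(p, y):
    """Plain binary cross-entropy for one prediction (baseline for comparison)."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    p     : predicted probability of the positive class, strictly in (0, 1)
    y     : ground-truth label, 1 (object) or 0 (background)
    gamma : focusing parameter (assumed value); larger values down-weight easy samples more
    alpha : class-balancing weight for the positive class (assumed value)
    """
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # weight for the true class
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (confident, correct) versus a hard positive (confident, wrong).
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
print(f"easy sample loss: {easy:.6f}")
print(f"hard sample loss: {hard:.6f}")
```

The factor (1 - p_t)^gamma shrinks the loss of well-classified samples much more than that of misclassified ones, so the many easy samples contribute far less to the total loss than under plain cross-entropy. With gamma = 0 the expression reduces to alpha-weighted cross-entropy.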