Computer Science, 2023, Vol. 50, Issue (3): 231-237. doi: 10.11896/jsjkx.211100281
ZHANG Weiliang (张卫良), CHEN Xiuhong (陈秀宏)
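Abstract: Because the different layers of SSD (Single Shot Multibox Detector) exchange little information and the model's receptive field is limited, this paper proposes an improved SSD object detection algorithm, ESSD (Enhanced SSD), to improve detection accuracy. First, building on the multi-scale feature maps already present in the SSD model and following the idea of FPN (Feature Pyramid Networks), a cross-layer information interaction module is designed; it strengthens the semantic information of the different layers while reducing the information gap between them. Second, a receptive field expansion module is designed to enlarge the model's receptive field and improve its multi-scale detection ability. Finally, batch normalization layers are used to speed up convergence and thereby shorten training time. To evaluate ESSD, experiments were conducted on the PASCAL VOC2007 and PASCAL VOC2012 test sets. On PASCAL VOC2007, ESSD reaches an mAP of 82.1% at a detection speed of 15.7 FPS, 2.3% higher than the original SSD512; on the PASCAL VOC2012 test set it reaches an mAP of 80.6%, 2.1% higher than SSD512. The experiments show that the ESSD detector achieves high detection accuracy while still meeting real-time requirements.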
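The abstract does not give the internals of the cross-layer information interaction module, so the following is only a minimal sketch of the general FPN idea it builds on: a deeper, coarser feature map is upsampled and added to a shallower, finer one so that the shallow layer receives semantic information from the deep layer. The names `upsample_nearest_2x` and `fuse_top_down` are hypothetical, the maps are single-channel toy grids, the deep map is assumed to be exactly half the size of the shallow one, and the lateral convolutions a real FPN would apply are omitted.

```python
from typing import List

# A single-channel feature map stored as H rows of W values (toy stand-in for a tensor).
FeatureMap = List[List[float]]


def upsample_nearest_2x(fmap: FeatureMap) -> FeatureMap:
    """Double height and width by repeating every value (nearest-neighbour upsampling)."""
    out: FeatureMap = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))  # repeat the widened row vertically
    return out


def fuse_top_down(shallow: FeatureMap, deep: FeatureMap) -> FeatureMap:
    """Add the upsampled deep (coarse) map to the shallow (fine) map element-wise.

    Assumption: the shallow map is exactly twice the size of the deep map.
    """
    up = upsample_nearest_2x(deep)
    if len(up) != len(shallow) or len(up[0]) != len(shallow[0]):
        raise ValueError("shallow map must be exactly 2x the size of the deep map")
    return [
        [s + u for s, u in zip(s_row, u_row)]
        for s_row, u_row in zip(shallow, up)
    ]


# Toy inputs: a 4x4 shallow map of ones and a 2x2 deep map.
shallow = [[1.0, 1.0, 1.0, 1.0] for _ in range(4)]
deep = [[10.0, 20.0], [30.0, 40.0]]
fused = fuse_top_down(shallow, deep)

for row in fused:
    print(row)
```

Each 2x2 block of the fused map carries the value of one deep-layer cell plus the original shallow values, which is the sense in which information flows across layers. How ESSD actually combines, weights, or convolves the fused maps is not stated in the abstract.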