SSD Object Detection Algorithm with Cross-layer Fusion and Receptive Field Amplification

doi:10.11896/jsjkx.211100281

Abstract

To address the lack of information interaction between different layers of the single shot multibox detector (SSD) and the model's limited receptive field, an improved SSD object detection algorithm, named ESSD (enhanced SSD), is proposed to improve detection accuracy. First, building on the original multi-scale feature maps of the SSD model and the idea of feature pyramid networks (FPN), a cross-layer information interaction module is designed, which enriches the semantic information of each layer and reduces the information gap between layers. Then, to enlarge the receptive field and improve the multi-scale detection capability of the model, a receptive field amplification module is designed. Finally, batch normalization layers are used to reduce training time and speed up convergence. To evaluate ESSD, experiments are conducted on the PASCAL VOC2007 and PASCAL VOC2012 test sets. On PASCAL VOC2007, ESSD reaches an mAP of 82.1% at a detection speed of 15.7 FPS, which is 2.3 percentage points higher than the original SSD512; on the PASCAL VOC2012 test set, its mAP reaches 80.6%, which is 2.1 percentage points higher than SSD512. The experiments show that the ESSD detector can still meet real-time requirements while achieving high detection accuracy.
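The abstract does not say how the receptive field amplification module is built. As a rough illustration only, the sketch below assumes dilated convolutions, a common way to enlarge the receptive field without adding parameters, and computes the receptive field of a small stack of layers. The function names and the layer configurations are hypothetical and are not taken from the paper.

```python
# Illustrative only: assumes the receptive field is enlarged with dilated
# convolutions. The layer stacks below are made-up examples, not the ESSD design.

def effective_kernel(kernel: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of conv layers.

    Each layer is a (kernel, stride, dilation) tuple. Uses the standard
    recurrence: rf += (k_eff - 1) * jump, then jump *= stride.
    """
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        rf += (effective_kernel(kernel, dilation) - 1) * jump
        jump *= stride
    return rf


# Two plain 3x3 convolutions versus a 3x3 followed by a 3x3 with dilation 3.
plain = [(3, 1, 1), (3, 1, 1)]
dilated = [(3, 1, 1), (3, 1, 3)]

print(receptive_field(plain))    # 5
print(receptive_field(dilated))  # 9
```

With the same number of weights, the dilated stack sees a 9x9 input region instead of 5x5, which is the kind of gain a receptive field amplification module is after.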

Key words: Object detection, Information fusion, Receptive field, Multi-scale, SSD

CLC Number: TP391

ZHANG Weiliang, CHEN Xiuhong. SSD Object Detection Algorithm with Cross-layer Fusion and Receptive Field Amplification [J]. Computer Science, 2023, 50(3): 231-237.

