Computer Science ›› 2023, Vol. 50 ›› Issue (5): 170-176. doi: 10.11896/jsjkx.220400085

• Computer Graphics & Multimedia •

SSD Object Detection Algorithm with Residual Learning and Cyclic Attention

JIA Tianhao, PENG Li   

  1. Engineering Research Center of Internet of Things Technology Applications, School of IoT Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Received: 2022-04-11 Revised: 2022-09-13 Online: 2023-05-15 Published: 2023-05-06
  • About author: JIA Tianhao, born in 1996, postgraduate. His main research interests include computer vision and deep learning.
    PENG Li, born in 1967, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include visual Internet of Things, action recognition and deep learning.
  • Supported by:
    National Natural Science Foundation of China (61873112, 61802107) and Taizhou Development and Reform Commission Foundation Project (2106-331000-04-04-295510).

Abstract: The shallow feature layers in the feature pyramid of the Single Shot MultiBox Detector (SSD) carry too little semantic information, which leads to poor small-object detection. To address this problem, an SSD object detection algorithm based on residual learning and cyclic attention is proposed. First, ResNet-101, which has a stronger learning capacity, is used as the backbone network to extract effective feature information. Next, a lightweight one-way feature fusion block fuses the deep feature layers of the original feature pyramid with the shallow feature layers to generate a new feature pyramid, enriching the semantic information of the feature layers used for prediction. Finally, a new spatial pooling strategy is proposed and combined with the skip connections of residual networks to form a cyclic attention module, which introduces global contextual information and establishes full-image dependencies for local features. To address the imbalance between hard and easy samples, focal loss is used as the regression loss function. Experimental results show that the algorithm achieves a mean average precision (mAP) of 79.7% on the PASCAL VOC public dataset, 2.5 percentage points higher than SSD, and an mAP of 30.0% on the MS COCO public dataset, 4.9 percentage points higher than SSD.
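The abstract names focal loss as the remedy for the imbalance between hard and easy samples but does not give its formulation. The following is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), and is not the authors' implementation. The function names and the default values alpha = 0.25 and gamma = 2.0 are assumptions chosen for illustration; the paper's actual settings are not stated here.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Plain binary cross-entropy for one predicted probability p and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0)           # clamp to avoid log(0)
    return -math.log(p_t)


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss (illustrative defaults, assumed rather than taken from the paper).

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified (easy)
    samples, so hard samples dominate the total loss.
    """
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)           # clamp to avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = focal_loss(0.95, 1)   # confident, correct prediction
    hard = focal_loss(0.10, 1)   # confident, wrong prediction
    print(f"easy sample loss: {easy:.6f}")
    print(f"hard sample loss: {hard:.6f}")
    print(f"hard/easy ratio under focal loss:    {hard / easy:.1f}")
    print(f"hard/easy ratio under cross-entropy: "
          f"{cross_entropy(0.10, 1) / cross_entropy(0.95, 1):.1f}")
```

Comparing the two printed ratios shows the intended effect: relative to cross-entropy, the focal term widens the gap between hard and easy samples, so the many easy samples contribute far less to the total loss. With gamma = 0 and alpha = 0.5 the function reduces to half the cross-entropy.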

Key words: Object detection, Residual learning, Deep learning, Attention mechanism, Feature fusion

CLC Number: TP391.4