Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 221100247-9. doi: 10.11896/jsjkx.221100247

• Image Processing & Multimedia Technology •

Study on Scale Adaptive Target Detection Algorithm Based on Improved D2Det

WANG Ling, HUANG Guan, WANG Peng, BAI Yane, QIU Tianheng

  1. School of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China
  • Published: 2023-11-09
  • Corresponding author: WANG Ling (wangling0912@cust.edu.cn)
  • About author: WANG Ling, born in 1979, Ph.D. Her main research interests include machine learning and image processing.
  • Supported by:
    Jilin Provincial Basic Research Project of the Central Leading Local Science and Technology Development Fund (202002038JC).

Abstract: D2Det (Towards High Quality Object Detection and Instance Segmentation) performs poorly on small objects and on objects whose scale varies, and it has a large number of parameters. To address these problems, this paper proposes G-SAD2Det, a scale-adaptive object detection model based on D2Det. First, the data augmentation algorithms CutOut and Mosaic are introduced in the data preprocessing stage to make the model more robust in complex scenes. Second, the feature extraction network ResNet is improved: a multi-scale feature extraction structure is built inside each residual block so that object features are extracted at a finer granularity, and a switchable global context semantic feature extraction module is added to the network, which uses different pooling layers to enhance salient features and global context semantic information. Third, the candidate box generation module is improved so that automatically located object center regions guide the generation of candidate boxes, which strengthens the algorithm's adaptability to objects of varying scale. Finally, ordinary convolutions are replaced with Ghost convolutions to reduce the number of parameters and the amount of computation. The algorithm is validated on the VOC dataset and a subset of the COCO dataset: compared with D2Det, G-SAD2Det improves mAP@0.5 by 3.6% and 4.9% on the two datasets, respectively, while the number of model parameters falls by 27.42% and the amount of computation falls by 35.96%. These results show that the improved algorithm raises accuracy while also reducing computation.

Key words: Object detection, Scale adaptive, Multi-scale feature extraction, Residual block, Region-guided candidate box
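
The CutOut augmentation named in the abstract masks a square patch of each training image so that the detector cannot rely on any single region. The following is a minimal sketch of that idea, not the authors' preprocessing code: the function name `cutout`, the nested-list image representation, the patch size, and the fill value 0 are all assumptions chosen for illustration.

```python
import random


def cutout(image, size, rng):
    """Return a copy of `image` (a list of rows) with one square patch zeroed.

    The patch centre is drawn uniformly over the image and the patch is
    clipped at the borders, so patches near an edge are smaller.
    """
    height, width = len(image), len(image[0])
    cy, cx = rng.randrange(height), rng.randrange(width)
    half = size // 2
    y1, y2 = max(0, cy - half), min(height, cy + half)
    x1, x2 = max(0, cx - half), min(width, cx + half)

    out = [row[:] for row in image]  # copy rows so the input stays untouched
    for y in range(y1, y2):
        for x in range(x1, x2):
            out[y][x] = 0
    return out


# Hypothetical 8x8 single-channel image filled with ones.
image = [[1] * 8 for _ in range(8)]
masked = cutout(image, size=4, rng=random.Random(0))
zeros = sum(value == 0 for row in masked for value in row)
```

Passing the random generator in explicitly keeps the augmentation reproducible, which is convenient when comparing training runs.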

CLC Number:

  • TP391.41
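
To see why replacing ordinary convolutions with Ghost convolutions reduces the parameter count, the sketch below counts the weights of a single layer. It follows the general Ghost module idea (a smaller ordinary convolution produces intrinsic feature maps, and cheap depthwise operations generate the rest). The layer sizes, the ratio of 2, and the 3x3 depthwise kernel are assumed values for illustration, not the configuration used in G-SAD2Det, and the single-layer saving is not expected to match the whole-model reductions reported in the abstract.

```python
import math


def conv_params(c_in, c_out, k):
    """Weights of an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, ratio=2, dw_k=3):
    """Weights of a Ghost-style module producing c_out channels (bias ignored).

    A primary k x k convolution produces c_out / ratio intrinsic maps; each
    intrinsic map then yields (ratio - 1) extra maps through a cheap
    dw_k x dw_k depthwise operation.
    """
    intrinsic = math.ceil(c_out / ratio)
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * dw_k * dw_k
    return primary + cheap


# Hypothetical layer: 256 input channels, 256 output channels, 3x3 kernel.
dense = conv_params(256, 256, 3)
ghost = ghost_params(256, 256, 3)
reduction = 1 - ghost / dense
print(f"{dense} -> {ghost} weights ({reduction:.1%} fewer)")
```

For this assumed layer the saving is close to one half, because the cheap depthwise operations add very few weights compared with the primary convolution.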