Computer Science ›› 2023, Vol. 50 ›› Issue (6A): 220500230-9. doi: 10.11896/jsjkx.220500230

• Image Processing & Multimedia Technology •

Target Detection Algorithm Based on Compound Scaling Deep Iterative CNN by Regression Converging and Scaling Mixture

WANG Guogang, WU Yan, LIU Yibo

  1. College of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China
  • Online: 2023-06-10 Published: 2023-06-12
  • Corresponding author: WANG Guogang (kingguogang@sxu.edu.cn)
  • About author: WANG Guogang, born in 1977, Ph.D., associate professor, is a member of China Computer Federation. His main research interests include image processing, computer vision, machine learning and artificial intelligence.
  • Supported by:
    Natural Science Foundation of Shanxi Province, China (201901D111031).

Abstract: To address the low robustness, poor regression-loss convergence and label marginalization of the EfficientDet algorithm, a target detection algorithm based on a compound scaling deep iterative CNN with regression converging and scaling mixture is proposed. The algorithm applies a 2×2 scaling mixture regularization strategy to augment the training samples, which avoids overfitting and improves the generalization ability of the model. It uses the complete IoU (CIoU) loss to suppress redundant predicted boxes, taking the center-point distance and the aspect ratio as penalty terms in the loss function for bounding-box coordinate prediction, so that the CNN regresses more accurately and both convergence speed and localization accuracy improve. A smoothing parameter is set, and the marginal label distribution and the uniform distribution are combined in a weighted sum to generate a label-smoothing regularization distribution; a class cross-entropy loss with label smoothing is built on this distribution, which improves the label fault tolerance of the model. Experiments on the PASCAL VOC 2007 and 2012 datasets show that the mean average precision (mAP) of the proposed algorithm reaches 88.31% with 8.10×10^6 network model parameters. Compared with the original network (EfficientDet-D2, 84.12%), the mAP is 3.29% higher while the number of network model parameters does not increase. Compared with YOLOv4, YOLOv3, SSD, Faster R-CNN and Fast R-CNN, the mAP increases by 5.2%, 10.71%, 14.01%, 15.11% and 18.30%, respectively, and the number of network model parameters is reduced by 55.94×10^6, 52.91×10^6, 16.09×10^6, 55.18×10^6 and 53.11×10^6, respectively. The proposed model improves the detection accuracy and the F1 score, and it takes only 0.73 s to detect each test image, which meets the real-time requirements of the detection phase.

Key words: Object detection, EfficientDet, IOU, Label smoothing
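
The penalty terms that the abstract attributes to the complete IoU loss can be illustrated with a minimal Python sketch. It assumes the commonly used CIoU formulation (an IoU term, a center distance normalized by the diagonal of the smallest enclosing box, and an aspect-ratio consistency term) and boxes given as (x1, y1, x2, y2). The function name `ciou_loss` and the `eps` stabilizer are illustrative choices and are not taken from the paper.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """CIoU loss for two axis-aligned boxes (x1, y1, x2, y2), x1 < x2, y1 < y2.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(x2, gx2) - min(x1, gx1)
    enc_h = max(y2, gy2) - min(y1, gy1)
    c2 = enc_w ** 2 + enc_h ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain IoU loss, the center-distance term still gives a useful signal when the boxes do not overlap: for `(0, 0, 1, 1)` against `(2, 2, 3, 3)` the IoU is zero, yet the loss grows with the distance between the centers.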

CLC Number: 

  • TP391.4
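
The label-smoothing cross-entropy described in the abstract can be sketched in a few lines of Python. The sketch assumes the usual formulation, in which the smoothed target is a weighted sum of the one-hot label distribution and a uniform distribution over the classes, with weight `epsilon` on the uniform part. The names `smooth_labels` and `smoothed_cross_entropy` and the default `epsilon=0.1` are assumptions for illustration; the paper's actual smoothing parameter is not stated here.

```python
import math


def smooth_labels(target, num_classes, epsilon=0.1):
    """Return (1 - epsilon) * one_hot(target) + epsilon * uniform."""
    uniform = 1.0 / num_classes
    return [
        (1.0 - epsilon) * (1.0 if k == target else 0.0) + epsilon * uniform
        for k in range(num_classes)
    ]


def smoothed_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy between the smoothed target and softmax(logits)."""
    # Numerically stable log-softmax.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_z for z in logits]

    q = smooth_labels(target, len(logits), epsilon)
    return -sum(qk * lp for qk, lp in zip(q, log_probs))
```

With `epsilon=0` the loss reduces to the ordinary cross-entropy. With `epsilon > 0` every class keeps a small share of the target probability, so a mislabeled sample is penalized less severely.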