Computer Science ›› 2023, Vol. 50 ›› Issue (6A): 220700006-7. doi: 10.11896/jsjkx.220700006

• Image Processing & Multimedia Technology •

Lightweight Target Detection Algorithm Based on Improved Yolov4-tiny

DOU Zhi1, HU Chenguang1, LIANG Jingyi1, ZHENG Liming2, LIU Guoqi1

  1. School of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007, China;
  2. School of Mechanical and Electrical Engineering, Jinling University of Science and Technology, Nanjing 211169, China
  • Online: 2023-06-10 Published: 2023-06-12
  • Corresponding author: DOU Zhi (2015160@htu.edu.cn)
  • About author: DOU Zhi, born in 1983, male, Ph.D., associate professor. His main research interests include image processing and pattern recognition.
  • Supported by:
    National Natural Science Foundation of China (U1904123, 61901160).

Abstract: Video-oriented deep learning algorithms have high computational complexity and struggle to meet real-time requirements, which seriously limits their application in edge computing and real-time systems. Lightweight networks have therefore become a research hotspot. Lightweight versions of large networks significantly reduce the parameter count of the original network and improve detection speed, but their detection accuracy is hard to reconcile with industrial needs. To address these problems, this paper proposes an improved lightweight object detection network that effectively improves detection performance while keeping the parameter count small. First, a Vision Transformer (ViT) structure is added to the YOLOv4-tiny backbone network, so that the multi-head self-attention mechanism enables the network to extract deeper object features. Second, a simplified Bi-FPN is used: the two detection channels are changed to three, and an attention mechanism is introduced at the feature map fusion nodes, which improves the model's utilization of image features and the network's detection accuracy for objects of different sizes. Third, Ghost convolution replaces traditional convolution operations to reduce the network's computational complexity and parameter count. Experimental results on the COCO dataset show that, with the network scale kept unchanged, the improved algorithm achieves a clear gain in detection accuracy over the original YOLOv4-tiny network, and can simultaneously meet the requirements of edge computing and real-time systems for lightweight and accurate deep networks.

Key words: Object detection, Lightweight network, Multi-head self-attention mechanism, Weighted feature fusion
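
As a rough illustration of two of the building blocks named in the abstract, the sketch below counts parameters for a Ghost convolution against a standard convolution and computes a weighted feature fusion. It is not the authors' implementation. It assumes the parameter formulas from the GhostNet paper (a primary convolution producing `c_out / s` intrinsic maps, plus cheap `d x d` depthwise operations for the remaining maps, bias ignored) and the fast normalized fusion rule from EfficientDet's BiFPN; the abstract does not state which fusion rule the paper uses. All function names and the example layer sizes are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, s=2, d=3):
    """Weights in a Ghost convolution, following the GhostNet formulation.

    Assumption: c_out / s intrinsic maps come from an ordinary k x k
    convolution, and each of the (s - 1) groups of ghost maps comes from
    a cheap d x d depthwise operation applied to the intrinsic maps.
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s")
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k      # ordinary convolution
    cheap = (s - 1) * intrinsic * d * d     # depthwise "ghost" operations
    return primary + cheap


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with learnable scalar weights.

    Assumption: EfficientDet-style fast normalized fusion, where each
    weight is clipped to be non-negative (ReLU) and then normalized by
    the sum of all weights plus a small eps.
    """
    w = [max(0.0, x) for x in weights]
    total = sum(w) + eps
    length = len(features[0])
    return [
        sum(wi * f[j] for wi, f in zip(w, features)) / total
        for j in range(length)
    ]


if __name__ == "__main__":
    # Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
    standard = conv_params(64, 128, 3)
    ghost = ghost_conv_params(64, 128, 3, s=2, d=3)
    print(standard, ghost, round(standard / ghost, 2))

    # Two flattened feature maps fused with equal weights.
    print(fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
```

With `s = 2` the Ghost layer in this example needs roughly half the weights of the standard layer, which is the kind of saving the abstract attributes to replacing traditional convolutions.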

CLC Number:

  • TP391