Computer Science, 2025, Vol. 52, Issue (11): 131-140. doi: 10.11896/jsjkx.241000017

• Computer Graphics & Multimedia •

UAV Small Object Detection Algorithm Based on Feature Enhancement and Context Fusion

CHEN Chongyang, PENG Li, YANG Jielong

  1. Ministry of Education Engineering Research Center for the Application of Internet of Things Technology, School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Received: 2024-10-08 Revised: 2024-12-17 Online: 2025-11-15 Published: 2025-11-06
  • Corresponding author: PENG Li (penglimail2002@163.com)
  • About author: CHEN Chongyang (chenchongyang22@163.com), born in 2000, postgraduate. His main research interests include object detection and computer vision.
    PENG Li, born in 1967, Ph.D., professor, Ph.D. supervisor. His main research interests include computer vision, deep learning and visual Internet of Things.
  • Supported by:
    National Natural Science Foundation of China (62106082, 61873112) and the 9th China Association for Science and Technology Youth Support Project (2023QNRC001).

Abstract: To address the low detection accuracy caused by small object sizes, insufficient feature information, dense distribution, and occlusion in UAV aerial imagery, this paper proposes a UAV small object detection algorithm based on feature enhancement and context fusion. First, a lightweight backbone network with enhanced feature extraction is constructed: lightweight feature extraction blocks extract feature information efficiently, and a fine-grained channel fusion block is designed to prevent the loss of fine-grained object features. This backbone improves both the feature extraction capability and the inference speed of the model. Second, a small object detection head is constructed to fully extract the position information and detailed features of small objects. Third, an adaptive selective spatial attention module adaptively adjusts the receptive field required for different objects, making full use of the rich context information around small aerial objects. Finally, MPDIoU, a bounding box regression loss function based on minimum point distance, is introduced to further improve the detection accuracy for dense small objects. On the VisDrone2019 dataset, the proposed algorithm achieves an mAP0.5 of 46.7% and an mAP0.5:0.95 of 28.6%, improvements of 8.5% and 5.9%, respectively, over the baseline network YOLOv8s. The algorithm also has 23.4% fewer parameters than YOLOv8s, making it well suited to dense small object detection in UAV aerial photography scenarios.

Key words: Unmanned aerial vehicle, Small object detection, Lightweight network, Context information, Attention mechanism

CLC Number:

  • TP391.4
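
The abstract names MPDIoU as the bounding box regression loss but does not spell out its form. The following is a minimal sketch, assuming the commonly cited definition from the MPDIoU preprint: the IoU of the two boxes minus the squared distances between their top-left corners and between their bottom-right corners, each normalized by w² + h² of the input image. The function name `mpdiou_loss`, the scalar `(x1, y1, x2, y2)` box format, and the example coordinates are illustrative assumptions and are not taken from the authors' implementation.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """Sketch of an MPDIoU-style loss for one pair of axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) in pixels with x1 < x2 and y1 < y2.
    img_w and img_h are the width and height of the input image.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_pred = (px2 - px1) * (py2 - py1)
    area_target = (tx2 - tx1) * (ty2 - ty1)
    union = area_pred + area_target - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm

    # The loss is zero only when the two boxes coincide.
    return 1.0 - mpdiou


if __name__ == "__main__":
    # Hypothetical boxes on a 100 x 100 image.
    print(mpdiou_loss((0, 0, 10, 10), (5, 5, 15, 15), 100, 100))
    print(mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 100, 100))
```

Under this definition, the corner-distance terms still change when two boxes do not overlap and plain IoU is zero, which is one plausible reason such a loss is attractive for dense, small targets. A training pipeline would use a batched tensor version rather than this scalar form.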