Computer Science ›› 2025, Vol. 52 ›› Issue (4): 202-211. doi: 10.11896/jsjkx.240500042

• Computer Graphics & Multimedia •

An Improved YOLOv8 Object Detection Algorithm for UAV Aerial Images

HU Huijuan1, QIN Yifeng1, XU He1,2 and LI Peng1,2

  1. 1 School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
    2 Jiangsu HPC and Intelligent Processing Engineering Research Center, Nanjing 210023, China
  • Received: 2024-05-10 Revised: 2025-02-17 Online: 2025-04-15 Published: 2025-04-14
  • Corresponding author: XU He (xuhe@njupt.edu.cn)
  • About author: HU Huijuan, born in 1975, master, lecturer. Her main research interests include artificial intelligence and Internet of Things technology.
    XU He, born in 1985, Ph.D., professor, master supervisor, is a senior member of CCF (No. 19957S). His main research interests include big data and Internet of Things technology.
  • Supported by:
    National Key Research and Development Program of China (2019YFB2103003).

Abstract: Object detection in aerial images captured from a UAV perspective must cope with widely varying target scales, complex backgrounds, densely clustered small targets, and the limited computing resources of UAV platforms. To address these problems, an improved YOLOv8 object detection algorithm, YOLOv8-CEBI, is proposed. First, a lightweight Context Guided module is introduced into the backbone network to significantly reduce the number of model parameters and the amount of computation. At the same time, the multi-scale attention mechanism EMA is introduced to capture fine-grained spatial information and improve the detection of small targets and of targets against complex backgrounds. Second, the weighted bidirectional feature pyramid network BiFPN is introduced to rebuild the neck, enhancing multi-scale feature fusion while maintaining the parameter cost. Finally, the Inner-CIOU loss function is used to generate auxiliary bounding boxes so that the loss is computed more accurately and bounding box regression is accelerated. Experiments on the VisDrone dataset show that, compared with the original YOLOv8s algorithm, the proposed method reduces the number of parameters by 51.3% and the amount of computation by 28.5%, while increasing mAP50 by 1.6%. The proposed model improves accuracy while remaining lightweight, striking a balance between reducing computing resources and preserving accuracy.

Key words: Drones, Aerial images, Attention mechanism, Loss function, Lightweighting
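
The weighted feature fusion that the abstract attributes to BiFPN can be illustrated with a small sketch. The code below is not the authors' implementation. It shows the "fast normalized fusion" rule described for BiFPN in EfficientDet, applied to plain Python lists instead of tensors. The function name `fast_normalized_fusion`, the `eps` value, and the sample numbers are assumptions made for illustration only.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Hypothetical helper: a real neck would fuse 4-D tensors and learn the
    weights during training; lists keep the sketch dependency-free.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all input features must have the same length")

    # ReLU-style clipping keeps every fusion weight non-negative.
    clipped = [max(w, 0.0) for w in weights]
    # eps avoids division by zero when every weight has been clipped to 0.
    norm = eps + sum(clipped)

    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / norm
        for i in range(length)
    ]


# Illustrative values only: two tiny "feature maps" with equal weights.
p_top_down = [1.0, 2.0, 3.0]
p_lateral = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([p_top_down, p_lateral], weights=[1.0, 1.0])
```

Because the weights are normalized by their sum rather than by a softmax, each input's contribution stays in the range 0 to 1 at low cost, which is one reason this style of fusion suits resource-constrained platforms.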

CLC Number: 

  • TP391
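
The auxiliary bounding boxes that the abstract associates with the Inner-CIOU loss can be sketched as follows. This is a minimal illustration of the general Inner-IoU idea, under the assumption that both boxes are rescaled about their own centers by a common `ratio` before the IoU is computed. It is not the authors' code, and it omits the CIoU penalty terms. The names `scaled_corners` and `inner_iou`, the `(cx, cy, w, h)` box layout, and the sample boxes are hypothetical.

```python
from typing import Tuple

# Assumed box layout for this sketch: (center_x, center_y, width, height).
Box = Tuple[float, float, float, float]


def scaled_corners(box: Box, ratio: float) -> Tuple[float, float, float, float]:
    """Return (x1, y1, x2, y2) of the box rescaled about its own center."""
    cx, cy, w, h = box
    half_w = w * ratio / 2.0
    half_h = h * ratio / 2.0
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h


def inner_iou(pred: Box, target: Box, ratio: float = 1.0, eps: float = 1e-7) -> float:
    """IoU between the auxiliary (rescaled) predicted and target boxes."""
    px1, py1, px2, py2 = scaled_corners(pred, ratio)
    tx1, ty1, tx2, ty2 = scaled_corners(target, ratio)

    # Overlap of the two auxiliary boxes, clamped at zero when disjoint.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    return inter / (union + eps)


# Illustrative boxes only: same size, prediction shifted by one unit in x.
pred_box: Box = (5.0, 5.0, 4.0, 4.0)
target_box: Box = (6.0, 5.0, 4.0, 4.0)

plain = inner_iou(pred_box, target_box, ratio=1.0)    # ratio 1 gives ordinary IoU, about 0.6
smaller = inner_iou(pred_box, target_box, ratio=0.5)  # smaller auxiliary boxes, about 0.333
```

With `ratio` below 1 the same center offset produces a lower IoU, so the auxiliary boxes penalize misalignment more sharply for boxes that already overlap. A full loss would combine this term with the usual CIoU penalties, which are left out here.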