An Improved YOLOv8 Object Detection Algorithm for UAV Aerial Images

doi:10.11896/jsjkx.240500042

Abstract

Object detection in UAV aerial images is complicated by diverse target scales, complex backgrounds, densely clustered small targets, and the limited computing resources of drone platforms. To address these problems, an improved YOLOv8 object detection algorithm, YOLOv8-CEBI, is proposed. First, a lightweight Context Guided module is introduced into the backbone network to significantly reduce the model's parameter count and computational cost. At the same time, the multi-scale attention mechanism EMA is introduced to capture fine-grained spatial information and improve detection of small targets against complex backgrounds. Second, the neck is rebuilt with the weighted bidirectional feature pyramid network BiFPN, which enhances multi-scale feature fusion while keeping the parameter cost unchanged. Finally, the Inner-CIoU loss function generates auxiliary bounding boxes so that the loss is computed more accurately and bounding box regression is accelerated. Experiments on the VisDrone dataset show that, compared with the original YOLOv8s algorithm, the proposed method reduces the parameter count by 51.3% and the computational cost by 28.5%, while increasing mAP50 by 1.6%. The proposed model therefore improves accuracy while striking a balance between reduced computing resources and detection accuracy.
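The auxiliary bounding box idea behind the Inner-CIoU loss can be illustrated with a minimal sketch. It follows the general Inner-IoU formulation (auxiliary boxes that share the centers of the original boxes and are scaled by a ratio, with the loss taken as CIoU loss + IoU - inner IoU) and is not the authors' implementation. The function names, the `(x1, y1, x2, y2)` box format, and the default `ratio=0.75` are assumptions chosen for illustration; the abstract does not state the ratio used in the paper.

```python
import math

def scaled_box(box, ratio):
    """Auxiliary box: same center as `box`, width and height scaled by `ratio`."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w, half_h = (x2 - x1) * ratio / 2.0, (y2 - y1) * ratio / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

def iou(a, b, eps=1e-9):
    """Plain intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + eps)

def inner_iou(pred, target, ratio=0.75):
    """IoU computed on the scaled auxiliary boxes (ratio is an illustrative value)."""
    return iou(scaled_box(pred, ratio), scaled_box(target, ratio))

def ciou_loss(pred, target, eps=1e-9):
    """Standard CIoU loss: 1 - IoU + center-distance term + aspect-ratio term."""
    i = iou(pred, target, eps)
    # Squared distance between the two box centers.
    rho2 = ((pred[0] + pred[2]) / 2.0 - (target[0] + target[2]) / 2.0) ** 2 \
         + ((pred[1] + pred[3]) / 2.0 - (target[1] + target[3]) / 2.0) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(pred[2], target[2]) - min(pred[0], target[0])) ** 2 \
       + (max(pred[3], target[3]) - min(pred[1], target[1])) ** 2 + eps
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4.0 / math.pi ** 2) * (
        math.atan((target[2] - target[0]) / (target[3] - target[1] + eps))
        - math.atan((pred[2] - pred[0]) / (pred[3] - pred[1] + eps))
    ) ** 2
    alpha = v / (1.0 - i + v + eps)
    return 1.0 - i + rho2 / c2 + alpha * v

def inner_ciou_loss(pred, target, ratio=0.75):
    """Inner-CIoU loss: CIoU loss with the IoU term replaced by the inner IoU."""
    return ciou_loss(pred, target) + iou(pred, target) - inner_iou(pred, target, ratio)

if __name__ == "__main__":
    pred, target = (0.0, 0.0, 4.0, 4.0), (2.0, 0.0, 6.0, 4.0)
    for r in (0.5, 1.0, 1.5):
        print(r, inner_iou(pred, target, r), inner_ciou_loss(pred, target, r))
```

With `ratio=1.0` the auxiliary boxes coincide with the original boxes and the loss reduces to the ordinary CIoU loss. A ratio below 1 shrinks the auxiliary boxes, and a ratio above 1 enlarges them, which changes how quickly the IoU term responds as the predicted box approaches the target.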

Key words: Drones, Aerial images, Attention mechanism, Loss function, Lightweighting

CLC Number: TP391

HU Huijuan, QIN Yifeng, XU He and LI Peng. An Improved YOLOv8 Object Detection Algorithm for UAV Aerial Images [J]. Computer Science, 2025, 52(4): 202-211.

