Computer Science, 2025, Vol. 52, Issue (4): 202-211. doi: 10.11896/jsjkx.240500042
HU Huijuan1, QIN Yifeng1, XU He1,2 and LI Peng1,2
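Abstract: Object detection in aerial images captured from unmanned aerial vehicles (UAVs) faces widely varying object scales, complex backgrounds, densely clustered small objects, and the limited computing resources of UAV platforms. To address these problems, an improved YOLOv8 object detection algorithm, YOLOv8-CEBI, is proposed. First, the lightweight Context Guided module is introduced into the backbone network, significantly reducing the model's parameter count and computational cost; the multi-scale attention mechanism EMA is also introduced to capture fine-grained spatial information and improve detection of small objects and of objects against complex backgrounds. Second, the weighted bidirectional feature pyramid network BiFPN is introduced to rebuild the neck, strengthening multi-scale feature fusion without increasing the parameter cost. Finally, the Inner-CIOU loss function generates auxiliary bounding boxes to compute the loss more precisely and to accelerate bounding-box regression. Experiments on the VisDrone dataset show that, compared with the original YOLOv8s algorithm, the improved method reduces the parameter count by 51.3% and the computational cost by 28.5%, while raising mAP50 by 1.6%. The proposed model improves accuracy while becoming lighter, striking a balance between reduced computing resources and preserved accuracy.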
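The auxiliary-box idea behind the Inner-CIOU loss can be illustrated with a minimal sketch. It is not the authors' implementation: the `Box` type, the function names, and the scale factor `ratio=0.75` are assumptions made here for illustration, and only the inner-IoU term is shown. A full Inner-CIOU loss would additionally include the center-distance and aspect-ratio penalties of CIoU. Both boxes are shrunk (or enlarged) about their own centers by `ratio`, and the IoU is then computed on the scaled boxes.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box given by center (cx, cy), width w and height h."""
    cx: float
    cy: float
    w: float
    h: float


def _corners(box: Box, ratio: float = 1.0):
    """Corners (x1, y1, x2, y2) of the box scaled by `ratio` about its center."""
    half_w = box.w * ratio / 2
    half_h = box.h * ratio / 2
    return box.cx - half_w, box.cy - half_h, box.cx + half_w, box.cy + half_h


def iou(pred: Box, gt: Box, ratio: float = 1.0, eps: float = 1e-9) -> float:
    """IoU of the two boxes after scaling both by `ratio`.

    ratio == 1.0 gives the ordinary IoU; ratio != 1.0 gives the IoU of the
    auxiliary ("inner") boxes.
    """
    px1, py1, px2, py2 = _corners(pred, ratio)
    gx1, gy1, gx2, gy2 = _corners(gt, ratio)
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    # Areas of the scaled boxes shrink or grow by ratio squared.
    union = (pred.w * pred.h + gt.w * gt.h) * ratio ** 2 - inter
    return inter / (union + eps)


def inner_iou_loss(pred: Box, gt: Box, ratio: float = 0.75) -> float:
    """Loss term 1 - IoU computed on the auxiliary boxes (ratio is an assumed value)."""
    return 1.0 - iou(pred, gt, ratio)


if __name__ == "__main__":
    pred = Box(cx=1.0, cy=0.0, w=2.0, h=2.0)
    gt = Box(cx=0.0, cy=0.0, w=2.0, h=2.0)
    print("plain IoU:", iou(pred, gt))
    print("inner IoU (ratio=0.75):", iou(pred, gt, ratio=0.75))
    print("inner IoU loss:", inner_iou_loss(pred, gt))
```

With a ratio below 1 the auxiliary boxes overlap less than the originals for the same offset, so the loss reacts more strongly to a residual misalignment. This is one plausible reading of why such a term could speed up regression once the predicted box is already close to the target.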