Computer Science, 2025, Vol. 52, Issue 12: 150-157. DOI: 10.11896/jsjkx.241200021
XIA Shufang, YIN Haonan, QU Zhong
Abstract: In recent years, deep learning algorithms have made remarkable progress in computer vision, yet object detection accuracy in complex traffic images remains limited because targets are small, their feature information is indistinct, and they are easily disturbed. To address this problem, this paper improves the state-of-the-art detector YOLO11 and designs a multi-scale feature fusion model, ETF-YOLO11n (Effective Traffic Feature YOLO). First, a triple feature fusion module (TFF) is designed to effectively fuse feature information of different scales extracted by the backbone network. Second, a feature enhancement module based on hybrid dilated convolution (HDCFE) is designed and added to the neck of the model to integrate features extracted under different receptive fields, reducing the interference caused by occlusion and similar conditions. Finally, the proposed GeoCIoU replaces CIoU; through two different penalty terms, the model can more accurately reflect how well a predicted box matches the ground-truth box. The proposed ETF-YOLO11n achieves an AP of 65.6% and an mAP@0.5 of 90.7% on the KITTI traffic dataset, improvements of 2.4 and 1.2 percentage points, respectively, over the baseline YOLO11n, demonstrating strong detection performance. In addition, ETF-YOLO11n reaches 42.5% AP and 59.8% mAP@0.5 on the COCO-Traffic dataset, and when the proposed methods are transferred to YOLOv8, AP and mAP@0.5 on KITTI reach 66.9% and 91.5%, respectively. The experimental results show that the proposed methods significantly improve the model's detection capability, generalize well across models and datasets, and strike a good balance between accuracy and parameter count.
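The abstract does not give the internals of HDCFE, but the hybrid dilated convolution (HDC) idea it builds on is well known: stacking stride-1 dilated convolutions enlarges the receptive field, and choosing mixed dilation rates avoids the "gridding" holes that equal rates produce. A minimal 1-D sketch (the function names and rate choices here are illustrative, not the paper's):

```python
def receptive_field(kernel_sizes, dilations):
    """Effective 1-D receptive field of stacked stride-1 dilated convolutions.

    Each layer with kernel size k and dilation d widens the field by (k-1)*d.
    """
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf


def sampled_offsets(kernel_sizes, dilations):
    """Set of 1-D input offsets actually reachable through the stack.

    Hybrid dilated convolution picks rates such as (1, 2, 5) so the union of
    taps has no holes; equal rates such as (2, 2, 2) only ever sample even
    offsets ("gridding").
    """
    offsets = {0}
    for k, d in zip(kernel_sizes, dilations):
        taps = [d * (i - (k - 1) // 2) for i in range(k)]
        offsets = {o + t for o in offsets for t in taps}
    return offsets
```

For example, three 3-tap layers with rates (1, 2, 5) cover a 17-pixel field with every offset in [-8, 8] sampled, while rates (2, 2, 2) reach only even offsets and miss every odd pixel.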
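The abstract does not specify GeoCIoU's two penalty terms. For reference, a pure-Python sketch of the CIoU baseline it replaces, which likewise adds two penalties (center distance and aspect-ratio consistency) on top of IoU; the box format and helper name here are my own choices:

```python
import math


def ciou_loss(box_p, box_g):
    """CIoU loss between a predicted and a ground-truth box (x1, y1, x2, y2)."""
    # Intersection and union areas
    ix1, iy1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    ix2, iy2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    wp, hp = box_p[2] - box_p[0], box_p[3] - box_p[1]
    wg, hg = box_g[2] - box_g[0], box_g[3] - box_g[1]
    iou = inter / (wp * hp + wg * hg - inter)

    # Penalty 1: squared center distance over squared enclosing-box diagonal
    cxp, cyp = (box_p[0] + box_p[2]) / 2, (box_p[1] + box_p[3]) / 2
    cxg, cyg = (box_g[0] + box_g[2]) / 2, (box_g[1] + box_g[3]) / 2
    rho2 = (cxp - cxg) ** 2 + (cyp - cyg) ** 2
    ex1, ey1 = min(box_p[0], box_g[0]), min(box_p[1], box_g[1])
    ex2, ey2 = max(box_p[2], box_g[2]), max(box_p[3], box_g[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    # Penalty 2: aspect-ratio consistency, weighted by alpha
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = v / (1 - iou + v + 1e-9)

    return 1 - iou + rho2 / c2 + alpha * v
```

With identical boxes the loss is 0, and it grows as the predicted box drifts away from the ground truth even after the overlap vanishes, which is what allows gradient feedback for non-overlapping boxes.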
References:
[1] CHEN H, WAN W W, MATSUSHITA, et al. Automatically Prepare Training Data for YOLO Using Robotic In-Hand Observation and Synthesis[J]. IEEE Transactions on Automation Science and Engineering, 2024, 21(3): 4876-4982.
[2] LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single Shot Multibox Detector[C]//European Conference on Computer Vision. Springer, 2016: 21-27.
[3] JOCHER G. YOLOv5[EB/OL]. (2020-06-28) [2024-11-17]. https://github.com/ultralytics/yolov5.
[4] LI C Y, LI L L, JIANG H L, et al. YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications[J]. arXiv:2209.02976, 2022.
[5] GE Z, LIU S T, WANG F, et al. YOLOX: Exceeding YOLO Series in 2021[J]. arXiv:2107.08430, 2021.
[6] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors[C]//IEEE Conference on Computer Vision and Pattern Recognition. 2023: 7464-7475.
[7] JOCHER G. YOLOv8[EB/OL]. (2023-01-10) [2024-11-17]. https://github.com/ultralytics/yolov8.
[8] WANG C Y, YEH I H, LIAO H Y, et al. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information[C]//European Conference on Computer Vision. Springer, 2024: 1-21.
[9] JOCHER G. YOLO11[EB/OL]. (2024-09-30) [2024-11-17]. https://github.com/ultralytics/ultralytics.
[10] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[11] CARION N, MASSA F, SYNNAEVE G, et al. End-to-End Object Detection with Transformers[C]//European Conference on Computer Vision. Springer, 2020: 213-219.
[12] Government of the People's Republic of China. Motor Vehicles Nationwide Reached 440 Million in the First Half of 2024[EB/OL]. (2024-07-09) [2024-11-17]. https://www.gov.cn/lianbo/bumen/202407/content_6961935.htm.
[13] GEIGER A, LENZ P, STILLER C, et al. Vision Meets Robotics: The KITTI Dataset[J]. The International Journal of Robotics Research, 2013, 32(11): 1231-1237.
[14] LIN T Y, DOLLAR P, GIRSHICK R, et al. Feature Pyramid Networks for Object Detection[C]//IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017: 936-944.
[15] LIU S, QI L, QIN H F, et al. Path Aggregation Network for Instance Segmentation[C]//IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2018: 8759-8768.
[16] LAN L X, CHI M Y. Remote Sensing Change Detection Based on Feature Fusion and Attention Network[J]. Computer Science, 2022, 49(6): 193-198.
[17] LU H T, FANG M Y, QIU Y X, et al. An Anchor-Free Defect Detector for Complex Background Based on Pixelwise Adaptive Multiscale Feature Fusion[J]. IEEE Transactions on Instrumentation and Measurement, 2023, 72: 1-12.
[18] YU J H, JIANG Y N, WANG Z Y, et al. UnitBox: An Advanced Object Detection Network[C]//International Conference on Multimedia. ACM, 2016: 516-520.
[19] REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized Intersection over Union: A Metric and a Loss for Bounding Box Regression[C]//IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2019: 658-666.
[20] ZHENG Z H, WANG P, LIU W, et al. Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression[C]//AAAI Conference on Artificial Intelligence. AAAI, 2020: 12993-13000.
[21] GEVORGYAN Z. SIoU Loss: More Powerful Learning for Bounding Box Regression[J]. arXiv:2205.12740, 2022.
[22] LUO X, CAI Z, SHAO B, et al. Unified-IoU: For High-Quality Object Detection[J]. arXiv:2408.06636, 2024.
[23] HU J, SHEN L, SUN G. Squeeze-and-Excitation Networks[C]//IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2018: 7132-7141.
[24] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional Block Attention Module[C]//European Conference on Computer Vision. Springer, 2018: 3-19.
[25] REN X X, LI M, LI Z H, et al. Curiosity-Driven Attention for Anomaly Road Obstacles Segmentation in Autonomous Driving[J]. IEEE Transactions on Intelligent Vehicles, 2022, 8(3): 2233-2243.
[26] VASWANI A, SHAZEER N, PARMAR N, et al. Attention Is All You Need[J]. arXiv:1706.03762, 2017.
[27] FULLER A, KYROLLOS D G, YASSIN Y, et al. LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate[J]. arXiv:2405.13958, 2024.
[28] GAO L Y, QU Z, WANG S Y, et al. A Lightweight Neural Network Model of Feature Pyramid and Attention Mechanism for Traffic Object Detection[J]. IEEE Transactions on Intelligent Vehicles, 2024, 9(2): 3422-3435.
[29] WANG S Y, QU Z, GAO L Y, et al. Multi-Spatial Pyramid Feature and Optimizing Focal Loss Function for Object Detection[J]. IEEE Transactions on Intelligent Vehicles, 2023, 9(1): 1054-1065.
[30] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal Loss for Dense Object Detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 99: 2999-3007.