Computer Science, 2025, Vol. 52, Issue 11A: 250200051-7. doi: 10.11896/jsjkx.250200051
纪涛1,2,3, 杨一帆1,2, 冯亚春2, 伍凌帆2, 李旭亮2, 李亚伟2
JI Tao1,2,3, YANG Yifang1,2, FENG Yachun2, WU Lingfan2, LI Xuliang2, LI Yawei2
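Abstract: In autonomous driving scenarios, the accuracy and robustness of object detection are critical to system performance. Existing deep-learning-based network models produce false detections and missed detections when handling small objects and occluded objects in autonomous driving scenes. To address this problem, the LSDA-YOLO network model is proposed. First, the LocalSimAM (Local Simple and Effective Attention Mechanism) attention mechanism is proposed to mitigate information loss and is applied to the Backbone. In addition, the SHSA (Single-Head Self-Attention) attention mechanism is introduced and an information aggregation network is designed to improve the detection of occluded objects. In the Neck, the upsampling ratio is adjusted dynamically to strengthen the model's adaptability to multi-scale features and to reduce the miss rate for small objects. In the Head, an Adaptive Spatial Feature Fusion (ASFF) strategy is introduced to strengthen the model's multi-scale detection capability. Experimental results show that, on the KITTI dataset, LSDA-YOLO improves mAP@0.5 and mAP@0.5:0.95 by 3.1 and 3.9 percentage points, respectively, over the YOLOv11n baseline network model, making it suitable for high-accuracy real-time detection in autonomous driving scenarios.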
CLC number:
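The abstract does not spell out how LocalSimAM works, so the sketch below shows only the original, parameter-free SimAM attention (Yang et al., 2021), which the name suggests LocalSimAM builds on. It is not the authors' method. The function name `simam`, the `(C, H, W)` layout, the use of NumPy, and the default `e_lambda` are assumptions chosen for illustration.

```python
import numpy as np


def simam(x: np.ndarray, e_lambda: float = 1e-4) -> np.ndarray:
    """Apply SimAM-style attention to a feature map of shape (C, H, W).

    Illustrative sketch of the baseline SimAM weighting, not LocalSimAM.
    Assumes H * W > 1 so the variance estimate is defined.
    """
    _, h, w = x.shape
    n = h * w - 1  # number of "other" positions in each channel

    # Squared deviation of every position from its channel mean.
    d = (x - x.mean(axis=(1, 2), keepdims=True)) ** 2

    # Per-channel variance estimate.
    v = d.sum(axis=(1, 2), keepdims=True) / n

    # Inverse of the minimal energy: larger values mark more distinctive positions.
    e_inv = d / (4.0 * (v + e_lambda)) + 0.5

    # Scale each activation by a sigmoid of its importance score.
    return x * (1.0 / (1.0 + np.exp(-e_inv)))


# Toy input: one channel with a single active position.
feat = np.zeros((1, 4, 4))
feat[0, 1, 2] = 1.0
out = simam(feat)
```

In this toy input the single active position stands out strongly from its channel mean, so it receives a weight close to 1, while a constant feature map would receive the same weight everywhere. No learnable parameters are involved, which is why SimAM-style modules add attention without increasing model size.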