Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241100112-7. doi: 10.11896/jsjkx.241100112

• Computer Graphics & Multimedia •

Three-dimensional Object Detection Algorithm of Road Scene Based on Attention Mechanism

CAO Wenbo1, WEI Mingyang1, DUAN Xiaoyong1, LIU Xueyuan1,2

  1 College of Mechanical Engineering and Transportation, Southwest Forestry University, Kunming 650224, China
  2 Key Laboratory of Environmental Protection and Safety of Motor Vehicles in Plateau Mountain Areas, Kunming 650224, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Corresponding author: LIU Xueyuan (liuxueyuan@swfu.edu.cn)
  • About the author: 3053920433@qq.com
  • Supported by:
    Agricultural Joint Special Project of the Department of Science and Technology of Yunnan Province (No. 202301BD070001-041; also listed as 202301BD070001-014)

Abstract: With advances in deep learning and vehicle-mounted LiDAR, autonomous vehicles place increasingly high demands on detection: obstacles on the road must be detected accurately, and detection must also be fast. In complex road scenes, occlusion by obstacles and the small size of some targets often make those targets difficult to detect accurately. To address this problem, this paper proposes a 3D object detection method based on an improved PointPillars model, aiming for higher accuracy while maintaining detection speed. First, several data augmentation operations are introduced to increase the diversity and size of the dataset and reduce overfitting. Then, an attention matrix is added to the pillar feature extraction stage; it dynamically adjusts the importance of each voxel according to its position and semantic information, so that the model focuses on features that are more useful for the detection task. Finally, channel attention (CA) and spatial attention (SA) modules are added in sequence to the backbone network, which strengthens the model's response to useful information, suppresses interference from unimportant features, and thereby improves the representational power of the target features. Experimental results show that the improved model achieves higher detection accuracy across all categories and difficulty levels.

Key words: LiDAR, 3D object detection, Point cloud, Data augmentation, Attention mechanism

CLC Number:

  • TN958
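
The abstract states that channel attention (CA) and spatial attention (SA) modules are applied in sequence inside the backbone. The following is a minimal, dependency-free sketch of that idea, assuming a CBAM-style design (pooled channel descriptors passed through a shared two-layer MLP, then a small convolution over channel-wise mean and max maps). The function names, the hidden size, the 3x3 kernel, and the random weights are all illustrative assumptions; the abstract does not specify the paper's actual configuration, and a practical implementation would use a deep-learning framework with learned weights.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def channel_attention(x, w1, w2):
    """x: C x H x W nested lists. Returns one weight in (0, 1] per channel.
    w1 (hidden x C) and w2 (C x hidden) form a shared MLP (assumed layout)."""
    flat = [[v for row in ch for v in row] for ch in x]
    avg = [sum(f) / len(f) for f in flat]   # global average pooling
    mx = [max(f) for f in flat]             # global max pooling

    def mlp(v):
        hidden = [max(0.0, h) for h in matvec(w1, v)]  # ReLU
        return matvec(w2, hidden)

    return [sigmoid(a + b) for a, b in zip(mlp(avg), mlp(mx))]


def spatial_attention(x, kernel):
    """Returns an H x W map of weights in (0, 1].
    kernel: 2 x k x k weights applied to the [mean, max] maps."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(kernel[0])
    pad = k // 2
    mean_map = [[sum(x[ch][i][j] for ch in range(c)) / c for j in range(w)]
                for i in range(h)]
    max_map = [[max(x[ch][i][j] for ch in range(c)) for j in range(w)]
               for i in range(h)]
    maps = [mean_map, max_map]
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < h and 0 <= jj < w:  # zero padding
                            acc += kernel[m][di][dj] * maps[m][ii][jj]
            row.append(sigmoid(acc))
        out.append(row)
    return out


def ca_then_sa(x, w1, w2, kernel):
    """Reweight channels first, then reweight spatial positions."""
    ca = channel_attention(x, w1, w2)
    x = [[[ca[ch] * v for v in row] for row in x[ch]] for ch in range(len(x))]
    sa = spatial_attention(x, kernel)
    return [[[sa[i][j] * x[ch][i][j] for j in range(len(sa[0]))]
             for i in range(len(sa))] for ch in range(len(x))]


# Toy feature map and random (untrained) weights, for illustration only.
random.seed(0)
C, H, W, HIDDEN, K = 4, 3, 3, 2, 3
x = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
     for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(HIDDEN)]
w2 = [[random.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(C)]
kernel = [[[random.uniform(-1, 1) for _ in range(K)] for _ in range(K)]
          for _ in range(2)]
out = ca_then_sa(x, w1, w2, kernel)
print(len(out), len(out[0]), len(out[0][0]))  # same shape as the input: 4 3 3
```

Because both attention maps pass through a sigmoid, each output value is the input scaled by a factor no greater than 1, so the modules can only attenuate features relative to one another; with trained weights, the less informative channels and positions would be the ones scaled down.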