Computer Science ›› 2025, Vol. 52 ›› Issue (3): 104-111. doi: 10.11896/jsjkx.240700041

• 3D Vision and Metaverse •

3D Object Detection with Dynamic Weight Graph Convolution

LI Zongmin, RONG Guangcai, BAI Yun, XU Chang, XIAN Shiyang

  1. Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
  • Received: 2024-07-08 Revised: 2024-09-09 Online: 2025-03-15 Published: 2025-03-07
  • Corresponding author: LI Zongmin (lizongmin@upc.edu.cn)
  • About author: LI Zongmin, born in 1965, Ph.D., professor, Ph.D. supervisor, is a member of CCF (No. 11175S). His main research interests include computer graphics, digital image processing and pattern recognition.
  • Supported by:
    National Key Research and Development Program of China (2019YFF0301800), National Natural Science Foundation of China (61379106) and Natural Science Foundation of Shandong Province, China (ZR2013FM036, ZR2015FM011).

Abstract: 3D object detection is one of the most critical technologies in autonomous driving, and LiDAR-based 3D object detection usually operates on scenes represented as point clouds. Current methods do not fully exploit the structural information of the point cloud, which leads to false and missed detections. To address this, we propose DEG R-CNN, which is based on dynamically weighted graph convolution. First, primary and secondary neighbours are assigned to each node within an RoI to build a graph over the object's points and recover its geometric information. Then, a Gaussian function and 1D convolution are applied on the graph to aggregate the structural features of the point cloud efficiently. Finally, a cross-attention mechanism adaptively fuses image features of different granularities to supplement the point cloud with image semantic information. Experiments on the KITTI dataset verify the effectiveness of each module: the method reaches a 3D mAP of 88.80%, which is 1.22% higher than that of the baseline model. The 3D detection results are also visualized and analyzed.

Key words: Point clouds, 3D object detection, LiDAR, Multimodal fusion, Autonomous driving
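
The graph aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the primary/secondary neighbour scheme or the 1D convolution, so this sketch assumes plain k-nearest neighbours as a stand-in and shows only distance-dependent Gaussian weighting. The names `knn_graph`, `gaussian_aggregate`, `k` and `sigma` are hypothetical.

```python
import math


def knn_graph(points, k):
    """For each point, return the indices of its k nearest neighbours.

    Assumption: plain kNN stands in for the paper's primary/secondary
    neighbour selection, which the abstract does not specify.
    """
    neighbours = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to point i.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


def gaussian_aggregate(points, feats, k=2, sigma=1.0):
    """Aggregate neighbour features with Gaussian distance weights.

    Each neighbour j of node i gets weight exp(-d_ij^2 / (2 * sigma^2)),
    so closer neighbours contribute more. Weights are normalised to sum to 1.
    """
    graph = knn_graph(points, k)
    aggregated = []
    for i, nbrs in enumerate(graph):
        weights = [
            math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
            for j in nbrs
        ]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed (or no neighbours): keep the node's own feature.
            aggregated.append(list(feats[i]))
            continue
        dim = len(feats[i])
        aggregated.append(
            [
                sum(w * feats[j][d] for w, j in zip(weights, nbrs)) / total
                for d in range(dim)
            ]
        )
    return aggregated


# Toy example: four 3D points, each with a one-dimensional feature.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
feats = [[1.0], [2.0], [3.0], [10.0]]
print(gaussian_aggregate(points, feats, k=2, sigma=1.0))
```

Because the weights depend on the geometry of each neighbourhood rather than on fixed kernel positions, the aggregation adapts to the local point distribution, which is one plausible reading of "dynamically weighted" in the abstract.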

CLC Number:

  • TP391