Review of 3D Object Detection for Autonomous Driving

doi:10.11896/jsjkx.220700090

Abstract

In recent years, with the rapid growth of the autonomous driving industry, 3D object detection, the core of the perception system, has attracted increasing attention and become an active research direction. At the same time, the wide adoption of deep learning has brought major advances to 3D object detection, and a large number of strong algorithms have emerged. This paper systematically summarizes 3D object detection methods for autonomous driving and, according to sensor type, divides existing algorithms into three categories: image-based, LiDAR-based, and multi-sensor-based 3D object detection. It then analyzes the advantages and disadvantages of the three categories in detail, and investigates and further subdivides the LiDAR-based algorithms in depth. Next, it introduces the 3D object detection datasets commonly used in autonomous driving, including KITTI, nuScenes, and Waymo Open Dataset, and compares the performance of recent 3D object detection algorithms on these datasets. Finally, it discusses future directions for 3D object detection technology.

Key words: Autonomous driving, 3D object detection, Deep learning, Point cloud, LiDAR

CLC number: TP391.41

HUO Weile, JING Tao, REN Shuang. Review of 3D Object Detection for Autonomous Driving[J]. Computer Science, 2023, 50(7): 107-118. https://doi.org/10.11896/jsjkx.220700090

