Computer Science, 2023, Vol. 50, Issue (7): 107-118. doi: 10.11896/jsjkx.220700090
霍威乐, 荆涛, 任爽
HUO Weile, JING Tao, REN Shuang
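Abstract: In recent years, with the rapid growth of the autonomous driving industry, 3D object detection, the core of the perception system, has attracted increasing attention and become an active research direction. Meanwhile, the widespread adoption of deep learning has driven major breakthroughs in 3D object detection, and a large number of strong algorithms have emerged. This paper systematically surveys 3D object detection methods for autonomous driving. First, existing algorithms are divided by sensor type into three categories: image-based, LiDAR-based, and multi-sensor-based 3D object detection. Second, the strengths and weaknesses of the three categories are analyzed in detail, and LiDAR-based algorithms are examined in depth and further subdivided. Then, the 3D object detection datasets commonly used in autonomous driving are introduced, including KITTI, nuScenes, and Waymo Open Dataset, and the performance of recent 3D object detection algorithms on these datasets is compared. Finally, future directions for 3D object detection are discussed.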