Advances in 3D Object Detection: A Brief Survey

doi:10.11896/jsjkx.190400142

Abstract

Abstract: Object detection has a wide range of applications and has long been an active research topic in computer vision. In recent years, with the development of deep learning, 3D object detection has achieved significant breakthroughs. Compared with 2D object detection, 3D object detection incorporates depth information and can provide spatial scene information such as the location, orientation, and size of the objects of interest, and it has developed rapidly in autonomous driving and robotics. This paper first gives an overview of deep learning-based 2D object detection algorithms. It then reviews representative and pioneering 3D object detection algorithms, grouped by data acquisition method: image, LiDAR point cloud, and multi-sensor. Next, in the context of autonomous driving, it compares the performance, advantages, and limitations of different 3D object detection algorithms. Finally, it summarizes the application significance of 3D object detection and the problems that remain to be solved, and discusses future research directions and new challenges.

Key words: Computer vision, Deep learning, Object detection

CLC number: TP751

ZHANG Peng, SONG Yi-fan, ZONG Li-bo, LIU Li-bo. Advances in 3D Object Detection: A Brief Survey[J]. Computer Science, 2020, 47(4): 94-102. https://doi.org/10.11896/jsjkx.190400142


