Computer Science ›› 2020, Vol. 47 ›› Issue (4): 94-102. doi: 10.11896/jsjkx.190400142

• Computer Graphics & Multimedia •

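A Survey of Advances in 3D Object Detection

ZHANG Peng, SONG Yi-fan, ZONG Li-bo, LIU Li-bo

  1. School of Information Engineering, Ningxia University, Yinchuan 750021
  • Received: 2019-04-26 Online: 2020-04-15 Published: 2020-04-15
  • Corresponding author: ZHANG Peng (pengzhang123@foxmail.com)
  • Funding:
    Research and Innovation Project of First-class Universities in Western China (ZKZD2017005); Key Research and Development Project of Ningxia Hui Autonomous Region (2018BBF02006); National Natural Science Foundation of China (61862050)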

Advances in 3D Object Detection: A Brief Survey

ZHANG Peng, SONG Yi-fan, ZONG Li-bo, LIU Li-bo   

  1. School of Information Engineering, Ningxia University, Yinchuan 750021, China
  • Received: 2019-04-26 Online: 2020-04-15 Published: 2020-04-15
  • Contact: ZHANG Peng, born in 1975, Ph.D., associate professor, is a member of the China Computer Federation (CCF). His main research interests include intelligent information processing.
  • Supported by:
    This work was supported by the Research and Innovation Projects of First-class Universities in Western China (ZKZD2017005), the Key Research and Development Projects in Ningxia Hui Autonomous Region (2018BBF02006), and the National Natural Science Foundation of China (61862050).

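Abstract: Object detection algorithms are widely applied and have long been a closely watched research topic in computer vision. In recent years, with the development of deep learning, research on object detection in 3D images has made major breakthroughs. Compared with 2D object detection, 3D object detection incorporates depth information and can provide spatial scene information such as the position, orientation, and size of a target, and it has developed rapidly in autonomous driving and robotics. This paper first gives an overview of deep learning-based 2D object detection algorithms. Second, it analyzes representative and pioneering 3D object detection algorithms according to the data acquisition method: images, LiDAR, and multiple sensors. Then, in the context of autonomous driving, it compares the performance, strengths, and limitations of different 3D object detection algorithms. Finally, it summarizes the application significance of 3D object detection and the problems still to be solved, and discusses its development directions and new challenges.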

Keywords: deep learning, computer vision, object detection

Abstract: Object detection is useful in many application scenarios and is one of the most important research topics in computer vision. In recent years, with the development of deep learning, 3D object detection has achieved significant breakthroughs. Compared with 2D object detection, 3D object detection can provide spatial scene information such as the location, orientation, and size of the object of interest, which plays an important role in autonomous driving and robotics research. This paper first summarizes deep learning-based 2D object detection, then reviews recent novel 3D object detection algorithms grouped by data type (image, point cloud, and multi-sensor), and analyzes the performance, advantages, and limitations of typical 3D object detection algorithms in autonomous driving scenarios. Finally, it summarizes the application directions, open research topics, and challenges of 3D object detection.

Key words: Deep learning, Computer vision, Object detection
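
The abstract states that 3D object detection reports the location, orientation, and size of each object. As a rough illustration only, the sketch below shows one common way such an output can be represented: a seven-parameter box (center, dimensions, and a single yaw angle). The class name `Box3D`, its field names, and the z-up axis convention are assumptions made for this example; they are not taken from the paper, and individual datasets and detectors use different conventions.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical 7-parameter 3D box; z is assumed to be the up axis."""
    x: float        # center position
    y: float
    z: float
    length: float   # extent along the box's own forward axis
    width: float    # extent along the box's own side axis
    height: float   # extent along the vertical axis
    yaw: float      # rotation about the vertical axis, in radians

    def corners(self):
        """Return the 8 corners as (x, y, z) tuples in world coordinates."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        points = []
        for dx in (-0.5, 0.5):
            for dy in (-0.5, 0.5):
                for dz in (-0.5, 0.5):
                    # Offset of this corner in the box's local frame.
                    lx = dx * self.length
                    ly = dy * self.width
                    lz = dz * self.height
                    # Rotate about the vertical axis, then translate.
                    points.append((self.x + c * lx - s * ly,
                                   self.y + s * lx + c * ly,
                                   self.z + lz))
        return points

    def volume(self):
        """Volume of the box; independent of position and yaw."""
        return self.length * self.width * self.height
```

For instance, `Box3D(0, 0, 0, 4.0, 2.0, 1.5, 0.0).corners()` yields eight points. A 2D detector, by contrast, would report only an axis-aligned rectangle in the image plane, with no depth or heading.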

CLC number:

  • TP751