Computer Science ›› 2020, Vol. 47 ›› Issue (4): 94-102. doi: 10.11896/jsjkx.190400142

• Computer Graphics & Multimedia •

Advances in 3D Object Detection: A Brief Survey

ZHANG Peng, SONG Yi-fan, ZONG Li-bo, LIU Li-bo   

  1. School of Information Engineering, Ningxia University, Yinchuan 750021, China
  • Received: 2019-04-26 Online: 2020-04-15 Published: 2020-04-15
  • Contact: ZHANG Peng, born in 1975, Ph.D., associate professor, is a member of the China Computer Federation (CCF). His main research interests include intelligent information processing.
  • Supported by:
    This work was supported by the Research and Innovation Projects of First-class Universities in Western China (ZKZD2017005), the Key Research and Development Projects in Ningxia Hui Autonomous Region (2018BBF02006), and the National Natural Science Foundation of China (61862050).

Abstract: Object detection is useful in many application scenarios and is one of the most important research topics in computer vision. In recent years, with the development of deep learning, 3D object detection has achieved significant breakthroughs. Compared with 2D object detection, 3D object detection provides spatial scene information such as the location, orientation, and size of objects of interest, which plays an important role in autonomous driving and robotics research. This paper first summarizes deep learning-based 2D object detection, then reviews recent 3D object detection algorithms organized by input data type (images, point clouds, and multi-sensor data), and analyzes the performance, advantages, and limitations of typical 3D object detection algorithms in autonomous driving scenarios. Finally, it summarizes the application directions, research topics, and open challenges of 3D object detection.
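
To make the phrase "location, orientation, and size" concrete, the following is a minimal sketch of one common way to describe a 3D detection result. It is not taken from the paper: it assumes a seven-parameter box (center, three extents, and a single yaw angle about the vertical z axis), and the names `Box3D` and `box_corners` are hypothetical, introduced only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box3D:
    """Hypothetical 7-parameter box: center, size, and yaw about the z axis."""
    cx: float
    cy: float
    cz: float
    length: float  # extent along the box's local x axis
    width: float   # extent along the box's local y axis
    height: float  # extent along the z axis
    yaw: float     # rotation about z, in radians


def box_corners(box: Box3D) -> List[Tuple[float, float, float]]:
    """Return the 8 corners of the box in world coordinates."""
    c, s = math.cos(box.yaw), math.sin(box.yaw)
    corners = []
    for dx in (-0.5, 0.5):
        for dy in (-0.5, 0.5):
            for dz in (-0.5, 0.5):
                # Offset of this corner from the center, in the box's own frame.
                lx, ly, lz = dx * box.length, dy * box.width, dz * box.height
                # Rotate the offset about z, then translate to the center.
                wx = box.cx + c * lx - s * ly
                wy = box.cy + s * lx + c * ly
                wz = box.cz + lz
                corners.append((wx, wy, wz))
    return corners
```

A 2D detector, by contrast, would typically report only an axis-aligned rectangle in the image plane; the extra depth, extent, and heading parameters are what the abstract refers to as spatial scene information.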

Key words: Computer vision, Deep learning, Object detection

CLC Number: 

  • TP751