Computer Science ›› 2018, Vol. 45 ›› Issue (9): 11-19. doi: 10.11896/j.issn.1002-137X.2018.09.002

• Surveys •

Research Progress of Object Detection Technology Based on Convolutional Neural Network in Deep Learning

WANG Hui-ling1,2, QI Xiao-long1,2, WU Gang-shan2   

  1. Department of Electronics and Information Engineering, Yili Normal University, Yining, Xinjiang 835000, China
  2. Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China
  • Received: 2017-12-12 Online: 2018-09-20 Published: 2018-10-10

Abstract: Object detection is a hot topic in computer vision. In recent years, convolutional neural networks in deep learning have performed prominently in object detection tasks. This paper surveys the research progress of deep learning in object detection. First, two methods of object detection and the commonly used datasets are introduced, and the advantages of deep learning for object detection tasks are analyzed. Second, following the development of deep-learning-based object detection methods, the classical convolutional neural network models used in these methods are introduced, and the characteristics of each network model are analyzed. Then, the methods are analyzed and summarized in terms of their ability to acquire features, their detection speed, and the key technologies they use. Finally, in view of the difficulties and challenges facing deep-learning-based object detection and its future development trends, some reflections and an outlook are offered.

Key words: Convolutional neural network, Deep learning, Object detection

CLC Number: TP183