Computer Science ›› 2018, Vol. 45 ›› Issue (11A): 17-26.

• Review •

Overview: Application of Convolution Neural Network in Object Detection

YU Jin-yong1, DING Peng-cheng2, WANG Chao1   

  1. Department of Control Engineering, Naval Aeronautical University, Yantai, Shandong 264001, China
  2. Postgraduate Team No.5, Naval Aeronautical University, Yantai, Shandong 264001, China
  • Online: 2019-02-26 Published: 2019-02-26

Abstract: As a branch of machine learning, deep learning has been widely applied in various fields and has become a major development direction in speech recognition, natural language processing, information retrieval, and other areas. In image classification and object detection in particular, it has achieved new breakthroughs. This paper first surveys the typical applications of convolutional neural networks in object detection. It then compares several typical convolutional neural network structures and summarizes their advantages and disadvantages. Finally, it discusses the existing problems and future development directions of deep learning.

Key words: Computer vision, Object detection, Deep learning, Convolutional neural networks

CLC Number: 

  • TP751