Computer Science ›› 2016, Vol. 43 ›› Issue (2): 1-8. doi: 10.11896/j.issn.1002-137X.2016.02.001


Research and Advances on Deep Learning

SUN Zhi-yuan, LU Cheng-xiang, SHI Zhong-zhi and MA Gang   

  • Online:2018-12-01 Published:2018-12-01

Abstract: Deep learning (DL) is a recently developed field of machine learning. It tries to mimic the human brain, which can process complex input data quickly, learn different kinds of knowledge intelligently, and solve many kinds of complicated intelligence tasks well. Since the advent of a fast learning algorithm for DL, the machine learning community has seen a surge of research on the theory and applications of DL because of its many advantages. Practice shows that deep learning is a highly efficient feature extraction method that can detect more abstract features and capture the essence of the data, and models constructed with DL tend to have stronger generalization ability. Given these advantages and the wide applications of deep learning, this paper aims to provide a starter guide for novices. It presents a detailed introduction to the background and theoretical principles of deep learning, its typical models, its representative learning algorithms, and its latest progress and applications. Finally, some research directions in deep learning that deserve further study are discussed.

Key words: Deep learning, Machine learning, Deep neural network, Image recognition, Speech recognition, Natural language processing

