Computer Science ›› 2018, Vol. 45 ›› Issue (11A): 155-159.

• Intelligent Computing •

Research on Optimization Algorithm of Deep Learning

TONG Wei-guo, LI Min-xia, ZHANG Yi-ke   

  1. Department of Automation, North China Electric Power University, Baoding, Hebei 071003, China
  • Online: 2019-02-26  Published: 2019-02-26

Abstract: Deep learning is an active research field within machine learning. The training and optimization algorithms of deep learning have likewise attracted close attention and study, and have become an important driving force for the development of artificial intelligence. Starting from the basic structure of the convolutional neural network, this paper introduced the selection of activation functions, the setting of hyperparameters, and the optimization algorithms used in network training. The advantages and disadvantages of each training and optimization algorithm were analyzed and verified using the CIFAR-10 data set as training samples. Experimental results show that appropriate training methods and optimization algorithms can effectively improve the accuracy and convergence of the network. Finally, the best-performing algorithm was applied to image recognition of actual transmission lines and achieved good results.
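The abstract compares several gradient-based training algorithms. As a minimal sketch (not the paper's code, and independent of any particular network or data set), the core update rules of two of the optimizers it evaluates, SGD with momentum and Adam, can be written as follows; the quadratic test function at the end is purely illustrative:

```python
import numpy as np

def sgd_momentum(grad_fn, w, lr=0.01, beta=0.9, steps=1000):
    """SGD with momentum: v <- beta*v - lr*grad(w); w <- w + v."""
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v - lr * grad_fn(w)
        w = w + v
    return w

def adam(grad_fn, w, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Adam (Kingma & Ba): bias-corrected first and second moment estimates."""
    m = np.zeros_like(w)  # first moment (mean of gradients)
    v = np.zeros_like(w)  # second moment (mean of squared gradients)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)   # bias correction
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Toy objective f(w) = ||w||^2 with gradient 2w; both optimizers
# should drive w toward the minimum at the origin.
grad = lambda w: 2 * w
print(sgd_momentum(grad, np.array([3.0, -2.0])))
print(adam(grad, np.array([3.0, -2.0])))
```

In a real CNN training loop the gradient would come from backpropagation over a mini-batch rather than a closed-form function; the update rules themselves are unchanged.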

Key words: Activation function, Convolutional neural network, Deep learning, Hyperparameter, Optimization algorithm, Regularization

CLC Number: TP391