Computer Science ›› 2019, Vol. 46 ›› Issue (7): 180-185.doi: 10.11896/j.issn.1002-137X.2019.07.028

• Artificial Intelligence •

Distributed Convolutional Neural Networks Based on Approximate Newton-type Method

WANG Ya-hui1,2,LIU Bo3,YUAN Xiao-tong1,2   

  1 (Department of Information and Control,Nanjing University of Information Science and Technology,Nanjing 210044,China)
    2 (Jiangsu Key Laboratory of Big Data Analysis Technology,Nanjing 210044,China)
    3 (Department of Computer Science,Rutgers University,New Jersey 08854,USA)
  Received: 2018-08-02  Online: 2019-07-15  Published: 2019-07-15

Abstract: Most machine learning problems can ultimately be reduced to optimization problems (model learning). Optimization mainly uses mathematical methods to study the best ways of solving various problems, and it plays an increasingly important role in scientific computing and engineering analysis. With the rapid development of deep networks, the scale of data and parameters grows as well. Although significant advances have been made in GPU hardware, network architectures and training methods in recent years, it is still difficult for a single machine to train deep network models efficiently on large data sets. The distributed approximate Newton-type method is one effective way to solve this problem, and this paper introduces it into the study of distributed neural networks. The method distributes the training samples evenly across multiple machines, which reduces the amount of data each machine has to process, and the machines communicate with one another to complete the training task. This paper proposes distributed deep learning based on the approximate Newton-type method, using the DANE algorithm to train the same network on multiple GPUs. Each time the number of GPUs doubles, the training time is cut nearly in half. This is consistent with the ultimate goal: on the premise of preserving estimation accuracy, the approximate Newton-type algorithm is implemented on an existing distributed framework and used to train neural networks in a distributed manner, improving training efficiency.
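For intuition about the training scheme, the following is a minimal single-process sketch of the DANE iteration on distributed ridge regression, the convex setting in which the algorithm was originally analyzed (Shamir et al., reference [17]). This is an illustrative sketch, not the paper's implementation: the function name dane_ridge, the simulated worker shards and the hyperparameter values are all hypothetical assumptions, and the paper's experiments instead apply the scheme to convolutional networks on a GPU cluster. Each iteration uses two communication rounds, one to average local gradients and one to average local solutions; for a quadratic local loss with Hessian H_i, the local subproblem has the closed form w_i = w_t - eta * (H_i + mu*I)^{-1} * g, which is the approximate Newton step that gives the method its name.

```python
import numpy as np

def dane_ridge(X_parts, y_parts, lam=1e-2, mu=1e-1, eta=1.0, iters=20):
    """Single-process simulation of DANE for ridge regression.

    Each (X_i, y_i) shard plays the role of the data held by one worker;
    in a real deployment the per-shard steps run in parallel and only
    gradients and iterates are communicated.
    """
    d = X_parts[0].shape[1]
    w = np.zeros(d)
    # Local regularized Hessians H_i = X_i^T X_i / n_i + lam*I and
    # linear terms b_i = X_i^T y_i / n_i of the local quadratic losses.
    Hs = [Xi.T @ Xi / len(Xi) + lam * np.eye(d) for Xi in X_parts]
    bs = [Xi.T @ yi / len(Xi) for Xi, yi in zip(X_parts, y_parts)]
    for _ in range(iters):
        # Communication round 1: average the local gradients.
        g = np.mean([H @ w - b for H, b in zip(Hs, bs)], axis=0)
        # Each worker solves its local subproblem. For quadratic losses
        # this reduces to the closed form w_i = w - eta*(H_i + mu*I)^{-1} g.
        w_locals = [w - eta * np.linalg.solve(H + mu * np.eye(d), g)
                    for H in Hs]
        # Communication round 2: average the local solutions.
        w = np.mean(w_locals, axis=0)
    return w

# Hypothetical usage: 4000 synthetic samples split evenly across 4 simulated
# workers, mirroring the even data distribution described in the abstract.
rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=4000)
w_hat = dane_ridge(np.split(X, 4), np.split(y, 4))
```

Because each iteration communicates only two vectors per worker (a gradient and a local solution) while the expensive Hessian work stays local, adding workers mainly divides the per-machine data volume, which is why doubling the number of GPUs can nearly halve wall-clock training time in the data-parallel regime reported above.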

Key words: Approximate Newton-type method, Distributed framework, Neural network, Optimization problem

CLC Number: TP181
[1]GANDHI A,THOTA S,DUBE P,et al.Autoscaling for Hadoop Clusters[C]∥IEEE International Conference on Cloud Engineering.IEEE,2016:109-118.
[2]YUAN Y,SALMI M F,YIN H,et al.Spark-GPU:An accelerated in-memory data processing engine on clusters[C]∥IEEE International Conference on Big Data.IEEE,2017:273-283.
[3]SAMADDAR S,SINHA R,DE R K.A Model for Distributed Processing and Analyses of NGS Data under Map-Reduce Paradigm[J].IEEE/ACM Transactions on Computational Biology & Bioinformatics,2018,PP(99):1.
[4]NASR M M,SHAABAN E M,HAFEZ A M.Building Sentiment analysis Model using Graphlab[J].International Journal of Scientific & Engineering Research,2017,8(6):1155-1160.
[5]JIANG J,CUI B,ZHANG C,et al.Heterogeneity-aware Distributed Parameter Servers[C]∥ACM International Conference.ACM,2017:463-478.
[6]CHEN T,LI M,LI Y,et al.Mxnet:A flexible and efficient machine learning library for heterogeneous distributed systems[J].arXiv preprint arXiv:1512.01274,2015.
[7]ZINKEVICH M,WEIMER M,SMOLA A J,et al.Parallelized Stochastic Gradient Descent[C]∥Advances in Neural Information Processing Systems 23,Conference on Neural Information Processing Systems 2010.DBLP,2010:2595-2603.
[8]ZHANG Y,DUCHI J C,WAINWRIGHT M J.Communication-efficient algorithms for statistical optimization[C]∥International Conference on Neural Information Processing Systems.Curran Associates Inc.,2012:1502-1510.
[9]GUPTA S,ZHANG W,WANG F.Model Accuracy and Runtime Tradeoff in Distributed Deep Learning:A Systematic Study[C]∥IEEE International Conference on Data Mining.IEEE,2017:171-180.
[10]SHALEV-SHWARTZ S,SHAMIR O,SREBRO N,et al.Stochastic convex optimization[C]∥Annual Conference on Learning Theory.2009.
[11]SRIDHARAN K,SHALEV-SHWARTZ S,SREBRO N.Fast rates for regularized objectives[C]∥Advances in Neural Information Processing Systems.2009:1545-1552.
[12]NAJAFABADI M M,KHOSHGOFTAAR T M,VILLANUSTRE F,et al.Large-scale distributed L-BFGS[J].Journal of Big Data,2017,4(1):22.
[13]ERSEGHE T.Distributed Optimal Power Flow Using ADMM[J].IEEE Transactions on Power Systems,2014,29(5):2370-2380.
[14]TAYLOR G,BURMEISTER R,XU Z,et al.Training neural networks without gradients:a scalable ADMM approach[C]∥International Conference on International Conference on Machine Learning.JMLR.org,2016:2722-2731.
[15]WANG Y,YIN W,ZENG J.Global convergence of ADMM in nonconvex nonsmooth optimization[J].Journal of Scientific Computing,2015(1-2):1-35.
[16]FENG X,CHANG L,LIN X,et al.Distributed computing connected components with linear communication cost[J].Distributed and Parallel Databases,2018,36(3):555-592.
[17]SHAMIR O,SREBRO N,ZHANG T.Communication-efficient distributed optimization using an approximate Newton-type method[C]∥International Conference on Machine Learning.JMLR.org,2014:II-1000.
[18]ZHANG Y,WAINWRIGHT M J,DUCHI J C.Communication-efficient algorithms for statistical optimization[C]∥Advances in Neural Information Processing Systems.2012:1502-1510.
[19]LI M.Scaling Distributed Machine Learning with the Parameter Server[C]∥International Conference on Big Data Science and Computing.ACM,2014:3.
[20]CHAUDHARI P,BALDASSI C,ZECCHINA R,et al.Parle:parallelizing stochastic gradient descent[J].arXiv preprint arXiv:1707.00424,2017.