Computer Science ›› 2018, Vol. 45 ›› Issue (11A): 527-531.

• Interdiscipline & Application •

Design and Implementation of Distributed TensorFlow Platform Based on Kubernetes

YU Chang-fa, CHEN Xue-lin, YANG Xiao-hu   

  1. School of Software Technology, Zhejiang University, Hangzhou 310027, China
  • Online: 2019-02-26 Published: 2019-02-26

Abstract: This paper presents the design and implementation of a distributed deep learning platform based on Kubernetes. To address the complex environment configuration of distributed TensorFlow, the uneven allocation of underlying physical resources, low model-training efficiency, and long development cycles, a method for running containerized TensorFlow on Kubernetes is proposed. The approach combines the strengths of both systems: Kubernetes provides a stable and reliable computing environment, while TensorFlow's support for heterogeneous hardware is fully exploited, which greatly lowers the barrier to large-scale use. In addition, an agile management platform is built that supports rapid allocation of distributed TensorFlow resources, one-click deployment, startup within seconds, dynamic scaling, and efficient training.

Key words: Deep learning, Docker, Kubernetes, TensorFlow

CLC Number: 

  • TP311
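
As a rough illustration of the containerized deployment the abstract describes, the sketch below generates one Kubernetes Pod manifest per TensorFlow task (parameter server or worker) and passes the cluster layout to each container through the `TF_CONFIG` environment variable. This is not the authors' implementation: the pod names (`tf-ps-0`, `tf-worker-0`), the port 2222, the labels, the image name, and the use of `TF_CONFIG` are all assumptions made for the example. It also assumes that each pod name resolves as a hostname inside the cluster, for example through a matching Service, which is not shown.

```python
import json


def build_cluster_spec(num_ps, num_workers, port=2222):
    """Map each TensorFlow job name to its list of host:port addresses.

    The host names are hypothetical; they are assumed to resolve inside
    the Kubernetes cluster (for example via one Service per task).
    """
    return {
        "ps": [f"tf-ps-{i}:{port}" for i in range(num_ps)],
        "worker": [f"tf-worker-{i}:{port}" for i in range(num_workers)],
    }


def build_pod_manifest(job, index, cluster,
                       image="tensorflow/tensorflow", port=2222):
    """Build a Pod manifest (as a plain dict) for a single task."""
    # TF_CONFIG tells the process the full cluster layout and its own role.
    tf_config = {"cluster": cluster, "task": {"type": job, "index": index}}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"tf-{job}-{index}",
            "labels": {"app": "distributed-tf", "job": job},
        },
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [{
                "name": "tensorflow",
                "image": image,  # assumed image; replace with a training image
                "ports": [{"containerPort": port}],
                "env": [{"name": "TF_CONFIG",
                         "value": json.dumps(tf_config)}],
            }],
        },
    }


def build_all_manifests(num_ps, num_workers):
    """Return manifests for every parameter server and worker task."""
    cluster = build_cluster_spec(num_ps, num_workers)
    return [
        build_pod_manifest(job, index, cluster)
        for job, hosts in cluster.items()
        for index in range(len(hosts))
    ]


if __name__ == "__main__":
    # One parameter server and two workers, printed as JSON documents.
    for manifest in build_all_manifests(num_ps=1, num_workers=2):
        print(json.dumps(manifest, indent=2))
```

Each printed document could be saved to a file and submitted with `kubectl apply -f`. Generating the manifests from two integers is one simple way to approximate one-click deployment and dynamic scaling, since changing the worker count regenerates a consistent cluster specification for every task.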