Computer Science ›› 2021, Vol. 48 ›› Issue (2): 47-54. doi: 10.11896/jsjkx.200800187

• New Distributed Computing Technologies and Systems •

Convolutional Optimization Algorithm Based on Distributed Coding

YUAN Chen-yu, XIE Zai-peng, ZHU Xiao-rui, QU Zhi-hao, XU Yuan-yuan   

  1. School of Computer and Information, Hohai University, Nanjing 211100, China
  • Received: 2020-08-28  Revised: 2020-11-07  Online: 2021-02-15  Published: 2021-02-04
  • About author: YUAN Chen-yu, born in 1998, postgraduate. His main research interests include distributed computing.
    XIE Zai-peng, born in 1982, Ph.D, associate professor, is a member of China Computer Federation. His main research interests include distributed systems, embedded systems, and the Internet of Things.
  • Supported by:
    The National Key R&D Program of China (2016YFC0402710) and the Key Project of the National Natural Science Foundation of China (61832005).

Abstract: Convolution plays a vital role in statistics, signal processing, image processing, and deep learning. It is also a core operation in deep neural networks, where it underlies information filtering and feature extraction. Methods for speeding up convolution have become an active research topic in recent years. Many studies have shown that distributed computing frameworks can reduce the computation time of convolution and thereby improve the training efficiency of deep learning, but stragglers in a distributed system may slow down the system as a whole. This paper proposes a distributed-coding-based Winograd algorithm for 2D convolution. In this algorithm, the Winograd method accelerates the 2D convolution computation itself, while distributed coding mitigates the impact of stragglers through a redundancy-based encoding strategy. The proposed algorithm therefore addresses the straggler problem while speeding up the underlying computation, and may substantially improve the efficiency of distributed 2D convolution.
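The following sketch is a minimal, hypothetical illustration of the two ideas the abstract combines, not the authors' implementation: it computes one F(2x2, 3x3) Winograd output tile in Python/NumPy, then uses a toy (3, 2) coded split of two filters to show how redundancy lets a master ignore one straggling worker. The helper name winograd_tile, the share labels, and the single-tile, single-machine setting are assumptions made for illustration.

    import numpy as np

    # Standard Winograd F(2,3) transform matrices (as in Lavin and Gray's
    # fast CNN algorithm); nesting them yields F(2x2, 3x3) for 2D tiles.
    B_T = np.array([[1,  0, -1,  0],
                    [0,  1,  1,  0],
                    [0, -1,  1,  0],
                    [0,  1,  0, -1]], dtype=float)
    G = np.array([[1.0,  0.0, 0.0],
                  [0.5,  0.5, 0.5],
                  [0.5, -0.5, 0.5],
                  [0.0,  0.0, 1.0]])
    A_T = np.array([[1, 1,  1,  0],
                    [0, 1, -1, -1]], dtype=float)

    def winograd_tile(d, g):
        """2x2 output of a 3x3 'valid' correlation over one 4x4 input tile,
        using 16 elementwise multiplies instead of the direct method's 36."""
        U = G @ g @ G.T        # transformed 3x3 filter -> 4x4
        V = B_T @ d @ B_T.T    # transformed 4x4 tile   -> 4x4
        return A_T @ (U * V) @ A_T.T

    # Sanity check against direct sliding-window correlation.
    rng = np.random.default_rng(0)
    d = rng.random((4, 4))
    g = rng.random((3, 3))
    direct = np.array([[np.sum(d[i:i + 3, j:j + 3] * g) for j in range(2)]
                       for i in range(2)])
    assert np.allclose(winograd_tile(d, g), direct)

    # Toy straggler mitigation: convolution is linear in the filter, so two
    # filters W1, W2 can be encoded as three shares W1, W2, W1 + W2 and
    # handed to three workers; any two finished shares recover both
    # results, so one straggler can simply be ignored.
    W1, W2 = rng.random((3, 3)), rng.random((3, 3))
    shares = {"s1": W1, "s2": W2, "s3": W1 + W2}
    done = {k: winograd_tile(d, w)
            for k, w in shares.items() if k != "s2"}   # "s2" straggles
    recovered_W2 = done["s3"] - done["s1"]             # decode by linearity
    assert np.allclose(recovered_W2, winograd_tile(d, W2))

In a full distributed deployment, each coded share would be a block of the filter bank dispatched to a separate worker, and the master would decode from whichever subset of results arrives first.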

Key words: Convolution, Distributed coding, Distributed computing, Winograd

CLC Number: TP338.8