Computer Science, 2021, Vol. 48, Issue (3): 246-258. doi: 10.11896/jsjkx.201100038

• Computer Network •

State-of-the-art Survey on Reconfigurable Data Center Networks

ZHANG Deng-ke1, WANG Xing-wei1, HE Qiang2, ZENG Rong-fei3, YI Bo1

  1 College of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
  2 College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110819, China
  3 College of Software, Northeastern University, Shenyang 110819, China
  • Received: 2020-11-04 Revised: 2021-01-27 Online: 2021-03-15 Published: 2021-03-05
  • Corresponding author: WANG Xing-wei (wangxw@mail.neu.edu.cn)
  • About author: ZHANG Deng-ke, born in 1981, Ph.D candidate. His main research interests include network design and optimization of hyperscale data centers and spectral graph theory.
    WANG Xing-wei, born in 1968, Ph.D, professor, Ph.D supervisor. His main research interests include NGI, mobile wireless Internet, IP/DWDM optical Internet, etc.
    E-mail: zhangdk@mail.neu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61872073, 61572123).

Abstract: Hyper-scale data centers have become key infrastructure of the digital society. The surge in user applications has caused east-west traffic in Data Center Networks (DCNs) to grow exponentially, while the diversification of those applications has led to severe traffic skew. Meanwhile, with the arrival of the post-Moore era and the breakdown of Dennard scaling, the growth of network equipment capacity has slowed. DCNs thus face combined pressure from the surge of users, traffic skew, and the CMOS performance wall. Reconfigurable Data Center Networks (RDCNs) have emerged to address these problems. This paper first presents five motivations for RDCN research and summarizes two types of enabling physical-layer technologies. It then describes the classification of RDCN research and elaborates on the state of research on key techniques in the three design spaces: link-level reconfiguration, layer-level reconfiguration, and topology-level reconfiguration. Next, it briefly reviews progress in RDCN theory. Finally, it outlines future research directions and concludes the paper.

Key words: Enabling technologies, Layer-level reconfiguration, Link-level reconfiguration, Reconfigurable data center networks, Topology-level reconfiguration

CLC Number: TP393