Computer Science (计算机科学), 2023, Vol. 50, Issue (11): 348-355. doi: 10.11896/jsjkx.230300171
夏景旋, 申国伟, 郭春, 崔允贺
XIA Jingxuan, SHEN Guowei, GUO Chun, CUI Yunhe
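Abstract: With the rapid development of computing power networks, computing resources such as general-purpose computing, artificial intelligence computing, and supercomputing have become widely distributed. Collaborative service of computing resources is a key problem in computing power network research. During such collaboration, a computing power network must, on the one hand, handle highly concurrent requests and low-latency response requirements from massive numbers of terminals; on the other hand, it struggles to fully exploit the high throughput and low latency of data center computing resources, and therefore struggles to provide users with efficient computing services. To address these challenges, this paper proposes a User-Space Proxy System (USPS) based on a user-space protocol stack and Remote Direct Memory Access (RDMA). USPS answers highly concurrent client computing requests through the user-space protocol stack and, coordinated by a dynamic batching strategy, delivers high-throughput, low-latency data center computing services over RDMA. For communication, USPS implements an efficient Remote Procedure Call (RPC) mechanism that makes full use of RDMA NIC bandwidth to provide high-speed message communication. For request processing, it proposes a dynamic batch scheduling method that maximizes batching efficiency while meeting users' latency requirements. Experimental results show that the service response latency of USPS is only 7.8%~23.1% of that of the traditional kernel-space Nginx proxy system and 17.3%~24.7% of that of other user-space proxy systems; its throughput improves by a factor of 3.4~8.9 over the traditional kernel-space Nginx proxy system and by a factor of 3.2~4.2 over other user-space proxy systems.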
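The abstract does not specify the scheduling algorithm, so the following is only a minimal sketch of one common way to batch under a latency bound, not the method used in USPS. It assumes each request carries an absolute deadline and that processing a batch takes a fixed estimated time; the names `Request`, `DynamicBatcher`, `max_batch`, and `service_ms` are hypothetical and chosen for illustration. The batcher keeps accumulating requests and flushes either when the batch is full or when waiting any longer would miss the tightest pending deadline.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Request:
    req_id: int
    arrival_ms: float
    deadline_ms: float  # absolute time by which the response is due (assumed input)


@dataclass
class DynamicBatcher:
    max_batch: int      # hypothetical cap on requests per batch
    service_ms: float   # hypothetical estimate of the time to process one batch
    pending: List[Request] = field(default_factory=list)

    def add(self, req: Request) -> None:
        """Queue a request for the next batch."""
        self.pending.append(req)

    def flush_time_ms(self) -> float:
        """Latest send time that still meets the tightest pending deadline."""
        return min(r.deadline_ms for r in self.pending) - self.service_ms

    def poll(self, now_ms: float) -> List[Request]:
        """Return a batch if it is full or cannot wait any longer, else []."""
        if not self.pending:
            return []
        if len(self.pending) >= self.max_batch or now_ms >= self.flush_time_ms():
            batch = self.pending[: self.max_batch]
            self.pending = self.pending[self.max_batch:]
            return batch
        return []


if __name__ == "__main__":
    batcher = DynamicBatcher(max_batch=3, service_ms=2.0)
    batcher.add(Request(1, arrival_ms=0.0, deadline_ms=10.0))
    batcher.add(Request(2, arrival_ms=1.0, deadline_ms=12.0))
    # Before the flush time the batcher keeps waiting for more requests.
    print([r.req_id for r in batcher.poll(now_ms=5.0)])
    # At the flush time the partial batch is released to meet the deadline.
    print([r.req_id for r in batcher.poll(now_ms=8.0)])
```

Waiting until the last safe moment lets batches grow as large as the latency budget allows, which is the trade-off the abstract describes; a real system would also need to estimate `service_ms` as a function of batch size.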