Computer Science ›› 2013, Vol. 40 ›› Issue (3): 104-106.

• 2012 Multi-Valued Logic Special Column •

Research of Message Scalable Technology over InfiniBand Network

Peng Longgen (彭龙根), You Hongtao (尤洪涛), Yin Wanwang (尹万旺)

  1. (National Research Center of Parallel Computer Engineering and Technology, Beijing 100080)
  • Online: 2018-11-16 Published: 2018-11-16

Abstract: InfiniBand is one of the mainstream interconnect networks for HPC systems. Its reliable connection transport service supports RDMA, atomic operations, and related features, and is therefore widely used by MPI and similar parallel programming models. However, the memory cost of the message queues and buffers needed to support reliable connections rises sharply as the scale of parallelism grows, which limits how far applications can scale. To address the message scalability problem caused by this memory overhead, this paper first introduces two InfiniBand transport optimizations, the shared receive queue and extended reliable connection, and then proposes a group connection technique based on the parallel communication model. Together, these techniques reduce the memory overhead on a node by two orders of magnitude, and the overhead no longer grows noticeably as the scale of parallelism increases.

Key words: Scalable, Shared receive queue, Group connection, InfiniBand
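
The scaling argument in the abstract can be illustrated with a rough cost model. The sketch below is not taken from the paper: the function names, the cost formula, and every parameter value are hypothetical, chosen only to show why per-connection receive buffers grow with the number of peers while a shared receive queue keeps the buffer term fixed.

```python
def rc_memory_bytes(peers, qp_bytes, recv_depth, buf_bytes):
    """Per-process memory when every reliable connection owns its own
    receive queue: both queue state and buffers scale with the peer count."""
    return peers * (qp_bytes + recv_depth * buf_bytes)


def srq_memory_bytes(peers, qp_bytes, srq_depth, buf_bytes):
    """Per-process memory when all connections draw receive buffers from one
    shared receive queue: only the queue-pair state scales with the peer count."""
    return peers * qp_bytes + srq_depth * buf_bytes


if __name__ == "__main__":
    # All values below are illustrative assumptions, not measurements.
    QP_BYTES = 1000      # assumed per-connection queue state
    BUF_BYTES = 1000     # assumed size of one receive buffer
    RECV_DEPTH = 10      # assumed buffers posted per connection
    SRQ_DEPTH = 100      # assumed buffers posted to the shared queue

    for peers in (1000, 10000, 100000):
        rc = rc_memory_bytes(peers, QP_BYTES, RECV_DEPTH, BUF_BYTES)
        srq = srq_memory_bytes(peers, QP_BYTES, SRQ_DEPTH, BUF_BYTES)
        print(f"peers={peers:>6}  per-connection={rc:>13}  shared={srq:>13}")
```

Even in this simplified model the shared-queue cost keeps a term proportional to the number of connections, so it only removes the buffer growth. Reducing the connection count itself is the part that techniques such as extended reliable connection and connection grouping would have to address.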
