计算机科学 ›› 2018, Vol. 45 ›› Issue (10): 291-294.doi: 10.11896/j.issn.1002-137X.2018.10.054

• 交叉与前沿 • 上一篇    下一篇

基于GPU的分子动力学模拟Cell Verlet算法实现及其并行性能分析

张帅1,2, 徐顺1,3, 刘倩1,3, 金钟1,3   

  1. 中国科学院计算机网络信息中心 北京100190 1
    中国科学院大学 北京100049 2
    中国科学院计算科学应用研究中心 北京100190 3
  • 收稿日期:2017-12-16 出版日期:2018-11-05 发布日期:2018-11-05
  • 作者简介:张 帅(1990-),男,硕士生,主要研究方向为GPU并行优化;徐 顺(1980-),男,博士,CCF会员,主要研究方向为高性能计算与分子模拟,E-mail:xushun@sccas.cn;刘 倩(1981-),女,博士,副研究员,主要研究方向为高性能计算;金 钟(1974-),男,博士,研究员,CCF会员,主要研究方向为高性能计算化学,E-mail:zjin@sccas.cn(通信作者)。
  • 基金资助:
    国家重点研发计划:高能物理高性能计算应用软件系统规模化应用(2017YFB0203203),中国科学院科研信息化工程项目:生物医药计算模拟软件(XXH13506-404),中国科学院青年创新促进会基金(会员号2016156)资助

Cell Verlet Algorithm of Molecular Dynamics Simulation Based on GPU and Its Parallel Performance Analysis

ZHANG Shuai1,2, XU Shun1,3, LIU Qian1,3, JIN Zhong1,3   

  1. Computer Network Information Center,Chinese Academy of Sciences,Beijing 100190,China 1
    University of Chinese Academy of Sciences,Beijing 100049,China 2
    Center of Scientific Computing Applications & Research,Chinese Academy of Sciences,Beijing 100190,China 3
  • Received:2017-12-16 Online:2018-11-05 Published:2018-11-05

摘要: 分子动力学模拟存在空间和时间的复杂性,并行加速分子的模拟过程尤为重要。基于GPU硬件数据并行架构的特点,组合分子动力学模拟的原子划分和空间划分的并行策略,优化实现了短程作用力计算Cell Verlet算法,并对分子动力学核心基础算法的GPU实现做了优化和性能分析。Cell Verlet算法实现首先采用原子划分的方式,将每个粒子的模拟计算任务映射到每个GPU线程,并采用空间划分的方式将模拟区域进行元胞划分,建立元胞索引表,实现粒子在模拟空间的实时定位;而在计算粒子间的作用力时,引入希尔伯特空间填充曲线方法来保持数据的线性存储与数据的三维空间分布的局部相关性,以便通过缓存加速GPU的全局内存访问;也利用了访存地址对齐和块内共享等技术来优化设计GPU分子动力学模拟过程。实例测试与对比分析显示,当前的算法实现具有强可扩展性和加速比等优势。

关键词: Cell Verlet算法, GPU异构计算, 访存局部性, 分子动力学, 互斥同步优化

Abstract: Molecular dynamics simulation has complexity in spatial and temporal scale,therefore it is critical to optimize the simulation process of molecular dynamics.Based on the characteristics of data parallel on GPU hardware architecture,this paper combined atomic partition and spatial partition of molecular dynamics simulation and optimized the Cell Verlet algorithm for the short-range force calculation,and also designed the core and basic algorithms of molecular dynamics on GPU with optimization and performance analysis.The implementation of Cell Verlet algorithm in this paper is started with atomic partition,each task of particle simulation is mapped to each GPU thread,and then the simulated space is divided into cellular area in the spatial partition way.The cellular index table is established,and the real-time locating simulation particles are realized.Meanwhile,in the force calculation between particles,Hilbert’s space-filling curve algorithm was introduced to keep the partial correlation of particle’s space layout in line with linear data storage,in order to cache and accelerate the access on GPU global memory.This paper also used the memory address alignment and block sharing memory technology to optimize the design of GPU molecular dynamics simulation process.Testing and comparative analysis of examples show that the current implementation of algorithm has advantages in high parallelism and speedup.

Key words: Cell Verlet algorithm, GPU heterogeneous computing, Locality of memory access, Molecular dynamics, Mutex and synchronization optimization

中图分类号: 

  • TP338.6
[1]SCHLICK T,COLLEPARDOGUEVARA R,HALVORSEN L A,et al.Biomolecular modeling and simulation:a field coming of age[J].Quarterly Reviews of Biophysics,2011,44(2):191.
[2]YUKO O.Molecular simulations in generalised ensemble[J]. Molecular Simulation,2012,38(14-15):1282-1296.
[3]BROWN W M,WANG P,PLIMPTON S J,et al.Implementing molecular dynamics on hybrid high performance computers-short range forces[J].Computer Physics Communications,2011,182(4):898-911.
[4]ABRAHAM M J,MURTOLA T,SCHULZ R,et al.GRO- MACS:High performance molecular simulations through multi-level parallelism from laptops to supercomputers[J].Softwarex,2015,1-2(C):19-25.
[5]VERLET L.Computer “Experiments” on Classical Fluids.I. Thermodynamical Properties of Lennard-Jones Molecules[J].Health Physics,1967,22(1):79-85.
[6]MATTSON W,RICE B M.Near-neighbor calculations using a modified cell-linked list method[J].Computer Physics Communications,1999,119(2-3):135-148.
[7]GLASER J,NGUYEN T D,ANDERSON J A,et al.Strong scaling of general-purpose molecular dynamics simulations on GPUs[J].Computer Physics Communications,2015,192:97-107.
[8]FEI H,ZHANG Y Q,WANG K,et al.Parallel Algorithm and Implementation for Molecular Dynamics Simulation Based on GPU [J].Computer Science,2011,38(9):276-278.(in Chinese)
费辉,张云泉,王可,等.基于GPU的分子动力学模拟并行化及实现[J].计算机科学,2011,38(9):276-278.
[9]CHEN F G,GE W,LI J H.Implementation of Complex Multiphase Flow Molecular Dynamics Simulation on GPU [J].Chinese Science (Series B:Chemistry),2008,38(12):1120-1128.(in Chinese)
陈飞国,葛蔚,李静海.复杂多相流动分子动力学模拟在GPU上的实现[J].中国科学:B辑,2008,38(12):1120-1128.
[10]PLIMPTON S.Fast parallel algorithms for short-range molecular dynamics[J].Journal of Computational Physics,1995,117(1):1-19.
[11]PLIMPTON S,HENDRICKSON B.A new parallel method for molecular dynamics simulation of macromolecular systems[J].Journal of Computational Chemistry,1994,17(3):326-337.
[12]GREST G S,DNWEG B,KREMER K.Vectorized link cell Fortran code for molecular dynamics simulations for a large number of particles[J].Computer Physics Communications,1989,55(3):269-285.
[13]HILBERT D.ber die stetige Abbildungeiner Linie auf ein Flchenstück[M]∥Dritter Band:Analysis·Grundlagen der Mathematik·Physik Verschiedenes.Berlin:Springer,1935,38(3):459-460.
[14]BUTZ A R.Alternative Algorithm for Hilbert’s Space-Filling Curve[J].IEEE Computer Society,1971,20(4):424-426.
[1] 金明灿,胡长军,李建江,苗庆松.
基于改进细胞链表算法的分子动力学模拟性能优化模型
Optimization Model for Performance of Molecular Dynamics Simulation Based on Modified Cell-linked List Method
计算机科学, 2013, 40(2): 12-15.
[2] 费辉,张云泉,王可,许亚武.
基于GPU的分子动力学模拟并行化及实现
Parallel Algorithm and Implementation for Molecular Dynamics Simulation Based on GPU
计算机科学, 2011, 38(9): 276-278.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!