Computer Science ›› 2023, Vol. 50 ›› Issue (6): 58-65. doi: 10.11896/jsjkx.230200213

• High Performance Computing •

CP2K Software Porting and Optimization Based on Domestic c86 Processor

FAN Lilin1,2,3, QIAO Yihang1,2,3, LI Junfei4, CHAI Xuqing1,2,3, CUI Rongpei1,2,3, HAN Bingyu2,3,5

    1 School of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007, China
    2 High Performance Computing Center, Henan Normal University, Xinxiang, Henan 453007, China
    3 Henan Engineering Laboratory of Intelligent Business and Internet of Things Technology, Xinxiang, Henan 453007, China
    4 School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
    5 College of Software, Henan Normal University, Xinxiang, Henan 453007, China
  • Received: 2023-02-27 Revised: 2023-04-11 Published: 2023-06-15 Online: 2023-06-06
  • Corresponding author: CHAI Xuqing (cxq@htu.edu.cn)
  • About author: FAN Lilin (fanlilin@htu.edu.cn), born in 1970, Ph.D, associate professor, master's supervisor. His main research interests include intelligent business and intelligent transportation. CHAI Xuqing, born in 1982, senior engineer, master's supervisor. Her main research interests include high-performance computing and machine learning.
  • Supported by:
    Guanghe Fund Class B (20210702202107022768, 20210702202107022686), Henan Province Higher Education Teaching Reform Research and Practice Project (2021SJGLX354), China University Industry-University-Research Innovation Fund, New Generation Information Technology Innovation Project (2020ITA07040), and Industry-University Cooperative Education Project (202102089014, 202102533043).

Abstract: CP2K is currently the fastest open-source software for first-principles materials calculation and simulation, and the parts of its source code that drive the coprocessor are written for the CUDA architecture. Because the underlying hardware architecture and compilation environment differ, the native CP2K software cannot use the DCU on the domestic c86 processor platform and therefore cannot run across platforms. To solve this problem, a scheme for porting CP2K to this platform is proposed. The core idea is to analyze the code of the DBCSR library in CP2K, which is implemented mainly on top of CUDA interfaces, to work out how the corresponding structures and classes are encapsulated, and to re-implement and wrap them according to the HIP programming standard. The HIP version of the DBCSR library is then compiled and installed on the domestic c86 processor platform and linked into CP2K, yielding a DCU version of CP2K that runs on the platform. Two test cases are then selected for optimization experiments at the compile level and the runtime level. The experiments show that removing the FFTW library automatically installed by the CP2K toolchain scripts improves the accuracy of the calculation results. The experimental results show that the optimization methods used significantly improve the computational efficiency and accuracy of CP2K, and contribute to the porting and optimization of open-source software for domestic platforms and to domestic substitution.

Key words: CP2K, DBCSR, Compilation optimization, MPI runtime optimization, HIP porting, Just-in-time compilation
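
The abstract describes re-implementing CUDA-based interfaces according to the HIP programming standard. The paper's actual DBCSR changes are not reproduced on this page, so the following is only a minimal illustrative sketch of the identifier-level part of such a port, assuming a small hand-written mapping table. The names `CUDA_TO_HIP`, `cuda_to_hip`, and `unmapped_cuda_names` are hypothetical helpers introduced here; the CUDA and HIP runtime names in the table are real, but a real port covers far more of the API and also has to deal with kernels, build scripts, and library wrappers.

```python
import re

# Hypothetical, deliberately tiny mapping table (an assumption for illustration).
# Each CUDA runtime name is paired with its HIP counterpart.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaFree": "hipFree",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaStream_t": "hipStream_t",
    "cudaError_t": "hipError_t",
    "cudaSuccess": "hipSuccess",
}

# Match whole identifiers only; longer names are listed first so that
# "cudaMemcpyHostToDevice" is never treated as "cudaMemcpy" plus a suffix.
_NAMES = sorted(CUDA_TO_HIP, key=len, reverse=True)
_PATTERN = re.compile(r"\b(" + "|".join(re.escape(n) for n in _NAMES) + r")\b")


def cuda_to_hip(source: str) -> str:
    """Rename every known CUDA identifier in `source` to its HIP counterpart."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)


def unmapped_cuda_names(source: str) -> list:
    """List CUDA-style identifiers that survive conversion and need manual work."""
    return sorted(set(re.findall(r"\bcuda[A-Z]\w*", cuda_to_hip(source))))


if __name__ == "__main__":
    snippet = (
        "cudaError_t e = cudaMemcpy(d, h, n, cudaMemcpyHostToDevice); "
        "cudaStreamCreate(&s);"
    )
    print(cuda_to_hip(snippet))
    print(unmapped_cuda_names(snippet))
```

Reporting the identifiers that remain unmapped is a design choice of this sketch: it makes the manual part of a port visible instead of silently leaving CUDA calls in the converted source.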

CLC Number:

  • TP391