Computer Science, 2022, Vol. 49, Issue (6): 81-88. doi: 10.11896/jsjkx.210600179

• High Performance Computing •

Architecture Design for Particle Transport Code Acceleration

FU Si-qing, LI Tie-jun, ZHANG Jian-min

  1. School of Computer, National University of Defense Technology, Changsha 410073, China
  • Received: 2021-06-22 Revised: 2022-02-15 Online: 2022-06-15 Published: 2022-06-08
  • Corresponding author: LI Tie-jun (tjli@nudt.edu.cn)
  • About author: FU Si-qing (fusiqingnudt@nudt.edu.cn), born in 1996, postgraduate. His main research interests include computer architecture and high performance computing.
    LI Tie-jun, born in 1977, Ph.D, researcher, Ph.D supervisor, is a member of China Computer Federation. His main research interests include high performance computing.
  • Supported by:
    National Key Research and Development Program of China (2018YFB0204301).

Abstract: Stochastic simulation of particle transport is commonly used to compute characteristic quantities of large numbers of particles in motion. Particle transport problems arise widely in medicine, astrophysics and nuclear physics. The main challenge for current stochastic simulation methods is the gap between the number of simulation samples and the simulated timescales that computers can support, and what researchers need in order to study practical problems. As progress in process scaling has stalled, the development of processor performance has entered a new stage, and integrating ever more complex on-chip structures no longer meets current requirements. Targeting particle transport programs, this paper carries out a series of architecture design work: by analyzing and exploiting the parallelism and memory access characteristics of the program, a streamlined core and a reconfigurable cache are designed to accelerate it. Simulator-based evaluation shows that, compared with a traditional architecture composed of multiple out-of-order cores, the proposed architecture achieves a 4.45x advantage in performance per watt and a 2.78x advantage in performance per area, which lays a foundation for further study of large-scale many-core particle transport accelerators.

Key words: Accelerator, Architecture, Monte Carlo, Particle transport, Pipeline
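
The parallelism mentioned in the abstract comes from the fact that, in a Monte Carlo transport simulation, each particle history can be tracked independently of the others. The following is a minimal illustrative sketch of that idea, not code from the paper: the 1D slab geometry, the function name `simulate_slab`, and the parameters `sigma_t` and `absorb_prob` are all assumptions chosen for brevity.

```python
import math
import random


def simulate_slab(n_particles, thickness, sigma_t, absorb_prob, seed=0):
    """Track independent particle histories through a 1D slab (toy model).

    Hypothetical parameters: sigma_t is the total interaction rate per unit
    length, absorb_prob is the chance that a collision absorbs the particle.
    Returns a tuple (transmitted, reflected, absorbed) of counts.
    """
    rng = random.Random(seed)
    transmitted = reflected = absorbed = 0
    for _ in range(n_particles):
        # Every history starts at the left face, moving to the right.
        x, mu = 0.0, 1.0
        while True:
            # Sample the distance to the next collision (exponential law).
            step = -math.log(1.0 - rng.random()) / sigma_t
            x += mu * step
            if x >= thickness:
                transmitted += 1
                break
            if x < 0.0:
                reflected += 1
                break
            if rng.random() < absorb_prob:
                absorbed += 1
                break
            # Otherwise scatter: pick a new direction cosine in [-1, 1].
            mu = rng.uniform(-1.0, 1.0)
    return transmitted, reflected, absorbed


if __name__ == "__main__":
    print(simulate_slab(10_000, thickness=2.0, sigma_t=1.0, absorb_prob=0.3))
```

No iteration of the outer loop reads state written by another iteration apart from the final tallies, which is why such workloads map naturally onto many simple cores. The data-dependent branching inside the inner loop hints at why control flow and memory access are irregular in real transport codes.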

CLC Number:

  • TP302