Computer Science, 2022, Vol. 49, Issue (6): 81-88. doi: 10.11896/jsjkx.210600179

• High Performance Computing •

Architecture Design for Particle Transport Code Acceleration

FU Si-qing, LI Tie-jun, ZHANG Jian-min   

  1. School of Computer, National University of Defense Technology, Changsha 410073, China
  • Received: 2021-06-22 Revised: 2022-02-15 Online: 2022-06-15 Published: 2022-06-08
  • About author: FU Si-qing, born in 1996, postgraduate. His main research interests include computer architecture and high performance computing.
    LI Tie-jun, born in 1977, Ph.D, researcher, Ph.D supervisor, is a member of China Computer Federation. His main research interests include high performance computing, among other topics.
  • Supported by:
    National Key Research and Development Program of China(2018YFB0204301).

Abstract: Stochastic simulation of particle transport is commonly used to compute characteristic quantities of large populations of moving particles. Particle transport problems arise widely in medicine, astrophysics and nuclear physics. The main challenge for current stochastic simulation methods is the gap between the number of simulation samples and the simulation timescales that computers can support, and what researchers need in order to study practical problems. As process scaling has stagnated, the development of processor performance has entered a new stage, and integrating ever more complex on-chip structures no longer meets current requirements. This paper carries out a series of architecture designs targeted at particle transport programs. By analyzing and exploiting the parallelism and memory access characteristics of the program, a simplified core and a reconfigurable cache are designed to accelerate it. Experiments show that, compared with a traditional architecture composed of multiple out-of-order cores, the proposed architecture achieves more than 4.5x the performance per watt and 2.78x the performance per area, which lays a foundation for further study of large-scale many-core particle transport accelerators.
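For readers unfamiliar with the workload, the following is a minimal sketch of a Monte Carlo particle transport loop. It is not the code studied in the paper. The 1D slab geometry, the function names `simulate_history` and `transmission_estimate`, and all parameter values are assumptions chosen for illustration. It shows the two properties the abstract refers to: each particle history is independent of the others, so histories can be processed in parallel, and the control flow inside a history depends on random events.

```python
import math
import random


def simulate_history(rng, sigma_total, absorb_prob, thickness):
    """Track one particle through a 1D slab (hypothetical toy model).

    Returns True if the particle leaves through the far side.
    """
    x, mu = 0.0, 1.0  # position and direction cosine; starts moving forward
    while True:
        # Sample the distance to the next collision from an exponential law.
        dist = -math.log(1.0 - rng.random()) / sigma_total
        x += mu * dist
        if x >= thickness:
            return True   # transmitted
        if x < 0.0:
            return False  # escaped back through the entry face
        if rng.random() < absorb_prob:
            return False  # absorbed at the collision site
        mu = rng.uniform(-1.0, 1.0)  # isotropic scattering in 1D


def transmission_estimate(n_particles, sigma_total, absorb_prob, thickness, seed=0):
    """Estimate the transmitted fraction by averaging independent histories."""
    rng = random.Random(seed)
    hits = sum(
        simulate_history(rng, sigma_total, absorb_prob, thickness)
        for _ in range(n_particles)
    )
    return hits / n_particles


if __name__ == "__main__":
    # Pure absorber: the analytic answer is exp(-sigma_total * thickness).
    est = transmission_estimate(20000, 1.0, 1.0, 1.0, seed=1)
    print(f"estimate={est:.4f} analytic={math.exp(-1.0):.4f}")
```

The pure-absorber case is used as a check only because it has a closed-form answer. Production codes track full 3D geometry and energy-dependent cross sections, which is where the memory access behavior mentioned in the abstract becomes significant.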

Key words: Accelerator, Architecture, Monte Carlo, Particle transport, Pipeline

CLC Number: 

  • TP302