Computer Science, 2021, Vol. 48, Issue (6): 1-9. doi: 10.11896/jsjkx.201200115

• Computer Architecture •

Performance Skeleton Analysis Method Towards Component-based Parallel Applications

FU Tian-hao1,3, TIAN Hong-yun1, JIN Yu-yang2, YANG Zhang1, ZHAI Ji-dong2, WU Lin-ping1, XU Xiao-wen1   

  1 Institute of Applied Physics and Computational Mathematics, Beijing 100094, China
  2 Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  3 Graduate School of China Academy of Engineering Physics (CAEP), Beijing 100088, China
  • Received: 2020-12-12  Revised: 2021-03-23  Online: 2021-06-15  Published: 2021-06-03
  • About author: FU Tian-hao, born in 1996, master candidate, is a student member of China Computer Federation. His main research interests include high performance computing and performance analysis for parallel applications. (futianhao18@gscaep.ac.cn)
    XU Xiao-wen, born in 1978, Ph.D., professor, is a member of China Computer Federation, SIAM and CSIAM. His research interests include high performance numerical algorithms and software in scientific and engineering fields, and parallel programming frameworks for large-scale numerical simulations.
  • Supported by:
    National Key R&D Program of China (2017YFB0202103) and Science Challenge Project (TZ2019002).

Abstract: Performance skeleton analysis technology (PSTAT) describes the program structure of a parallel application and thereby supplies the input parameters for its performance model. It is the basis of performance analysis and performance optimization for large-scale parallel applications. This paper targets component-based parallel applications in the field of numerical simulation. Building on combined static and dynamic structure analysis of general program binary files, it proposes and implements a method that automatically generates performance skeletons based on a “component-loop-call” tree. On this foundation, a performance skeleton analysis toolkit, CLCT-STAT (Component-Loop-Call-Tree SkeleTon Analysis Toolkit), is developed. The method automatically identifies the function symbols of component class members in component-based applications and generates the performance skeleton of the parallel application with the component as the smallest unit. Compared with manually deriving a performance skeleton through analytical modeling, the proposed method provides more program structure information and saves the cost of manual analysis.
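One way to picture a “component-loop-call” tree is as a tree whose nodes are tagged as components, loops, or calls, and a skeleton "with the component as the smallest unit" as that tree cut off at its component nodes. The sketch below is purely illustrative and is not CLCT-STAT's data model: the `Node` class, the `prune_to_components` and `render` helpers, the node kinds, and every node name in the example tree are assumptions made up for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str  # hypothetical labels: "component", "loop", or "call"
    name: str
    children: List["Node"] = field(default_factory=list)


def prune_to_components(node: Node) -> Node:
    """Copy the tree, dropping everything below each component node."""
    if node.kind == "component":
        return Node(node.kind, node.name)
    return Node(node.kind, node.name,
                [prune_to_components(c) for c in node.children])


def render(node: Node, depth: int = 0) -> List[str]:
    """Return one indented 'kind: name' line per node, in pre-order."""
    lines = ["  " * depth + f"{node.kind}: {node.name}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical application structure; all names are invented.
tree = Node("call", "main", [
    Node("loop", "time_step_loop", [
        Node("component", "TimeStepComponent", [
            Node("call", "MPI_Allreduce"),
        ]),
        Node("component", "AdvanceComponent", [
            Node("loop", "patch_loop", [Node("call", "update_cells")]),
        ]),
    ]),
])

skeleton = render(prune_to_components(tree))
print("\n".join(skeleton))
```

In this toy example the loops and calls that enclose the components are kept, while the internals of each component are hidden, which is one reading of a skeleton whose smallest unit is the component.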

Key words: CLCT-STAT, Component-loop-call tree, Parallel computing component, Performance skeleton

CLC Number: TP302