Computer Science ›› 2026, Vol. 53 ›› Issue (6): 145-152. doi: 10.11896/jsjkx.251000117

• High Performance Computing •

Research on Fortran Compiler Implementation Technology on CPU-DSP Heterogeneous Processor

ZHU Pengzhi, HUANG Chun, SHEN Jie, CHEN Cheng, XU Haoran, LONG Biao   

  1. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
  • Received: 2025-10-27 Revised: 2026-01-13 Online: 2026-06-15 Published: 2026-06-09
  • About author: ZHU Pengzhi, born in 2003. His main research interests include compilers and processor design verification.
    SHEN Jie, born in 1987, Ph.D., associate professor, master's supervisor. Her main research interests include parallel programming, heterogeneous computing, and high-performance mathematical libraries.
  • Supported by:
    National Key Research and Development Program of China (2023YFB3001601).

Abstract: Fortran is widely used in scientific and engineering computing. However, general-purpose digital signal processors (GPDSPs) are currently programmed primarily in C or assembly language and lack Fortran support. To address this gap, this paper investigates the implementation of Fortran compilers targeting CPU-DSP heterogeneous processors. Based on the LLVM Flang compiler framework, it designs and implements a Fortran compiler prototype named mtFortran, completing the migration of the Flang frontend while focusing on the compilation and runtime support challenges that Fortran programs face on heterogeneous architectures. The key issues addressed include program loading and execution, syntax compatibility, built-in function implementation, and adaptation of the input/output (I/O) system. Experimental results show that the compiler supports the syntactic features exercised by the GUIDE F90 test suite in the commercial U_F90_TS_LITE benchmark collection. Of the 176 test programs evaluated, all 102 programs whose required runtime library support is available pass heterogeneous compilation and execution validation. The implementation rate for built-in functions across the major categories reaches 79.38%, which is sufficient to support typical high-performance computing applications (e.g., the NPB-EP benchmark program). This work establishes foundational runtime capabilities for Fortran programs on CPU-DSP heterogeneous processor architectures and lays the groundwork for subsequent improvements in standard compliance, performance optimization, and parallelization extensions.

Key words: CPU-DSP heterogeneous processors, Fortran compiler, Fortran runtime, LLVM, Flang

CLC Number: 

  • TP391