Computer Science ›› 2024, Vol. 51 ›› Issue (12): 110-119.doi: 10.11896/jsjkx.230800106
• High Performance Computing • Previous Articles Next Articles
HE Haotian, ZHOU Bei, GUO Shaozhong, ZHANG Zuoyan, HAO Jiangwei, XU Jinchen
CLC Number:
[1]FAWZI A,BALOG M,HUANG A.et al.Discovering faster matrix multiplication algorithms with reinforcement learning[J].Nature,2022,610:47-53. [2]MICIKEVICIUS P,MARANG S,ALBEN J,et al.Mixed precision training for deep neural networks[J].arXiv:1710.03740,2017. [3]HUANG X,ZHANG J,CHEN T,et al.Mixed-Precision Trai-ning for NLP and Speech Recognition with OpenSeq2Seq[J].ar-Xiv:2010.12895,2018. [4]VERDOOLAEGE S,CARLOS JUEGA J,COHEN A,et al.Polyhedral Parallel Code Generation for CUDA[J].ACM Transactions on Architecture & Code Optimization,2013,9(4):1-23. [5]DUGHMI S,XU H F.Algorithmic Bayesian persuasion[C]//Proceedings of the forty-eighth annual ACM symposium on Theory of Computing(STOC’16).Association for Computing Machinery,New York,NY,USA,2016:412-425. [6]DEMMEL J.Applied Numerical Linear Algebra[M].Tsinghua University Press:Siam,2011. [7]WANG B Y,PANG J M,XU J L,et al.Matrix multiplication vector code generation based on polyhedral model[J].Computer Science,2022,49(10):44-51. [8]KOTIPALLI P V,SINGH R,WOOD P,et al.AMPT-GA:Automatic mixed precision floating point tuning for GPU applications[C]//Proceedings of the ACM International Conference on Supercomputing(ICS’19).Phoenix Arizona:ACM,2019:160-170. [9]SOLOVYEV A,JACOBSEN C,RAKAMARIĆZ,et al.Rigorousestimation of floating-point round-off errors with symbolic taylor expansions[M].Cham:Springer International Publishing,2015:532-550. [10]DARULOVA E,HORN E,SHARMA S.Sound mixed-precision optimization with rewriting[C]//2018 ACM/IEEE 9th International Conference on Cyber-Physical Systems(ICCPS).IEEE,2018:208-219. [11]DARULOVA E,KUNCAK V.Towards a compiler for reals[J].ACM Transactions on Programming Languages and Systems(TOPLAS),2017,39(2):1-28. [12]CHERUBIN S,CATTANEO D,CHIARI M,et al.TAFFO:Tuning assistant for floating to fixed point optimization[J].IEEE Embedded Systems Letters,2019,12(1):5-8. [13]RUBIO-GONZÁLEZ C,NGUYEN C,NGUYEN H D,et al.Precimonious:Tuning assistant for floating-point precision[C]//Proceedings of the International Conference on High Performance Computing,Networking,Storage and Analysis.IEEE,2013:1-12. [14]RUBIO-GONZÁLEZ C,NGUYEN C,MEHNE B,et al.Floa-ting-point precision tuning using blame analysis[C]//2016 IEEE/ACM 38th International Conference on Software Engineering(ICSE).IEEE,2016:1074-1085. [15]MENON H,LAM M O,OSEI-KUFFUOR D,et al.ADAPT:Algorithmic differentiation applied to floatingpoint precision tuning[C]//International Conference for High Performance Computing,Networking,Storage and Analysis.IEEE,2018:614-626. [16]HO N M,MANOGARAN E,WONG W F,et al.Efficient floa-ting point precision tuning for approximate computing[C]//2017 22nd Asia and South Pacific Design Automation Conference(ASP-DAC).IEEE,2017:63-68. [17]LAGUNA I,WOOD P C,SINGH R,et al.Gpumixer:Perfor-mance-driven floating-point tuning for gpu scientific applications[C]//International Conference on High Performance Computing.Cham:Springer,2019:227-246. [18]FEAUTRIER P,LENGAUER C.Polyhedron model[C]//Proceedings of the Encyclopedia of Parallel Computing.2011:1581-1592. [19]ZHAO J,LI Y Y,ZHAO R C.Compilation “black magic” based on polyhedral model[J].Journal of Software,2018,29(8):2371-2396. [20]BONDHUGULA U,HARTONO A,RAMANUJAM J,et al.A practical automatic polyhedral parallelizer and locality optimizer[C]//Proceedings of the 29th ACM SIGPLAN ConfErence on Programming Language Design and Implementation(PLDI).ACM Press,2008:101-113. [21]TRIFUNOVIĆ K,COHEN A,EDELSOHN D,et al.Graphite two years a fter:First lessons learned from real-world polyhe-dral compilation[C]//Proceedings of the 2nd GCC Research Opportunities Workshop(GROW).2010. [22]GROSSER T,GROESSLINGER A,LENGAUER C.Polly-Performing polyhedral optimizations on a low-level intermediate representation[J].Parallel Processing Letters,2012,22(4):1250010. [23]KELLY W,PUGH W.A unifying framework for iteration reordering transformations[C]//Proceedings of the IEEE 1st International Conference on Algorithms and Architectures for Parallel Processing(ICAPP’95).1995. [24]GIRBAL S,VASILACHE N,BASTOUL C,et al.Semi-Auto-matic Composition of Loop Transformations for Deep Paralle-lism and Memory Hierarchies[J].International Journal of Parallel Programming,2006,34(3):261-317. [25]VERDOOLAEGE S.Counting affine calculator and applications[C]//Proceedings of the 1st International Workshop on Polyhedral Compilation Techniques(IMPACT’11).Charmonix,2011. [26]GUO H,RUBIO-GONZÁLEZ C.Exploiting community struc-ture for floating-point precision tuning[C]//Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis.Amsterdam Netherlands:ACM,2018:333-343. [27]VERDOOLAEGE S,GROSSER T.Polyhedral extraction tool[C]//Proceedings of the 2nd Int’l Workshop on Polyhedra l Compilation Techniques(IMPACT).2012. [28]VERDOOLAEGE S.ISL:An integer set library for the polyhe-dral model[C]//Proceedings of the ICMS 2010.Berlin,Heidelberg:Springer-Verlag,2010:299-302. [29]SONG G H,GUO S Z,ZHAO J,et al.Automatic hybrid accuracy optimization for Stencil computing[J].Journal of Software,2023,34(12):5704-5723. |
[1] | GUO Shuaizhe, GAO Jianhua, JI Weixing. Optimizing Distributed GMRES Algorithm with Mixed Precision [J]. Computer Science, 2024, 51(9): 15-22. |
[2] | LING Shixiang, YANG Zhibin, ZHOU Yong. Integrated Avionics Software Code Automatic Generation Method for ARINC653 Operating System [J]. Computer Science, 2024, 51(7): 10-21. |
[3] | LEI Chao, LIU Jiang, SONG Jiawen. Time Cost Model and Optimal Configuration Method for GPU Parallel Computation of Matrix Multiplication [J]. Computer Science, 2024, 51(6A): 230300200-8. |
[4] | XU Yiran, ZHOU Yu. Prompt Learning Based Parameter-efficient Code Generation [J]. Computer Science, 2024, 51(6): 61-67. |
[5] | LIAO Qihua, NIE Kai, HAN Lin, CHEN Mengyao, XIE Wenbing. Tile Selection Algorithm Based on Data Locality [J]. Computer Science, 2024, 51(12): 100-109. |
[6] | HE Haotian, ZHOU Bei, GUO Shaozhong, ZHANG Zuoyan, HAO Jiangwei, JI Liguang, XU Jinchen. Automatic Mixing Precision Optimization for Matrix Multiplication Calculation [J]. Computer Science, 2024, 51(11A): 240300057-10. |
[7] | PEI Xue, WEI Shuai, SHAO Yangxue, YU Hong, GE Chenyang. Compilation Optimization and Implementation of High-order Cryptographic Operators on FPGA [J]. Computer Science, 2024, 51(11A): 231200184-11. |
[8] | MO Shangfeng, ZHOU Zhenfen, HU Yonghua, XU Minmin, MAO Chunxian, YUAN Yudi. Transplantation and Optimization of Row-vector-matrix Multiplication in Complex Domain Based on FT-M7002 [J]. Computer Science, 2023, 50(11A): 220900277-6. |
[9] | ZHU Jian, HU Kai, WANG Jun, LI Jie, YE Yafei, SHI Xiyan. Reliable Smart Contract Automatic Generation Based on Event-B [J]. Computer Science, 2023, 50(10): 343-349. |
[10] | GAO Xiu-wu, HUANG Liang-ming, JIANG Jun. Optimization Method of Streaming Storage Based on GCC Compiler [J]. Computer Science, 2022, 49(11): 76-82. |
[11] | WANG Bo-yang, PANG Jian-min, XU Jin-long, ZHAO Jie, TAO Xiao-han, ZHU Yu. Matrix Multiplication Vector Code Generation Based on Polyhedron Model [J]. Computer Science, 2022, 49(10): 44-51. |
[12] | CHEN Tao, SHU Hui, XIONG Xiao-bing. Study of Universal Shellcode Generation Technology [J]. Computer Science, 2021, 48(4): 288-294. |
[13] | HU Rong, YANG Wang-dong, WANG Hao-tian, LUO Hui-zhang, LI Ken-li. Parallel WMD Algorithm Based on GPU Acceleration [J]. Computer Science, 2021, 48(12): 24-28. |
[14] | HU Wei-fang, CHEN Yun, LI Ying-ying, SHANG Jian-dong. Loop Fusion Strategy Based on Data Reuse Analysis in Polyhedral Compilation [J]. Computer Science, 2021, 48(12): 49-58. |
[15] | HAN Xiao-dong, GAO Fei, ZHANG Li-wei. Novel Real-time Algorithm for Critical Path of Linear Network Coding [J]. Computer Science, 2020, 47(9): 232-237. |
|