Computer Science, 2025, Vol. 52, Issue (11A): 241100012-7. doi: 10.11896/jsjkx.241100012
• Computer Software & Architecture •
HAN Lin1,2, WU Ruofeng1, LIU Haohao2, NIE Kai2, LI Haoran2, CHEN Mengyao2