Computer Science, 2026, Vol. 53, Issue (6): 163-170. doi: 10.11896/jsjkx.251000116
• High Performance Computing •
SHI Jun, WANG Qinglin, TIAN Feiyang, WANG Zhicheng, LI Runhua, LIU Jie