Computer Science ›› 2023, Vol. 50 ›› Issue (4): 397-403.doi: 10.11896/jsjkx.220100232
• Interdiscipline & Frontier • Previous Articles
HUANG Rongfeng1,2, LIU Shifang1,2, ZHAO Yonghua1
CLC Number:
[1]ABDELFATTAH A,BABOULIN M,DOBREV V,et al.High-performance Tensor Contractions for GPUs[J].Procedia Computer Science,2016,80(1):108-118. [2]MOLERO J M,GARZÓN E M,GARCÍA I,et al.Efficient Implementation of Hyperspectral Anomaly Detection Techniques on GPUs and Multicore Processors[J].IEEE Journal of Selec-ted Topics in Applied Earth Observations and Remote Sensing,2014,7(6):2256-2266. [3]VILLA O,GAWANDE N,TUMEO A.Accelerating Subsurface Transport Simulationon Heterogeneous Clusters[C]//International Conference on Cluster Computing.New York:IEEE Press,2013:1-8. [4]ZHANG T,LIU X,WANG X,et al.cuTensor-Tubal:Efficient Primitives for Tubal-Rank Tensor Learning Operations on GPUs[J].IEEE Transactions on Parallel and Distributed Systems,2020,31(3):595-610. [5]DONG T,HAIDAR A,TOMOV S,et al.A Fast Batched Cho-lesky Factorization on a GPU[C]//International Conference on Parallel Processing.New York:IEEE Press,2014:432-440. [6]ABDELFATTAH A,HAIDAR A,TOMOV S,et al.Factorization and Inversion of a Million Matrices using GPUs:Challenges and Countermeasures[C]//International Conference on Computational Science.Berlin:Springer,2017:606-615. [7]ABDELFATTAH A,COSTA T,DONGARRA J,et al.A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines[J].ACM Transactions on Mathematical Software,2021,47(7):1-23. [8]FRANCIS J G F.The QR Transformation:A Unitary Analogue to The LR Transformation—Part 1[J].The Computer Journal,1961,4(3):265-271. [9]FRANCIS J G F.The QR Transformation—Part 2[J].TheComputer Journal,1964,4(4):332-345. [10]CUPPEN J.A Divide and Conquer Method for The Symmetric Eigenproblem[J].Numerical Mathematics,1980,36(2):177-195. [11]GU M,EISENSTAT S C.A Divide-and-Conquer Algorithm for the Symmetric Tridiagonal Eigenproblem[J].SIAM Journal on Matrix Analysis and Applications,1995,16(1):172-191. [12]BIENTINESI P,DHILLON I S,GEIJN R A V D.A Parallel Eigensolver for Dense Symmetric Matrices Based on Multiple Relatively Robust Representations[J].SIAM Journal on Scientific Computing,2005,27(1):43-66. [13]WILLEMS P R,LANG B.A Framework for The MR3 Algo-rithm:Theory and Implementation[J].SIAM Journal on Scientific Computing,2013,35(2):A740-A766. [14]WANG T,GUO L,LI G,et al.Implementing the Jacobi Algorithm for Solving Eigenvalues of Symmetric Matrices with CUDA[C]//International Conference on Networking,Architecture,and Storage.New York:IEEE Press,2012:69-78. [15]TORUN M U,YILMA O,AKANSU A N.FPGA,GPU,andCPU Implementations of Jacobi Algorithm for Eigenanalysis[J].Journal of Parallel & Distributed Computing,2016,96(12):172-180. [16]YU W Z,MOUSSA J,KUS P,et al.GPU-Acceleration of The ELPA2 Distributed Eigensolver for Dense Symmetric and Hermitian Eigenproblems[J].Computer Physics Communications,2021,262(5):107808. [17]WILLIAMS S,WATERMAN A,PATTERSON D.Roofline:AnInsightful Visual Performance Model for Multicore Architectures[J].Communication of the ACM,2009,52(4):65-76. [18]BRENT R P,LUK F T.The Solution of Singular Value and Symmetric Eigenvalue Problems on Multiprocessor Arrays[J].SIAM Journal on Scientific and Statistical Computing,1985,6(1):69-84. [19]RIVERA C,CHEN J,XIONG N,et al.TSM2X:High-Perfor-mance Tall-and-Skinny Matrix-Matrix Multiplication on GPUs[J].Journal of Parallel and Distributed Computing,2021,151(3):70-85. [20]ABDELFATTAH A,HAIDAR A,TOMOV S,et al.Perfor-mance,Design,and Autotuning of Batched GEMM For GPUs[C]//International Conference on High Performance Computing.Berlin:Springer,2016:21-38. [21]ABDELFATTAH A,HAIDAR A,TOMOV S,et al.Perfor-mance tuning and optimization techniques of fixed and variable size batched Cholesky factorization on GPUs[J].Procedia Computer Science,2016,80(C):119-130. |
[1] | LIU Lin-yun, CHEN Kai-yan, LI Xiong-wei, ZHANG Yang, XIE Fang-fang. Overview of Side Channel Analysis Based on Convolutional Neural Network [J]. Computer Science, 2022, 49(5): 296-302. |
[2] | WANG Yi-chao, LIAO Qiu-cheng, ZUO Si-cheng, XIE Rui, LIN Xin-hua. Performance Evaluation of ARM-ISA SoC for High Performance Computing [J]. Computer Science, 2019, 46(8): 95-99. |
[3] | ZHANG Biao, DONG Meng-yu, FAN Bei-bei. Enterprise Performance Evaluation Model Based on Triangular Fuzzy Multi-attribute Decision Making [J]. Computer Science, 2019, 46(6A): 547-549. |
[4] | LI Meng-xiao, YAO Shi-yuan. Design and Improvement of Face Recognition System Based on PCA [J]. Computer Science, 2019, 46(6A): 577-579. |
[5] | DING Wei-long, XUE Li-li, CHEN Wan-jun, WU Fu-li. Visualization Method for University Teachers’ Performance Data Based on Mixed Layout Strategy [J]. Computer Science, 2019, 46(2): 24-29. |
[6] | DONG Si-qi, LI Hai-long, QU Yu-ben, ZHANG Zhao, HU Lei. Survey of Research on Computation Unloading Strategy in Mobile Edge Computing [J]. Computer Science, 2019, 46(11): 32-40. |
[7] |
LUO Shu-yan, ZHU Yi-an, ZENG Cheng.
Performance Evaluation and Optimization of Inter-cores Communication for Heterogeneous Multi-core Processor Unit [J]. Computer Science, 2018, 45(6A): 262-265. |
[8] | ZHANG Shao-nan, QIU Ke-ni, ZHANG Wei-gong, WANG Jing, ZHENG Jia-xin, BAI Rui-ying and ZHU Xiao-yan. Queuing Theory-guided Performance Evaluation on Reconfigurable High-speed Device Connected Bus [J]. Computer Science, 2017, 44(Z6): 504-509. |
[9] | LI Hong-jun, CUI Xi-ning, MU Ming and HAN Wei. Research on Distributed Embedded Computer Performance Evaluation Model [J]. Computer Science, 2017, 44(4): 153-156. |
[10] | LIN Xin-hua, QIN Qiang, LI Shuo, WEN Min-hua and MATSUOKA Satoshi. Evaluating Intel AVX2 Vgather Instructions with Stencils [J]. Computer Science, 2017, 44(1): 20-24. |
[11] | WANG Wei, WANG Jia-jun, WANG Ming-ming, ZHANG Wen-jing and CHEN Jin-guang. Defense Technology Based on Dynamic Space-Time Performance for Flooding Attacks in Mobile Ad Hoc Networks [J]. Computer Science, 2017, 44(1): 159-166. |
[12] | SHEN Hua, HE Yan-xiang and ZHANG Ming-wu. Performance Bottlenecks Location Scheme Based on Structural Features of Service Composition Model [J]. Computer Science, 2015, 42(9): 107-117. |
[13] | ZHANG Cheng-wei, CHENG Wen-qing and HEI Xiao-jun. Measurement Study of 3G Mobile Networks Using Android Platform [J]. Computer Science, 2015, 42(2): 24-28. |
[14] | LI Zhi-jia, HU Xiang, JIAO Li and WANG Wei-feng. Performance Evaluation of Job Scheduling and InfiniBand Network Interconnection in High Performance Computing System Based on Stochastic Petri Nets [J]. Computer Science, 2015, 42(1): 33-37. |
[15] | ZHOU Xiao-peng,ZHANG Xiao-fang and ZHAO Xiao-nan. Research of Performance Evaluation of Cloud Storage [J]. Computer Science, 2014, 41(4): 190-194. |
|