Computer Science ›› 2022, Vol. 49 ›› Issue (10): 27-35.doi: 10.11896/jsjkx.220100125
• High Perfonnance Computing • Previous Articles Next Articles
CHEN Lei, TANG Tao, QI Hai-jun, JIANG Hao, HE Kang
CLC Number:
[1]TODD R,GENNADY F.Predicts 2015:Introduction to Conditional Numerical Reproducibility(CNR)[OL].(2012-09-05) [2019-11-03].https://software.intel.com/content/www/us/en/develop/articles/introduction-to-the-conditional-numerical-reproducibility-cnr.html. [2]NVIDIA.NVIDIARcuBLAS[OL].[2021-08-02].https://docs.nvida.com/cuda/cublas/index.html. [3]PETER A,HONG D N,JAMES D.Efficient ReproducibleFloating Point Summation and BLAS [R].Berkeley:Electrical Engineering and Computer Sciences,2016. [4]PETER A.Reproducible Parallel Matrix-Vector Multiply [R].Los Alamos:CS 267 Final Report,2015. [5]JAMES D,HONG D N.Fast reproducible floating-point summation [C]//21st IEEE Symposium on Computer Arithmetic.2013:163-172. [6]JAMES D,HONG D N.Parallel Reproducible Summation [J].IEEE Transactions on Computer,2015,64(7):2060-2070. [7]ROMAN L,SYLVAIN C,DAVID D,et al.ExBLAS:Reproducible and Accurate BLAS library[R].Austin,TX:NRE2015 at SC15,2015. [8]ROMAN I,SYLVAIN C,DAVID D,et al.Reproducible triangular solvers for high-performance computing[C]//17th International Conference on Information Technology-New Generations(ITNG 2020).Las Vegas:Springer,2020:353-358. [9]SYLVAIN C,DAVID D,STEF G,et al.Numerical Reprodu-cibility for the Parallel Reduction on Multi-and Many-Core Architectures[J].Parallel Computing,2015,49(4):83-97. [10]KATSUHISA O,TAKESHI O,SIEGFRIED M P,et al.Error-free transformations of matrix multiplication by using fast routines of matrix multiplication and its applications [J].Numerical Algorithms,2012,59(1):95-118. [11]RUMP S,TAKESHI O,SHIN'ICHI O.Accurate Floating-Point Summation Part II:Sign,K-Fold Faithful and Rounding to Nearest [J].SIAM Journal on Scientific Computing,2008,31(1):1269-1302. [12]DAICHI M,TAKESHI O,KATSUHISA O.Accurate and Reproducible BLAS Routines with Ozaki Scheme for Many-core Architectures[C]//Parallel Processing and Applied Mathema-tics.Lublin:Springer,2019:516-527. [13]ANSI/IEEE standard 754-1985,IEEE Standard for BinaryFloating-point Arithmetic[S].New York:Institute of Electrical and Electronics Engineers,Inc,1985. [14]KNUTH D E.The Art of Computer Programming:Seminumerical Algorithms(3rd edition) [M].USA:Addison-Wesly,Rea-dingMA,1998. [15]KAHAN W.Pracniques:further remarks on reducing truncation errors [J].Communications of the ACM,1965,8(1):40-42. [16]DEKKER T J.A floating-point technique for extending the avai-lable precision [J].Numerische Mathematik 1971,18(3):224-242. [17]TAKESHI O,RUMP S,SHIN'ICHI O.Accurate sum and dot product [J].SIAM Journal on Scientific Computing,2005,26(6):1955-1988. [18]ZHANG X,WANG Q,ZHANG Y Q.Model-driven level 3BLAS Performance Optimization on Loongson 3A Processor[C]//2012 IEEE 18th International Conference on Parallel and Distributed Systems(ICPADS).Singapore:IEEE Computer Society,2012:684-691. [19]DAVID R B.Programming with POSIX Threads [M].United States:Addision-Wesley Professional,1997:33-36. [20]ZHU Y K,HAYES W B.Correct Rounding and hybrid approach to exact floating-point summation [J].SIAM Journal on Scientific Computing,2009,31(4):2981-3001. |
[1] | HU Yan-yu, ZHAO Long, DONG Xiang-jun. Two-stage Deep Feature Selection Extraction Algorithm for Cancer Classification [J]. Computer Science, 2022, 49(7): 73-78. |
[2] | WU Gong-xing, Sun Zhao-yang, JU Chun-hua. Closed-loop Supply Chain Network Design Model Considering Interruption Risk and Fuzzy Pricing [J]. Computer Science, 2022, 49(7): 220-225. |
[3] | FU Si-qing, LI Tie-jun, ZHANG Jian-min. Architecture Design for Particle Transport Code Acceleration [J]. Computer Science, 2022, 49(6): 81-88. |
[4] | Jeffrey ZHENG. Meta Knowledge Intelligent Systems on Resolving Logic Paradoxes [J]. Computer Science, 2022, 49(1): 9-16. |
[5] | LIU Wei, RUAN Min-jie, SHE Wei, ZHANG Zhi-hong, TIAN Zhao. PBFT Optimized Consensus Algorithm for Internet of Things [J]. Computer Science, 2021, 48(11): 151-158. |
[6] | CUI Guo-nan, WANG Li-song, KANG Jie-xiang, GAO Zhong-jie, WANG Hui, YIN Wei. Fuzzy Clustering Validity Index Combined with Multi-objective Optimization Algorithm and Its Application [J]. Computer Science, 2021, 48(10): 197-203. |
[7] | FU Tian-hao, TIAN Hong-yun, JIN Yu-yang, YANG Zhang, ZHAI Ji-dong, WU Lin-ping, XU Xiao-wen. Performance Skeleton Analysis Method Towards Component-based Parallel Applications [J]. Computer Science, 2021, 48(6): 1-9. |
[8] | GUO Biao, TANG Qi, WEN Zhi-min, FU Juan, WANG Ling, WEI Ji-bo. List-based Software and Hardware Partitioning Algorithm for Dynamic Partial Reconfigurable System-on-Chip [J]. Computer Science, 2021, 48(6): 19-25. |
[9] | YU Jian-ye, QI Yong, WANG Bao-zhuo. Distributed Combination Deep Learning Intrusion Detection Method for Internet of Vehicles Based on Spark [J]. Computer Science, 2021, 48(6A): 518-523. |
[10] | ZHANG Hang, TANG Dan, CAI Hong-liang. Study on Predictive Erasure Codes in Distributed Storage System [J]. Computer Science, 2021, 48(5): 130-139. |
[11] | E Hai-hong, ZHANG Tian-yu, SONG Mei-na. Web-based Data Visualization Chart Rendering Optimization Method [J]. Computer Science, 2021, 48(3): 119-123. |
[12] | WANG Yan, HAN Xiao, ZENG Hui, LIU Jing-xin, XIA Chang-qing. Task Migration Node Selection with Reliable Service Quality in Edge Computing Environment [J]. Computer Science, 2020, 47(10): 240-246. |
[13] | WANG Zhe, TANG Qi, WANG Ling, WEI Ji-bo. Joint Optimization Algorithm for Partition-Scheduling of Dynamic Partial Reconfigurable Systems Based on Simulated Annealing [J]. Computer Science, 2020, 47(8): 26-31. |
[14] | WANG Guo-peng, YANG Jian-xin, YIN Fei, JIANG Sheng-jian. Computing Resources Allocation with Load Balance in Modern Processor [J]. Computer Science, 2020, 47(8): 41-48. |
[15] | ZHUANG Yi, YANG Jia-hai. Multi-source Tree-based Scheduling Algorithm for Deadline-aware P2MP Inter-datacenter Transfers [J]. Computer Science, 2020, 47(7): 213-219. |
|