Survey of Heterogeneous Hybrid Parallel Computing

doi:10.11896/jsjkx.200600045

Abstract

With the rapid growth in the computing power demanded by applications such as artificial intelligence and big data, and with the diversification of application scenarios, heterogeneous hybrid parallel computing has become a research focus. This paper introduces the main current heterogeneous computer architectures, including CPU/coprocessor, CPU/many-core processor, CPU/ASIC and CPU/FPGA heterogeneous architectures. It briefly describes how heterogeneous hybrid parallel programming models have changed as these architectures developed: by transforming and re-implementing an existing language, by extending an existing heterogeneous programming language, by heterogeneous programming with directive statements, or by container-pattern collaborative programming. The analysis shows that heterogeneous hybrid parallel computing architectures will further strengthen support for AI and will also improve the versatility of software. This paper reviews the key technologies in heterogeneous hybrid parallel computing, including parallel task partitioning, task mapping, data communication, data access between heterogeneous processors, parallel synchronization of heterogeneous collaboration, and pipeline parallelism of heterogeneous resources. Based on these key technologies, it points out the challenges faced by heterogeneous hybrid parallel computing, such as programming difficulty, poor portability, large data communication overhead, complex data access, complex parallel control, and uneven resource load. After analyzing these challenges, this paper concludes that breakthroughs in the current key core technologies are needed in the integration of general-purpose and AI-specific heterogeneous computing, seamless migration across heterogeneous architectures, a unified programming model, the integration of storage and computing, and intelligent task division and allocation.

Key words: Heterogeneous architecture, Heterogeneous computing, Heterogeneous hybrid programming, Heterogeneous parallel programming, Parallel computing

CLC Number:

TP301

YANG Wang-dong, WANG Hao-tian, ZHANG Yu-feng, LIN Sheng-le, CAI Qin-yun. Survey of Heterogeneous Hybrid Parallel Computing [J]. Computer Science, 2020, 47(8): 5-16.


YANG Wang-dong, Ph.D., professor. His main research interests include high performance computing and parallel computing.


