Computer Science ›› 2020, Vol. 47 ›› Issue (8): 87-92. doi: 10.11896/jsjkx.191000011
Topic: High Performance Computing
庄园, 郭强, 张洁, 曾云辉
ZHUANG Yuan, GUO Qiang, ZHANG Jie, ZENG Yun-hui
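Abstract: As supercomputers and their programming environments develop, multi-level parallel programming on heterogeneous architectures is becoming the trend, and the domestic Sunway TaihuLight supercomputer is a typical example. Since Sunway TaihuLight went into operation in 2016, many researchers in China and abroad have carried out methodological studies and application validation on it, building up a fairly rich set of many-core programming and optimization methods for the Sunway environment. However, when the Community Earth System Model (CESM) was ported to the Sunway many-core environment, the common many-core optimization methods behaved inconsistently for some two-dimensional data computations in the ocean component model POP: they gave a good speedup at a scale of 1024 processes, but at the large scale of 16800 processes the many-core port became ineffective and produced a negative speedup. To address this problem, this paper proposes a parallel computing method based on slave-core partitions. The 64 slave cores in a core group are divided into several non-overlapping slave-core partitions, and the computing tasks of multiple code segments that can be computed independently are assigned to different partitions. This makes effective use of the computing power of the slave cores and also hides the computation time of the independent code segments by running them concurrently. The number of slave cores in each partition, and their core IDs, can be chosen to suit the tasks to be assigned, so that every slave core receives a suitable amount of data and computation. On top of the slave-core partition method, loop merging and function hoisting are used to increase the parallel granularity of the program, which improves the scalability of two-dimensional data computation at large process counts. In the high-resolution G case of CESM, the simulation speed of the POP component model at a scale of 1.1 million cores increased by 0.8 model years per day, a clear many-core speedup.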
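The slave-core partition idea in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it does not use any Sunway API, the function names `make_partitions`, `dispatch`, and `row_range` are hypothetical, and the partitions are assumed to use consecutive core IDs (the abstract only says that the size and core IDs of each partition can be chosen to fit the task).

```python
from typing import List, Tuple

NUM_SLAVE_CORES = 64  # slave cores per core group, as stated in the abstract


def make_partitions(sizes: List[int]) -> List[List[int]]:
    """Split core IDs 0..63 into consecutive, non-overlapping partitions.

    Assumption: consecutive IDs; the sizes themselves are chosen by the caller.
    """
    if any(s <= 0 for s in sizes):
        raise ValueError("each partition needs at least one core")
    if sum(sizes) > NUM_SLAVE_CORES:
        raise ValueError("partitions exceed the 64 slave cores of a core group")
    partitions, start = [], 0
    for size in sizes:
        partitions.append(list(range(start, start + size)))
        start += size
    return partitions


def dispatch(core_id: int, partitions: List[List[int]]) -> Tuple[int, int]:
    """Return (task index, rank inside the partition), or (-1, -1) if the core is idle."""
    for task, cores in enumerate(partitions):
        if core_id in cores:
            return task, cores.index(core_id)
    return -1, -1


def row_range(rank: int, num_cores: int, num_rows: int) -> Tuple[int, int]:
    """Block-distribute the rows of a 2D array over one partition; returns [lo, hi)."""
    base, extra = divmod(num_rows, num_cores)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


if __name__ == "__main__":
    # Three independent code segments; the first is assumed to be the heaviest.
    parts = make_partitions([32, 16, 16])
    task, rank = dispatch(40, parts)
    print(task, rank, row_range(rank, len(parts[task]), 100))
```

Here each independent code segment gets its own partition, so the three segments run side by side instead of one after another across all 64 cores, and a small two-dimensional array is split over 16 or 32 cores rather than 64, which keeps the per-core data volume from becoming too small.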