Computer Science, 2019, Vol. 46, Issue 11A: 528-534.

• Interdisciplinary and Applied Topics •

Parallel Design and Optimization of GRAPES_CUACE On-line Coupled Air Quality Model

YE Yue-jin1, CHEN De-xun1, HU Jiang-kai2, MA Xin2, ZHANG Xiao-ye3

  1. Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214083, China
  2. Numerical Weather Prediction Center of CMA, Beijing 100081, China
  3. Chinese Academy of Meteorological Sciences, Beijing 100081, China
  • Online: 2019-11-10 Published: 2019-11-20
  • Corresponding author: HU Jiang-kai, male, master's degree, senior engineer. His main research interest is numerical weather prediction systems. E-mail: hujk@cma.cn.
  • About the author: YE Yue-jin, male, master's student, engineer. His main research interests are parallel algorithms and applications. E-mail: ye_ddr@foxmail.com.
  • Funding:
    This work was supported by the National Key Research and Development Program of China (2016YFC0203300) and the National Major Special Project Fund (2016YFA0602202, 2017YFB0202603).

Abstract: This paper studies and analyzes the parallel optimization of GRAPES_CUACE, the coupled atmospheric chemistry model formed by coupling the numerical weather prediction model GRAPES_MESO (version 4.0) online with the atmospheric chemistry model CUACE, on different generations of the x86 architecture. Drawing on mainstream parallel optimization design methods used in China and abroad, and taking into account the program architecture and parallel framework of GRAPES_MESO itself, the model was parallelized accordingly for each x86 generation. Profiling with the gprof tool and with manually inserted timing probes identified three main hotspot modules: IO, communication, and the physical processes. The IO module was optimized mainly by 1) replacing discrete reads and writes with contiguous ones; 2) allocating buffers so that sparse memory access becomes contiguous memory access; and 3) using asynchronous IO. Communication was optimized in two ways: 1) replacing fine-grained communication with coarse-grained communication; and 2) adopting collective communication with lower time complexity. The results for the IO and communication modules show that the share of run time spent in the IO module fell from 43.7% to 1.41%, and the best-performing part improved by a factor of 317, so the method greatly improves the efficiency of the IO module. In addition, the physical processes were optimized mainly by 1) changing the computation in multi-level loops from discrete to contiguous access; 2) moving communication out of loops; 3) reusing data to reduce redundant computation; and 4) reducing the space used by stack variables. These optimizations increased computational performance by 22%, further improving the parallel efficiency of the program and the strong scalability of the model.

Key words: Asynchronous IO, Coarse-grained communication, Contiguous memory access, Collective communication
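
The first two IO optimizations named in the abstract (contiguous instead of discrete writes, and a buffer that turns scattered access into contiguous access) can be illustrated with a minimal sketch. This is not the authors' code: GRAPES_CUACE is not written in Python, and the names `CountingFile`, `write_discrete`, and `write_buffered` are hypothetical, introduced only for illustration. The sketch counts write requests on an in-memory file as a stand-in for real disk IO and makes no claim about the speedups reported in the paper.

```python
import io
import struct


class CountingFile(io.BytesIO):
    """In-memory file that counts write requests (hypothetical stand-in for disk IO)."""

    def __init__(self):
        super().__init__()
        self.write_calls = 0

    def write(self, data):
        self.write_calls += 1
        return super().write(data)


def write_discrete(f, field):
    """Issue one small write per grid value: many IO requests."""
    for row in field:
        for value in row:
            f.write(struct.pack("<d", value))


def write_buffered(f, field):
    """Pack the whole field into one contiguous buffer, then write it once."""
    flat = [value for row in field for value in row]
    buf = struct.pack(f"<{len(flat)}d", *flat)
    f.write(buf)


# A small 3 x 4 field of doubles, used only as illustrative data.
field = [[float(i * 4 + j) for j in range(4)] for i in range(3)]

discrete_file = CountingFile()
buffered_file = CountingFile()
write_discrete(discrete_file, field)
write_buffered(buffered_file, field)

print(discrete_file.write_calls, buffered_file.write_calls)
```

Both files end up with identical bytes. Only the number of write requests differs, and that count is the quantity such a buffering scheme aims to reduce.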

CLC Number: TP302.7