Computer Science, 2017, Vol. 44, Issue (10): 80-84. doi: 10.11896/j.issn.1002-137X.2017.10.015


Parallel Design and Optimization of Galaxy Group Finding Algorithm on Comparison of SGI and Distributed-memory Cluster

SI Yu-meng, WEI Jian-wen, Simon SEE and James LIN   

  • Online: 2018-12-01  Published: 2018-12-01

Abstract: The halo-based galaxy group finder (HGGF) is an effective algorithm that finds galaxy groups from galaxy coordinates, redshift, mass and other properties, and it greatly helps research on galaxy group formation and evolution. However, the current pure OpenMP implementation is limited by the resources of a single compute node when dealing with large-scale group finding problems. One possible solution is to use resources from multiple nodes to reduce execution time on large problems, which requires redesigning and reimplementing the algorithm. The major hurdle is remote memory access: the algorithm accesses galaxies semi-randomly, which damages performance in a multi-node environment. To tackle this problem, we parallelized the algorithm with an adjacent galaxy list design and implemented it in Unified Parallel C (UPC). Kernel speedups of 2.25, 2.78 and 5.07 times were achieved with 4, 8 and 16 nodes respectively. Meanwhile, the memory requirement on each node was also reduced significantly. Experiments with the OpenMP version on an SGI UV 2000 show that, due to the nature of the program and the features of the NUMA architecture, programs with random memory access behavior like HGGF may not readily benefit from the large number of threads and the shared memory provided by such machines. A two-level parallel design that exploits the locality principle on distributed-memory clusters may be a better solution.
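
To put the reported kernel speedups in context, the short sketch below computes the parallel efficiency (speedup divided by node count) for each configuration. The speedup values come from the abstract; the function and variable names are illustrative only, and the baseline is assumed to be a single node, which the abstract does not state explicitly.

```python
# Kernel speedups reported in the abstract, keyed by node count.
# Assumption: speedups are measured relative to a single-node run.
reported_speedup = {4: 2.25, 8: 2.78, 16: 5.07}


def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Return speedup / nodes; 1.0 would mean ideal linear scaling."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return speedup / nodes


# Efficiency for each node count reported in the abstract.
efficiencies = {
    nodes: parallel_efficiency(speedup, nodes)
    for nodes, speedup in reported_speedup.items()
}

for nodes, eff in efficiencies.items():
    print(f"{nodes:2d} nodes: speedup {reported_speedup[nodes]:.2f}, "
          f"efficiency {eff:.1%}")
```

Under that assumption, efficiency falls from roughly 56% on 4 nodes to about 35% on 8 nodes and 32% on 16 nodes, which is consistent with the abstract's point that remote memory access limits multi-node scaling.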

Key words: High performance computing, Galaxy group finding, Parallel computing, UPC, OpenMP

