Computer Science ›› 2017, Vol. 44 ›› Issue (10): 75-79, 90. doi: 10.11896/j.issn.1002-137X.2017.10.014

• 2016 National Annual Conference on High Performance Computing •

Design and Optimization of a Hybrid Storage System in the High Energy Physics Environment

XU Qi, CHENG Yao-dong, CHEN Gang

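  1. School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100049; Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049
  • Publication date: 2018-12-01 Release date: 2018-12-01
  • Supported by:
    This work was supported by the National Natural Science Foundation of China project "Research on Large-Scale Offline Data Storage Technology for High Energy Physics Experiments" (11575223), the National Key Research and Development Program of China project "Scientific Big Data Management System" (2016YFB1000605), and the National Natural Science Foundation of China project "Research and Application of Key Technologies for Elastic Networks in SDN-Based High Energy Physics Cloud Data Centers" (11605224).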

Design and Optimization of Hybrid Storage System in HEP Environment

XU Qi, CHENG Yao-dong and CHEN Gang   

  • Online:2018-12-01 Published:2018-12-01

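Abstract: High energy physics is a typical data-intensive computing environment in which data processing comprises simulation, reconstruction, and physics analysis. Computation on large files accounts for a large share of this workload, and file access in high energy physics is dominated by skip reads, so fast access to large files is an important factor in overall system performance. This paper first examines the typical architecture of a traditional high energy physics computing environment and the characteristics of its file access patterns, introduces the advantages of the hybrid storage model in this environment, summarizes the characteristics of its data access methods, and benchmarks its various read and write modes. It then proposes a deployment design and optimizations for a hybrid storage system targeted at this environment, which markedly improve data read and write performance. Cost is also taken into account in the system design, yielding a low-cost, high-performance storage system. Tests show that the hybrid storage system delivers efficient I/O performance in big-data storage systems such as those used in high energy physics. The paper comprehensively analyzes the factors that affect its performance, implements a low-cost, high-performance hybrid storage system with an optimized configuration, and analyzes and discusses the future development trends of such systems.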

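Keywords: Mass storage system, High energy physics, Hybrid storage system, Cache, Block device, High performance computing, Price-performance ratio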

Abstract: Computing in high energy physics (HEP) is a typical data-intensive application that includes simulation, reconstruction, and physics analysis. HEP experiment files are generally very large, and they are usually accessed by skipping through large data blocks. The performance of access to large files is therefore one of the decisive factors for an HEP computing system. This paper first analyzes the typical structure of the HEP computing environment and the characteristics of its file access, introduces the advantages of a hybrid storage system in HEP, summarizes the characteristics of its data access modes, and evaluates the performance of different read/write modes. It then proposes a new deployment model for a hybrid storage system in HEP, which is shown to deliver higher I/O performance; cost is also considered in order to implement a high-performance system at low cost. The test results show that the hybrid storage system performs well in fields such as HEP. The analysis can help obtain better I/O performance at lower cost in HEP. Finally, the future of the hybrid storage system is discussed.

Key words: Mass storage system, HEP, Hybrid storage system, Cache, Block device, HPC, Price-performance ratio

