计算机科学 ›› 2021, Vol. 48 ›› Issue (5): 130-139.doi: 10.11896/jsjkx.200300124

• 数据库&大数据&数据科学 • 上一篇    下一篇

分布式存储系统中的预测式纠删码研究

张航, 唐聃, 蔡红亮   

  1. 成都信息工程大学软件工程学院 成都610225
  • 收稿日期:2020-03-23 修回日期:2020-07-20 出版日期:2021-05-15 发布日期:2021-05-09
  • 通讯作者: 唐聃(tangdan@foxmail.com)
  • 基金资助:
    四川省科技计划项目(20ZDYF1156);人工智能重大专项(2018GZDZX0030);四川省科技成果转移转化示范项目(2018CC0093)

Study on Predictive Erasure Codes in Distributed Storage System

ZHANG Hang, TANG Dan, CAI Hong-liang   

  1. School of Software Engineering,Chengdu University of Information Technology,Chengdu 610225,China
  • Received:2020-03-23 Revised:2020-07-20 Online:2021-05-15 Published:2021-05-09
  • About author:ZHANG Hang,born in 1995,postgra-duate.His main research interests include coding theory and distributed storage systems.(1521717495@qq.com)
    TANG Dan,born in 1982,Ph.D,professor,is a member of China Computer Federation.His research main research interests include coding theory and distributed storage systems.
  • Supported by:
    Science and Technology Program of Sichuan Province(20ZDYF1156),Major AI Projects(2018GZDZX0030) and Sichuan Science and Technology Achievements Transfer and Transformation Demonstration Project(2018CC0093).

摘要: 纠删码消耗的存储空间较少,获得的数据可靠性较高,因此被分布式存储系统广泛采用。但纠删码在修复数据时较高的修复成本限制了其应用。为了降低纠删码的修复成本,研究人员在分组码和再生码上进行了大量的研究。由于分组码和再生码属于被动容错方式,对于一些容易出现失效的节点,采用主动容错的方式能更好地降低修复成本,维护系统的可靠性,因此,提出了一种主动容错的预测式纠删(Proactive basic-Pyramid,PPyramid)码。PPyramid码利用硬盘故障预测方法来调整basic-Pyramid码中冗余块和数据块之间的关联,将预测出的即将出现故障的硬盘划分到同一小组,使得在修复数据时,所有的读取操作在小组内进行,从而减少读取数据块的个数,节省修复成本。在基于Ceph搭建的分布式存储系统中,在修复多个硬盘故障时,将PPyramid码与其他常用的纠删码进行对比。实验结果表明,相比basic-Pyramid码,PPyramid码能降低6.3%~34.9%的修复成本和减少7.6%~63.6%的修复时间,相比LRC码、pLRC码、SHEC码、DLRC码,能降低8.6%~52%的修复成本和减少10.8%~52.4%的修复时间。同时,PPyramid码构造灵活,具有很强的实际应用价值。

关键词: 分布式存储系统, 故障预测, 纠删码, 数据修复, 硬盘故障

Abstract: Erasure coding consumes less storage space and obtains a higher data reliability,thus being widely used by distributed storage systems.However,when erasure codes are used to repair data,their high repair costs limit their application.In order to reduce the repair cost of erasure codes,researchers have researched a lot on block codes and regenerative codes.But block codes and regeneration codes are passive fault tolerance.For some nodes that are prone to failure,using active fault tolerance can better reduce repair costs and maintain the system reliability.Therefore,this paper proposes a proactive basic-Pyramid(PPyramid) code.The PPyramid code uses the hard disk failure prediction method to adjust the association between redundant and data blocks in the Pyramid code,divides hard disks that are predicted to fail into the same group,thus making all read operations to be performed within the team when recovering data,thereby reducing the number of read data blocks and saving repair costs.In a distributed storage system based on Ceph,it is compared with other commonly used erasure codes,when repairing multiple hard drives.Experimental results show that,PPyramid codes can reduce repair costs by 6.3%~34.9% and decrease repair time by 7.6%~63.6% compared with basic-Pyramid.Compared with LRC code,pLRC code,SHEC code and DLRC code,it can reduce repair costs by 8.6%~52% and decrease repair time by 10.8%~52.4%.Meanwhile,PPyramid codes are flexible in construction and have strong practical application value.

Key words: Data repair, Distributed storage system, Erasure codes, Failure prediction, Hard disk failure

中图分类号: 

  • TP302.8
[1]Daily economic news,China's total data will account for 20% of global data in 2020.Information infrastructure protection is the key to big data security[EB/OL].https://baijiahao.baidu.com/s?id=1601722855211864246&wfr=spider&for=pc.
[2]WANG Y J,SUN W,ZHOU S,et al.Key technologies of distributed storage in cloud computing environment [J].Journal of Software,2012,(4):232-256.
[3]WANG Y J,LI S.Research and performance evaluation of data replication technology in distributed storage systems[J].Computers & Mathematics with Applications,2006,51(11):1625-1632.
[4]HUANG C,SIMITCI H,XU Y,et al.Erasure coding in win-dows azure storage[C]//Proceedings of the 2012 USENIX Conference on Annual Technical Conference.USA:USENIX Association,2012:2-2.
[5]HUANG C,CHEN M,LI J.Pyramid codes:Flexible schemes totrade space for access efficiency in reliable data storage systems[C]//Sixth IEEE International Symposium on Network Computing and Applications(NCA 2007).Piscataway:IEEE,2007:79-86.
[6]SATHIAMOORTHY M,ASTERIS M,PAPAILIOPOULOSD,et al.XORing Elephants:Novel Erasure Codes for Big Data[C]//VLDB2013:Proceedings of the 39th International Confe-rence on Very Large Data Bases.Trento:VLDB Endowment,2013:325-336.
[7]ZHOU S,WANG Y J.EXPyramid:An Array-Based FlexibleCoding Scheme with High Fault-Tolerance and Low Recovery-Overhead[J].Journal of Computer Research & Development,2011,48(s1):30-36.
[8]MENG Y,ZHANG L,XU D,et al.A Dynamic Erasure Code Based on Block Code[C]//Proceedings of the 2019 International Conference on Embedded Wireless Systems and Networks.USA:Junction Publishing,2019:379-383.
[9]MIYAMAE T,NAKAO T,SHIOZAWA K.Erasure code with shingled local parity groups for efficient recovery from multiple disk failures[C]//HotDep 2014:Proceedings of the 10th Workshop on Hot Topics in System Dependability.USA:USENIX Association,2014:5-5.
[10]HAFNER,JAMES L.WEAVER Codes:Highly Fault Tolerant Erasure Codes for Storage Systems[C]//Proceedings of the FAST '05 Conference on File and Storage Technologies.USA:USENIX Association,2005:16-16.
[11]RASHMI K V,SHAH N B,KUMAR P V,et al.Explicit construction of optimal exact regenerating codes for distributed storage[C]//2009 47th Annual Allerton Conference on Communication,Control,and Computing(Allerton).Piscataway:IEEE,2009:1243-1249.
[12]WU Y,DIMAKIS A G.Reducing repair traffic for erasure co-ding-based storage via interference alignment[C]//ISIL2009:Proceedings of the 2009 IEEE international conference on Symposium on Information Theory.Piscataway:IEEE,2009:2276-2280.
[13]SCHROEDER B,GIBEON G A.Disk failures in the real world:What does an MTTF of 1 000 000 hours mean to you?[J].ACM Transactions on Storag,2007,7(1):1-16.
[14]HUGHES G F,MURRAY J F,KREUTZDELGADO K,et al.Improved disk-drive failure warnings[J].IEEE Transactions on Reliability,2002,51(3):350-357.
[15]LI P,LI J,STONES R J,et al.ProCode:A Proactive Erasure Coding Scheme for Cloud Storage Systems[C]//Proceedings of the 2016 IEEE 35th Symposium on Reliable Distributed Systems.Piscataway:IEEE,2016:219-228.
[16]HU Y,LIU Y,LI W,et al.Unequal Failure Protection Coding Technique for Distributed Cloud Storage Systems[J].IEEE Transactions on Cloud Computing,2017(99):1-1.
[17]ZHANG X Y,XU J,HU Y.Predictive Local Repair Codes in Cloud Storage Systems [J].Journal of Computer Research and Development,2019,56(9):1988-2000.
[18]ZHU B,WANG G,LIU X,et al.Proactive drive failure prediction for large scale storage systems[C]//Proceedings of the 2013 IEEE 29th Symposium on Mass Storage Systems and Technologies.Piscataway:IEEE,2013:1-5.
[19]HAMERLY G,ELKAN C.Bayesian approaches to failure prediction for disk drives[C]//Proceedings of the Eighteenth International Conference on Machine Learning.USA:Morgan Kaufmann Publishers Inc,2001,1:202-209.
[20]HUGHES G F,MURRAY J F,KREUTZ-DELGADO K,et al.Improved disk-drive failure warnings[J].IEEE transactions on reliability,2002,51(3):350-357.
[21]MURRAY J F,HUGHES G F,KREUTZ-DELGADO K.Machine Learning Methods for Predicting Failures in Hard Drives:A Multiple-Instance Application[J].Journal of Machine Lear-ning Research,2005,6(1):783-816.
[22]LI J,JI X,JIA Y,et al.Hard Drive Failure Prediction Using Classification and Regression Trees[C]//Proceedings of the 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.Piscataway:IEEE,2014:383-394.
[23]CIDON A,ESCRIVA R,KATTI S,et al.Tiered replication:acost-effective alternative to full cluster geo-replication[C]//Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference.USA:USENIX Association,2015:31-43.
[24]FORD D,LABELLE F,POPOVICI F I,et al.Availability inGlobally Distributed Storage Systems.[C]//Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation.USA:USENIX Association,2010:61-74.
[25]SILBERSTEIN M,GANESH L,WANG Y,et al.Lazy means smart:Reducing repair bandwidth costs in erasure-coded distri-buted storage[C]//Proceedings of International Conference on Systems and Storage.New York:Association for Computing Machinery,2014:1-7.
[1] 曾友渝, 谢强.
基于改进RNN和VAR的船舶设备故障预测方法
Fault Prediction Method Based on Improved RNN and VAR for Ship Equipment
计算机科学, 2021, 48(6): 184-189. https://doi.org/10.11896/jsjkx.200700117
[2] 张晓, 张思蒙, 石佳, 董聪, 李战怀.
Ceph分布式存储系统性能优化技术研究综述
Review on Performance Optimization of Ceph Distributed Storage System
计算机科学, 2021, 48(2): 1-12. https://doi.org/10.11896/jsjkx.201000149
[3] 钟凤艳, 王艳, 李念爽.
异构分布式存储系统再生码数据修复的节点选择方案
Node Selection Scheme for Data Repair in Heterogeneous Distributed Storage Systems
计算机科学, 2019, 46(8): 35-41. https://doi.org/10.11896/j.issn.1002-137X.2019.08.006
[4] 王楠, 孙善武.
基于半监督聚类分析的无人机故障识别
UAV Fault Recognition Based on Semi-supervised Clustering
计算机科学, 2019, 46(6A): 192-195.
[5] 刘琴.
计算机取证过程中基于约束的数据质量问题研究
Study on Data Quality Based on Constraint in Computer Forensics
计算机科学, 2018, 45(4): 169-172. https://doi.org/10.11896/j.issn.1002-137X.2018.04.028
[6] 吴炀,付印金,陈卫卫,倪桂强.
一种高效的混合内存布局机制与编码技术
Efficient Mechanism of Hybrid Memory Placement and Erasure Code
计算机科学, 2017, 44(6): 57-62. https://doi.org/10.11896/j.issn.1002-137X.2017.06.009
[7] 金星彤,李鹏,王刚,刘晓光,李忠伟.
基于异或的隐私保护码优化研究
Optimizing Small XOR-based Non-systematic Erasure Codes
计算机科学, 2017, 44(6): 36-42. https://doi.org/10.11896/j.issn.1002-137X.2017.06.006
[8] 刘波,蔡美,周绪川.
数据修复与一致性查询处理研究
Study on Data Repair and Consistency Query Processing
计算机科学, 2016, 43(1): 232-236. https://doi.org/10.11896/j.issn.1002-137X.2016.01.050
[9] 谢晨,王睿智,李 飏,苗夺谦,焦 娜.
数据驱动的燃气涡轮机跳闸预警方法的研究
Research of Data Driven Method for Gas Turbine Trip Prediction
计算机科学, 2015, 42(6): 23-27. https://doi.org/10.11896/j.issn.1002-137X.2015.06.005
[10] 王禹,赵跃龙,侯昉.
基于矩阵运算的最小冗余存储再生码MSRRC研究
Minimum Redundancy Storage Regeneration Code Research MSRRC Based on Matrix Operation
计算机科学, 2014, 41(Z11): 191-194.
[11] 陈达智,赵荣彩,姚远,韩林.
MPI自动并行化编译系统中消息传递代码生成算法
Message-passing Code Generation Algorithm in the MPI Automatic Parallelizing Compilation System
计算机科学, 2012, 39(6): 301-304.
[12] 杜军朝,刘惠,李晓军,张荧俊,张云扬.
基于RS纠删码的无线传感器网络信息分发协议性能评价
Performance Evaluation of Information Dissemination Protocol in WSNs Based on RS Erasure Codes
计算机科学, 2011, 38(Z10): 315-318.
[13] 姜国松,邹辰,谢长生.
RAID控制器中矩阵重构方法研究
Research on Matrix Reconstruction in RAID Controller
计算机科学, 2009, 36(7): 262-266. https://doi.org/10.11896/j.issn.1002-137X.2009.07.064
[14] .
P2P分布式存储系统

计算机科学, 2007, 34(6): 47-48.
[15] .
基于专家知识库属性重要度的故障诊断方法研究

计算机科学, 2007, 34(4): 231-233.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!