计算机科学 ›› 2017, Vol. 44 ›› Issue (Z6): 463-469.doi: 10.11896/j.issn.1002-137X.2017.6A.104

• 大数据与数据挖掘 • 上一篇    下一篇

减少重建数据量的冗余编码技术研究

马良荔,柳青   

  1. 海军工程大学电子工程学院 武汉430000,海军工程大学电子工程学院 武汉430000;华中科技大学计算机科学与技术学院 武汉430000
  • 出版日期:2017-12-01 发布日期:2018-12-01

Researches of Redundancy Coding Technologies on Reducing Reconstruction Data Amount

MA Liang-li and LIU Qing   

  • Online:2017-12-01 Published:2018-12-01

摘要: 为防止硬件故障或机器宕机导致的数据丢失,冗余编码技术被广泛应用于分布式存储系统中来保证数据的可靠性。然而,传统的冗余编码技术,如里德-所罗门码,存在着重建数据量大的问题。副本技术在重建丢失数据时只需要读取和传输丢失的数据,而冗余编码需要读取和传输更大的数据量,从而消耗更多的磁盘I/O带宽和网络带宽。因此,基于冗余编码的分布式存储系统在重建数据时将消耗更长的时间,从而将整个系统长时间暴露在一种降级的模式下,进而增加了发生永久性数据丢失的风险。为解决这个问题,减少重建数据量的冗余编码技术不断被提出,然而只有这些冗余编码与传统的里德-所罗门码的比较,缺少它们在存储系统的综合比较。系统地从减少重建数据量等几个重要方面研究了这些减少重建数据量的冗余编码技术,从而为实际系统中采用合适的编码提供重要参考和依据。

关键词: 冗余编码,数据重建,存储系统,分布式系统

Abstract: In order to avoid data loss due to hardware failure or server breakdown,redundancy coding technology is widely employed in distributed storage systems for data reliability.However,traditional erasure codes,such as Reed-Solomon codes,bear the burden of huge rebuilding data amount.Compared with the replication technique,which only needs to read and transfer the lost data,the erasure coding requires to read and transfer a much large amount of data,thereby consuming much more disk I/Os and network bandwidth.Thus,a erasure code based distributed storage system would cost longer time for data reconstruction than a replication based system,and exposes the whole system in a long-term degraded stage,increasing the risk of the permanent data loss.To solve this problem,many repair-bandwidth-efficient codes were constantly proposed,but these codes are only compared with the traditional Reed-Solomon codes and lack the comprehensive comparisons on practical storage systems.We systematically analyzed these repair-bandwidth-efficient codes from the some significant aspects,such as amount reduction on reconstruction data and so on,thus providing valuable basis and references for choosing suitable erasure codes for practical systems.

Key words: Erasure codes,Data construction,Storage system,Distributed system

[1] BORTHAKUR D,SCHMIDT R,VADALI R,et al.Hdfs raid[C]∥Hadoop User Group Meeting.2010.
[2] WEIL S A,BRANDT S A,MILLER E L,et al.Ceph:A scalable,high-performance distributed file system[C]∥Proceedings of the 7th Symposium on Operating Systems Design and Implementation.USENIX Association,2006:307-320.
[3] PLANK J S,DING Y.Note:Correction to the 1997 tutorial on Reed-Solomon coding[J].Software:Practice and Experience,2005,35(2):189-194.
[4] KHAN O,BURNS R C,PLANK J S,et al.Rethinking erasure codes for cloud file systems:minimizing I/O for recovery and degraded reads[C]∥FAST.2012:20.
[5] DIMAKIS A G,GODFREY P,WU Y,et al.Network coding for distributed storage systems[J].IEEE Transactions on Information Theory,2010,56(9):4539-4551.
[6] SATHIAMOORTHY M,ASTERIS M,P APAILOPOULOS D,et al.Xoring elephants:Novel erasure codes for big data[J].Proceedings of the VLDB Endowment,VLDB Endowment,2013,6(5):325-336.
[7] RASHMI K V,SHAH N B,GU D,et al.A solution to the network challenges of data recovery in erasure-coded distributed storage systems:A study on the Facebook warehouse cluster[C]∥Presented as part of the 5th USENIX Workshop on Hot Topics in Storage and File Systems.2013.
[8] SHAH N B,RASHMI K V,KUMAR P V,et al.Interference alignment in regenerating codes for distributed storage:Necessity and code constructions[J].IEEE Transactions on Information Theory,2012,58(4):2134-2158.
[9] RASHMI K V,NAKKIRAN P,WANG J,et al.Having your cake and eating it too:jointly optimal erasure codes for I/O,storage,and network-bandwidth[C]∥13th USENIX Conference on File and Storage Technologies (FAST 15).2015:81-94.
[10] HU Y,CHEN H C H,LEE P P C,et al.NCCloud:applying network coding for the storage repair in a cloud-of-clouds[C]∥FAST.2012:21.
[11] HUANG C,SIMITCI H,XU Y,et al.Erasure coding in windows azure storage[C]∥Presented as part of the 2012 USENIX Annual Technical Conference (USENIX ATC 12).2012:15-26.
[12] PAPAILIOPOULOS D S,LUO J,DIMAKIS A G,et al.Simple regenerating codes:Network coding for cloud storage[C]∥2012 Proceedings IEEE INFOCOM.IEEE,2012:2801-2805.
[13] LIU Q,FENG D,JIANGY H,et al.Z Codes:General Systematic Erasure Codes with Optimal Repair Bandwidth and Storage for Distributed Storage Systems[C]∥2015 IEEE 34th Symposium on Reliable Distributed Systems (SRDS).IEEE,2015:212-217.
[14] PLANK J S,XU L.Optimizing Cauchy Reed-Solomon codes for fault-tolerant network storage applications[C]∥Fifth IEEE International Symposium on Network Computing and Applications,2006(NCA 2006).IEEE,2006:173-180.
[15] PLANK J S,GREENAN K M,MILLER E L.Screaming fastGalois field arithmetic using intel SIMD instructions[C]∥FAST.2013:299-306.
[16] HUANG C,XU L.STAR:An efficient coding scheme for correcting triple storage node failures[J].IEEE Transactions on Computers,2008,57(7):889-901.

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!