计算机科学 ›› 2023, Vol. 50 ›› Issue (12): 89-96.doi: 10.11896/jsjkx.221100257

• 数据库&大数据&数据科学 • 上一篇    下一篇

基于条带配对合并算法的局部可修复码冗余度转换机制

杜清鹏1, 许胤龙2, 吴思2   

  1. 中国科学技术大学计算机科学与技术学院 合肥 230027
  • 收稿日期:2022-11-29 修回日期:2023-04-07 出版日期:2023-12-15 发布日期:2023-12-07
  • 通讯作者: 许胤龙(ylxu@ustc.edu.cn)
  • 作者简介:(duqingpeng@mail.ustc.edu.cn)
  • 基金资助:
    国家自然科学基金(62172382)

Stripe Matching and Merging Algorithm-based Redundancy Transition for Locally Repairable Codes

DU Qingpeng1, XU Yinlong2, WU Si2   

  1. School of Computer Science and Technology,University of Science and Technology of China,Hefei 230027,China
  • Received:2022-11-29 Revised:2023-04-07 Online:2023-12-15 Published:2023-12-07
  • About author:DU Qingpeng,born in 1998,postgra-duate.His main research interests include erasure coding and distributed storage system.
    XU Yinlong,born in 1963,Ph.D, professor,Ph.D supervisor,is a member of China Computer Federation.His main research interests include storage system,graph computing and high performance computing.
  • Supported by:
    National Natural Science Foundation of China(62172382).

摘要: 相比传统的多副本技术,纠删码是一种以高修复代价换取低存储开销的数据冗余机制。局部可修复码是一类具有低修复代价的纠删码,被广泛应用在大数据存储系统中。为了应对动态变化的工作负载和存储介质动态改变的故障率,现代存储系统需要对纠删码数据进行冗余度转换,以调节数据访问性能和可靠性。设计了一种基于条带配对合并的局部可修复码冗余度转换方法,通过选择特定位置的条带进行配对合并,实现了冗余度转换与数据布局的解耦合;进一步通过设计代价量化方法与最优化模型,降低了冗余度转换的网络通信开销。相比设计数据布局的算法,所提算法有与其近似的性能,但对数据布局无限制,可级联迭代地多次运行。实验结果表明,在两种冗余度转换设置下,所提算法均近似于理论最优值,相比随机布局的朴素算法,网络流量分别降低了27.74%和27.47%,耗时分别缩短了39.10%和22.32%。

关键词: 局部可修复码, 冗余度转换, 容错存储技术, 网络流量优化, 分布式存储系统

Abstract: Compared with traditional replication technique,erasure coding is another data redundancy mechanism with lower space overhead at the cost of high repair cost.Locally repairable code is a special kind of erasure code with low repair cost,which is widely deployed in big data storage systems.In order to accommodate itself to dynamic workload and varying failure rate of sto-rage media,modern storage systems make better trade off between access performance and reliability for erasure coded data by means of redundancy transitions.A redundancy transition method,which selectively matches and merges stripes of specific layout,decouples data placement and redundancy transition,is proposed.Furthermore,it reduces the cross-rack network traffic by designing cost quantification and optimization model.In contrast to those algorithms that designing data placement,the proposed algorithm has almost the same performance but eliminates the constraints on data layout and can be run iteratively.Experiment results show that under the two common transition setups,compared with naive approach of random layout,the proposed algorithm approximates the theoretical optimum,mitigates network traffic by 27.74% and 27.47%,and shortens transition time by 39.10% and 22.32% respectively.

Key words: Locally repairable codes, Redundancy transition, Fault-tolerant storage technique, Network traffic optimization, Distributed storage system

中图分类号: 

  • TP302.8
[1]WEATHERSPOON H,KUBIATOWICZ J D. Erasure coding vs.replication:A quantitative comparison[C]//International Workshop on Peer-to-Peer Systems.2002:328-337.
[2]PLANK J S.A tutorial on Reed-Solomon coding for fault-tole-rance in RAID-like systems[J].Software:Practice and Expe-rience,1997,27(9):995-1012.
[3]PAPAILIOPOULOS D S,DIMAKIS A G.Locally repairablecodes[J].IEEE Transactions on Information Theory,2014,60(10):5843-5855.
[4]HU Y,CHENG L,YAO Q,et al.Exploiting Combined Locality for Wide-Stripe Erasure Coding in Distributed Storage[C]//19th USENIX Conference on File and Storage Technologies(FAST 21).2021:233-248.
[5]WANG T,SU Z,XIA Y,et al.Rethinking the data center networking:Architecture,network protocols,and resource sharing[J].IEEE Access,2014,2:1481-1496.
[6]ZHOU P,HUANG J,QIN X,et al.PaRS:A popularity-aware redundancy scheme for in-memory stores[J].IEEE Transactions on Computers,2018,68(4):556-569.
[7]BUI D M,HUSSAIN S,HUH E N,et al.Adaptive replication management in HDFS based on supervised learning[J].IEEE Transactions on Knowledge and Data Engineering,2016,28(6):1369-1382.
[8]XIA M,SAXENA M,BLAUM M,et al.A tale of two erasure codes in HDFS[C]//13th USENIX Conference on File and Storage Technologies(FAST 15).2015:213-226.
[9]WILKES J,GOLDING R,STAELIN C,et al.The HP Auto-RAID hierarchical storage system[J].ACM Transactions on Computer Systems(TOCS),1996,14(1):108-136.
[10]FAN B,TANTISIRIROJ W,XIAO L,et al.DiskReduce:RAID for data-intensive scalable computing[C]//Proceedings of the 4th annual workshop on petascale data storage.2009:6-10.
[11]KADEKODI S,RASHMI K V,GANGER G R.Cluster storage systems gotta have HeART:improving storage efficiency by exploiting disk-reliability heterogeneity[C]//17th USENIX Conference on File and Storage Technologies(FAST 19).2019:345-358.
[12]ZHENG W,ZHANG G.FastScale:Accelerate RAID Scaling by Minimizing Data Migration[C]//9th USENIX Conference on File and Storage Technologies(FAST 11).2011.
[13]LI R,HU Y,LEE P P C.Enabling efficient and reliable transition from replication to erasure coding for clustered file systems[J].IEEE Transactions on Parallel and Distributed Systems,2017,28(9):2500-2513.
[14]XU B,HUANG J,CAO Q,et al.TEA:A traffic-efficient era-sure-coded archival scheme for in-memory stores[C]//Procee-dings of the 48th International Conference on Parallel Proces-sing.2019:1-10.
[15]WU S,SHEN Z,LEE P P C.Enabling I/O-efficient redundancy transitioning in erasure-coded KV stores via elastic Reed-Solomon codes[C]//2020 International Symposium on Reliable Distributed Systems(SRDS).IEEE,2020:246-255.
[16]WU S,DU Q,LEE P P C,et al.Optimal Data Placement for Stripe Merging in Locally Repairable Codes[C]//IEEE INFOCOM 2022-IEEE Conference on Computer Communications.IEEE,2022:1669-1678.
[17]YAO Q,HU Y,CHENG L,et al.Stripemerge:Efficient wide-stripe generation for large-scale erasure-coded storage[C]//2021 IEEE 41st International Conference on Distributed Computing Systems(ICDCS).IEEE,2021:483-493.
[18]KOLMOGOROV V.Blossom V:a new implementation of a mini-mum cost perfect matching algorithm[J].Mathematical Programming Computation,2009,1(1):43-67.
[19]WEIL S A,BRANDT S A,MILLER E L,et al.Ceph:A scalable,high-performance distributed file system[C]//Proceedings of the 7th Symposium on Operating Systems Design and Implementation.2006:307-320.
[20]HUANG C,SIMITCI H,XU Y,et al.Erasure coding in windows azure storage[C]//2012 USENIX Annual Technical Conference(USENIX ATC 12).2012:15-26.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!