Computer Science ›› 2023, Vol. 50 ›› Issue (12): 89-96.doi: 10.11896/jsjkx.221100257

• Database & Big Data & Data Science •

Stripe Matching and Merging Algorithm-based Redundancy Transition for Locally Repairable Codes

DU Qingpeng1, XU Yinlong2, WU Si2   

  1. School of Computer Science and Technology,University of Science and Technology of China,Hefei 230027,China
  • Received: 2022-11-29  Revised: 2023-04-07  Online: 2023-12-15  Published: 2023-12-07
  • About author: DU Qingpeng, born in 1998, postgraduate. His main research interests include erasure coding and distributed storage systems.
    XU Yinlong, born in 1963, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include storage systems, graph computing and high performance computing.
  • Supported by:
    National Natural Science Foundation of China(62172382).

Abstract: Compared with the traditional replication technique, erasure coding is an alternative data redundancy mechanism that achieves lower space overhead at the cost of higher repair cost. Locally repairable codes are a special class of erasure codes with low repair cost, and are widely deployed in big data storage systems. To adapt to dynamic workloads and the varying failure rates of storage media, modern storage systems use redundancy transitions to strike a better trade-off between access performance and reliability for erasure-coded data. This paper proposes a redundancy transition method that selectively matches and merges stripes with specific layouts, thereby decoupling data placement from redundancy transition. Furthermore, it reduces cross-rack network traffic by means of a cost quantification and optimization model. In contrast to algorithms that design data placement in advance, the proposed algorithm achieves almost the same performance while eliminating constraints on data layout, and it can be run iteratively. Experimental results show that under two common transition setups, compared with a naive approach over random layouts, the proposed algorithm approximates the theoretical optimum, reduces network traffic by 27.74% and 27.47%, and shortens transition time by 39.10% and 22.32%, respectively.
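To illustrate the matching-and-merging idea described above, the following sketch pairs stripes so that merged stripes incur little cross-rack data movement. It is a minimal illustration only: the cost model (counting rack-slot collisions between two stripes as blocks that must be relocated), the greedy pairing (a stand-in for an exact minimum-cost matching), and all names and layouts are assumptions for this example, not the paper's actual algorithm or data.

```python
from itertools import combinations

def merge_cost(layout_a, layout_b):
    # Assumed cost model: blocks of the two stripes that land on the same
    # rack collide after merging, so one of each colliding pair must be
    # relocated across racks. Cross-rack traffic = number of collisions.
    return len(set(layout_a) & set(layout_b))

def greedy_stripe_matching(layouts):
    """Greedily pair stripes to keep total cross-rack relocation cost low.

    layouts: dict mapping stripe id -> tuple of rack ids holding its blocks.
    Returns a list of (stripe_a, stripe_b, cost) merge pairs.
    """
    # Consider all candidate pairs, cheapest merges first.
    pairs = sorted(
        combinations(layouts, 2),
        key=lambda p: merge_cost(layouts[p[0]], layouts[p[1]]),
    )
    matched, result = set(), []
    for a, b in pairs:
        if a not in matched and b not in matched:
            matched.update((a, b))
            result.append((a, b, merge_cost(layouts[a], layouts[b])))
    return result

# Hypothetical 4-block stripes placed across 8 racks.
layouts = {
    "s1": (0, 1, 2, 3),
    "s2": (4, 5, 6, 7),   # rack-disjoint from s1: a zero-cost merge partner
    "s3": (0, 1, 4, 5),
    "s4": (2, 3, 6, 7),   # rack-disjoint from s3
}
print(greedy_stripe_matching(layouts))
# → [('s1', 's2', 0), ('s3', 's4', 0)]
```

A greedy pairing is not optimal in general; an exact solution would use a minimum-cost perfect matching (e.g., a blossom algorithm) over the same pairwise cost graph, which is the kind of optimization the cost model above feeds into.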

Key words: Locally repairable codes, Redundancy transition, Fault-tolerant storage technique, Network traffic optimization, Distributed storage system

CLC Number: TP302.8