Adaptive Replacement Cache Mechanism for Fault Tolerance in Cloud Storage

Abstract

Abstract: Hadoop uses replication as its default data fault tolerance mechanism, but replication occupies a large amount of storage space and has low storage efficiency. To solve this problem, this paper proposes ARCMFF, an adaptive replacement cache mechanism for fault tolerance in cloud storage, based on an analysis of the ARC cache replacement algorithm. When users access the file system, ARCMFF tracks file access frequency by maintaining an LRU queue and an LFU queue, and adds frequently accessed files to the cache to improve system performance. In ARCMFF, most files are stored with erasure coding, and only the few files in the cache are stored as replicas. Because erasure coding has higher storage efficiency than replication, the distributed file system saves a large amount of storage space. A series of experiments shows that a distributed file system with ARCMFF greatly reduces file storage space, improves storage efficiency, and achieves higher read and write performance.

Key words: Cloud storage, Data fault tolerance, Hadoop, Replica, Erasure code, ARC

WU Qiu-ping, LIU Bo and LIN Wei-wei. Adaptive Replacement Cache Mechanism for Fault Tolerance in Cloud Storage[J]. Computer Science, 2015, 42(Z6): 332-336.
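
The idea summarized in the abstract (a recency queue and a frequency queue decide which files are "hot", hot files are kept as replicas, and all other files are erasure-coded) can be illustrated with a minimal sketch. This is not the paper's ARCMFF implementation and it is not full ARC: it has no ghost lists and no adaptive target size. The class name `HotFileTracker`, the eviction order, the promotion rule, and the parameters `replicas=3`, `k=6`, `m=3` are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class HotFileTracker:
    """Simplified two-queue cache (hypothetical; not the paper's ARCMFF).

    Files currently in the cache would be stored as replicas;
    every other file would be stored with erasure coding.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.recent = OrderedDict()    # files accessed once, oldest first (LRU)
        self.frequent = OrderedDict()  # files accessed 2+ times -> access count

    def access(self, name):
        if name in self.frequent:
            # Already hot: bump its count and mark it most recently used.
            self.frequent[name] += 1
            self.frequent.move_to_end(name)
        elif name in self.recent:
            # Second access: promote from the recency queue to the frequency queue.
            del self.recent[name]
            self.frequent[name] = 2
        else:
            # First access: enter the recency queue.
            self.recent[name] = 1
        self._evict()

    def _evict(self):
        # Assumed policy: drop the least recently used one-hit file first,
        # then the least frequently used file from the frequency queue.
        while len(self.recent) + len(self.frequent) > self.capacity:
            if self.recent:
                self.recent.popitem(last=False)
            else:
                victim = min(self.frequent, key=self.frequent.get)
                del self.frequent[victim]

    def scheme(self, name):
        in_cache = name in self.recent or name in self.frequent
        return "replica" if in_cache else "erasure"


def storage_overhead(scheme, replicas=3, k=6, m=3):
    """Raw bytes stored per byte of user data (illustrative parameters)."""
    if scheme == "replica":
        return float(replicas)
    # An erasure code with k data blocks and m parity blocks stores (k + m) / k.
    return (k + m) / k


tracker = HotFileTracker(capacity=2)
for name in ["a", "a", "b", "c"]:
    tracker.access(name)
for name in ["a", "b", "c"]:
    s = tracker.scheme(name)
    print(name, s, storage_overhead(s))
```

With these assumed parameters, a replicated file costs 3 bytes of raw storage per byte of data while an erasure-coded file costs 1.5, which is the source of the space saving when only the small cached set is replicated.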

