Computer Science ›› 2013, Vol. 40 ›› Issue (1): 144-149.

• Software and Database Technology •

Node Failure Recovery Algorithm for Distributed File System Based on Measurement of Data Availability

LIAO Bin, YU Jiong, QIAN Yurong, YANG Xingyao

  1. (School of Software, Xinjiang University, Urumqi 830008); (School of Information Science and Engineering, Xinjiang University, Urumqi 830046)
  • Online: 2018-11-16 Published: 2018-11-16

Abstract: Existing distributed file systems handle node failure with recovery strategies that consume considerable bandwidth and disk space and that impair system stability. By studying the cluster structure of the distributed file system HDFS, its data block storage mechanism, and the relationship between node state and block state, we define the cluster node matrix, node state matrix, file block partition matrix, block storage matrix, and block state matrix, which together form the base data model for measuring data block availability. Building on this availability measurement, we design a node failure recovery algorithm and analyze its performance. Experimental results show that, while keeping every data block in the system available, the new algorithm requires less bandwidth and disk space for recovery than the original strategy, shortens node recovery time, and improves system stability.

Key words: Cloud computing, Distributed file system, Failure recovery, Measurement of data availability
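
The abstract names a block storage matrix and a node state matrix but does not give their definitions, so the following is only a minimal sketch of how two such structures could be combined to measure block availability. The 0/1 encoding, the function names, and the `min_live` threshold are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical illustration: the matrix encoding, the function names, and
# the min_live threshold are assumptions, not definitions from the paper.

def live_replica_counts(storage, node_up):
    """Count, for each block, the replicas held on nodes that are still up.

    storage[b][n] is 1 if block b has a replica on node n, else 0.
    node_up[n] is 1 if node n is alive, else 0.
    """
    return [sum(s * u for s, u in zip(row, node_up)) for row in storage]


def blocks_to_recover(storage, node_up, min_live=2):
    """Return indices of blocks whose live replica count is below min_live."""
    counts = live_replica_counts(storage, node_up)
    return [b for b, c in enumerate(counts) if c < min_live]


# Three blocks, four nodes, three replicas per block.
storage = [
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
]
node_up = [0, 1, 1, 1]  # node 0 has failed

print(live_replica_counts(storage, node_up))             # [2, 3, 2]
print(blocks_to_recover(storage, node_up))               # []
print(blocks_to_recover(storage, node_up, min_live=3))   # [0, 2]
```

In this toy cluster, restoring the full replica count (`min_live=3`) would copy two blocks after node 0 fails, whereas a looser availability threshold (`min_live=2`) would copy none. That difference is the kind of bandwidth and disk saving the abstract reports, although the paper's actual availability measure may be defined differently.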
