Computer Science, 2017, Vol. 44, Issue (5): 178-183. doi: 10.11896/j.issn.1002-137X.2017.05.032

• Software and Database Technology •

Dynamic Load Balance Algorithm for Big-data Distributed Storage

ZHANG Li-zong, CUI Yuan, LUO Guang-chun, CHEN Ai-guo, LU Guo-ming and WANG Xiao-xue

  1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731 (all authors)
  • Online: 2018-11-13 Published: 2018-11-13
  • Supported by:
    This work was supported by the Applied Basic Research Program of the Sichuan Provincial Department of Science and Technology (2015JY0228), the Science and Technology Support Program (2015SZ0045, 2014GZ0174), the UESTC Fundamental Research Program (ZYGX2015J063), and the Scientific Research Start-up Fund for Returned Overseas Scholars.

Abstract: Distributed storage has emerged as the main approach to handling big data. The metadata storage architecture of the Hadoop Distributed File System (HDFS), the storage layer of the mainstream big-data platform Hadoop, has long suffered from poor scalability and high write latency. The official 2.0 release introduced HDFS Federation, which improves scalability by supporting multiple NameNodes and namespaces, but the solution is incomplete: when allocating metadata it ignores the performance heterogeneity of NameNodes, and it provides no dynamic load balancing across the NameNode cluster. To address this, a distributed NameNode algorithm with dynamic load balancing is proposed. It keeps multiple copies of the metadata and places them adaptively on heterogeneous nodes, so that metadata is distributed dynamically according to each node's performance and load, which preserves the performance of the metadata server cluster. Combined with a caching strategy and an automatic failover mechanism, the algorithm also improves metadata read/write performance and availability. Experimental validation shows that the algorithm achieves fairly satisfactory results.

Key words: Big data, Distributed storage, Metadata management, Hadoop distributed file system (HDFS)
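
The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea it describes: placing several copies of a metadata entry on the NameNodes that currently have the most spare capacity relative to their performance. Every name here (`NameNode`, `capacity`, `load`, `headroom`, `place_replicas`) and the scoring rule itself are illustrative assumptions, not the method proposed in the paper and not part of any Hadoop API.

```python
from dataclasses import dataclass


@dataclass
class NameNode:
    """Hypothetical model of a metadata server (not a Hadoop class)."""
    name: str
    capacity: float  # assumed relative performance weight; higher = stronger node
    load: float      # assumed current load, in the same units as capacity


def headroom(node: NameNode) -> float:
    """Fraction of the node's capacity that is still free."""
    return (node.capacity - node.load) / node.capacity


def place_replicas(nodes, replicas=2, cost=1.0):
    """Pick `replicas` distinct nodes with the most headroom.

    Each chosen node is charged `cost` units of load, so later
    placements see the updated state and drift toward other nodes.
    """
    if replicas > len(nodes):
        raise ValueError("not enough NameNodes for the requested replica count")
    chosen = sorted(nodes, key=headroom, reverse=True)[:replicas]
    for node in chosen:
        node.load += cost
    return [node.name for node in chosen]


# Heterogeneous cluster: nn3 is the strongest node, nn2 the weakest.
cluster = [
    NameNode("nn1", capacity=100.0, load=50.0),
    NameNode("nn2", capacity=50.0, load=10.0),
    NameNode("nn3", capacity=200.0, load=20.0),
]
first = place_replicas(cluster, replicas=2, cost=10.0)
print(first)
```

Scoring by relative headroom rather than raw load lets a strong but busy node and a weak but idle node be compared on the same scale, which is one simple way to account for heterogeneous performance. A real system would also need consistent lookup, caching, and failure handling, which this sketch omits.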

