Computer Science (计算机科学), 2025, Vol. 52, Issue 2: 42-47. doi: 10.11896/jsjkx.231200021
付雄, 宋朝阳, 王俊昌, 邓松
FU Xiong, SONG Zhaoyang, WANG Junchang, DENG Song
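Abstract: With the rapid development of big data, cloud computing, and computer and network technologies, Internet data is growing explosively, and storing massive data efficiently has become a pressing problem. Traditional multi-replica redundancy incurs a high storage cost, which has drawn researchers' attention to new storage solutions. Against this background, this paper proposes a distributed hybrid storage strategy that combines erasure coding with replication. The strategy selects the redundancy scheme according to data characteristics: hot data is replicated to ensure high reliability and performance, while cold data is erasure-coded to improve storage utilization. Data files are classified as hot or cold based on Newton's law of cooling, and an adaptive algorithm for data-temperature identification and dynamic hot/cold allocation is introduced, so that the system automatically adjusts the proportion of hot and cold data at run time and then switches each file's storage strategy according to its current temperature. This strengthens the system's adaptability to dynamic workloads and offers a new approach to improving the efficiency and flexibility of distributed storage systems in practice, a contribution relevant to both research and engineering. Simulation experiments verify the effectiveness and usability of the strategy, which provides a new direction for optimizing distributed storage systems.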
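The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas or parameters, so everything below is an assumption made for illustration: the cooling coefficient `alpha`, the heat added per access, the percentile-based `adaptive_threshold` (a simple stand-in for the paper's adaptive allocation algorithm), the 3-way replication factor, and the RS(6, 3) erasure code. The only part taken from the text is the overall scheme: a file's temperature decays following Newton's law of cooling, hot files are replicated, and cold files are erasure-coded.

```python
import math
from dataclasses import dataclass


@dataclass
class FileStats:
    name: str
    temperature: float  # current heat score of the file
    last_update: float  # time of the last temperature update (seconds)


def cooled_temperature(t_prev, elapsed, alpha, ambient=0.0):
    """Newton's law of cooling: T(t) = T_env + (T0 - T_env) * exp(-alpha * t)."""
    return ambient + (t_prev - ambient) * math.exp(-alpha * elapsed)


def record_accesses(stats, now, accesses, alpha=0.1, heat_per_access=1.0):
    """Cool the file since its last update, then add heat for new accesses.

    alpha and heat_per_access are hypothetical tuning parameters.
    """
    cooled = cooled_temperature(stats.temperature, now - stats.last_update, alpha)
    stats.temperature = cooled + heat_per_access * accesses
    stats.last_update = now
    return stats.temperature


def adaptive_threshold(temperatures, hot_fraction=0.2):
    """Assumed stand-in for the adaptive step: pick the threshold so that
    roughly the top hot_fraction of files count as hot."""
    ranked = sorted(temperatures, reverse=True)
    n_hot = max(1, round(len(ranked) * hot_fraction))
    return ranked[n_hot - 1]


def choose_policy(temperature, threshold):
    """Hot files are replicated; cold files are erasure-coded."""
    return "replication" if temperature >= threshold else "erasure_coding"


def storage_overhead(policy, replicas=3, k=6, m=3):
    """Raw bytes stored per logical byte: r for replication, (k + m) / k for RS(k, m)."""
    return float(replicas) if policy == "replication" else (k + m) / k


if __name__ == "__main__":
    files = [FileStats("a.log", 8.0, 0.0), FileStats("b.dat", 1.0, 0.0),
             FileStats("c.bin", 3.0, 0.0), FileStats("d.csv", 0.5, 0.0),
             FileStats("e.txt", 2.0, 0.0)]
    for f in files:
        record_accesses(f, now=10.0, accesses=0)  # no new accesses: pure cooling
    threshold = adaptive_threshold([f.temperature for f in files])
    for f in files:
        policy = choose_policy(f.temperature, threshold)
        print(f.name, round(f.temperature, 3), policy, storage_overhead(policy))
```

Under these assumed parameters, a replicated file costs 3 bytes of raw storage per logical byte, while an RS(6, 3)-coded file costs 1.5. That gap is the storage saving the hybrid strategy aims to capture for cold data.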