计算机科学 ›› 2025, Vol. 52 ›› Issue (2): 33-41.doi: 10.11896/jsjkx.240200001

• 数据库&大数据&数据科学 • 上一篇    下一篇

基于细粒度缓存与学习型索引的LSM树键值存储系统性能优化

许睿达1, 李永坤1,2, 许胤龙1,2   

  1. 1 中国科学技术大学计算机科学与技术学院 合肥 230026
    2 安徽省高性能计算重点实验室 合肥 230026
  • 收稿日期:2024-02-01 修回日期:2024-06-24 出版日期:2025-02-15 发布日期:2025-02-17
  • 通讯作者: 李永坤(ykli@ustc.edu.cn)
  • 作者简介:(xuruida@mail.ustc.edu.cn)
  • 基金资助:
    国家自然科学基金面上项目(62172382)

Performance Optimization of LSM-tree Based Key-Value Storage System Based on Fine-grained Cache andLearned Index

XU Ruida1, LI Yongkun1,2, XU Yinlong1,2   

  1. 1 School of Computer Science and Technology,University of Science and Technology of China,Hefei 230026,China
    2 Anhui Province Key Laboratory of High Performance Computing,Hefei 230026,China
  • Received:2024-02-01 Revised:2024-06-24 Online:2025-02-15 Published:2025-02-17
  • About author:XU Ruida,born in 2000,postgraduate.His main research interests include key-value stores and file systems.
    LI Yongkun,born in 1986,professor,is a member of CCF(No.33184M).His main research interests include parallel and distributed systems,key-value systems,memory systems,and cloud computing,etc.
  • Supported by:
    National Natural Science Foundation of China(62172382).

摘要: 在数据量飞速增长的大数据时代背景下,基于日志结构合并树的(Log-Structured Merge-Tree-based,LSM-Tree-based)键值存储系统因其优秀的灵活性与扩展性被广泛应用于NoSQL系统。但是,传统的LSM-Tree结构键值存储系统在查询数据时,因搜索多个SSTable引起的读放大问题会产生额外的I/O开销,影响系统性能。针对这一问题,提出了一种新型键值存储系统优化设计方案——FCLI-LSM。FCLI-LSM结合了细粒度键值对缓存和学习型索引的优化方法,旨在提升基于LSM-Tree结构的键值存储系统的查询性能。通过对数据访问热点的分析,FCLI-LSM对数据进行热、温、冷数据的三级分级。FCLI-LSM为热数据设计了基于键值分离的细粒度缓存机制,有效减少了读放大问题带来的额外I/O开销;此外,还设计了一种针对学习型索引的缓存亲和优化,进一步提高了存储系统对温数据的查询效率。实验结果表明,与现有的查询优化方案相比,FCLI-LSM能带来超过40%的平均查询时延下降以及超过1.7倍的系统吞吐率提升。

关键词: 大数据, 键值存储系统, 日志结构合并树, 学习型索引, 缓存

Abstract: In the context of the big data era where the amount of data is growing rapidly,log-structured merge-tree-based(LSM-Tree-based) key-value storage systems are widely deployed in many NoSQL systems because of their excellent flexibility and scalability.However,when querying data in the traditional LSM-Tree based key-value storage system,the read amplification problem,which is caused by searching through multiple SSTables,generates additional I/O overhead.In this paper,we present FCLI-LSM,a new optimization design for key-value storage systems.FCLI-LSM improves the performance of LSM-Tree-based key-value storage by combining the optimization methods of fine-grained key-value pair caching and learned index.By analyzing data access hotspots.FCLI-LSM implements three-level classification of hot,warm,and cold key-value data.FCLI-LSM designs a fine-grained caching mechanism for hot data based on key-value separation,effectively reducing the additional I/O overhead caused by the read amplification problem.In addition,FCLI-LSM also designs a cache affinity optimization for learned index,which further improves the query performance of storage system for warm key-value data.Experimental results show that,compared with existing read optimization solutions,FCLI-LSM can reduce average read latency by more than 40% and increase system throughput by more than 1.7 times.

Key words: Big data, Key-Value storage, LSM tree, Learned index, Caching

中图分类号: 

  • TP311
[1]REINSEL D,GANTZ J,RYDNING J.Data age 2025:the digitization of the world from edge to core[J/OL].https://www.seagate.com/files/www-content/our-story/trends/files/idc-sea-gate-dataage-whitepaper.pdf.
[2]O'NEIL P,CHENG E,GAWLICK D,et al.The log-structured merge-tree(LSM-tree)[J].Acta Informatica,1996,33:351-385.
[3]GHEMAWAT S,DEAN J.LevelDB[EB/OL].https://git-hub.com/google/leveldb.
[4]FACEBOOK.RocksDB[EB/OL].https://github.com/facebook/rocksdb.
[5]LU L,PILLAI T S,GOPALAKRISHNAN H,et al.Wisckey:Separating keys from values in ssd-conscious storage[J].ACM Transactions on Storage(TOS),2017,13(1):1-28.
[6]KRASKA T,BEUTEL A,CHI E H,et al.The case for learned index structures[C]//Proceedings of the 2018 International Conference on Management of Data.2018:489-504.
[7]DAI Y,XU Y,GANESAN A,et al.From {WiscKey} to Bourbon:A Learned Index for {Log-Structured} Merge Trees[C]//14th USENIX Symposium on Operating Systems Design and Implementation(OSDI 20).2020:155-171.
[8]ZHANG J,WANG F,DONG C.HaLSM:A Hotspot-awareLSM-tree based Key-Value Storage Engine[C]//2022 IEEE 40th International Conference on Computer Design(ICCD).IEEE,2022:179-186.
[9]CAO Z,DONG S,VEMURI S,et al.Characterizing,modeling,and benchmarking {RocksDB}{Key-Value} workloads at facebook[C]//18th USENIX Conference on File and Storage Technologies(FAST 20).2020:209-223.
[10]ATIKOGLU B,XU Y,FRACHTENBERG E,et al.Workloadanalysis of a large-scale key-value store[C]//Proceedings of the 12th ACM Sigmetrics/Performance Joint International Confe-rence on Measurement and Modeling of Computer Systems.2012:53-64.
[11]COOPER B F,SILBERSTEIN A,TAM E,et al.Benchmarking cloud serving systems with YCSB[C]//Proceedings of the 1st ACM Symposium on Cloud Computing.2010:143-154.
[12]COMER D.Ubiquitous B-tree[J].ACM Computing Surveys,1979,11(2):121-137.
[13]SIVASUBRAMANIAN S.Amazon dynamoDB:a seamlesslyscalable non-relational database service[C]//Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data.2012:729-730.
[14]CHANG F,DEAN J,GHEMAWAT S,et al.Bigtable:A distri-buted storage system for structured data[J].ACM Transactions on Computer Systems(TOCS),2008,26(2):1-26.
[15]LAKSHMAN A,MALIK P.Cassandra:a decentralized struc-tured storage system[J].ACM SIGOPS Operating Systems Review,2010,44(2):35-40.
[16]PUGH W.Skip lists:a probabilistic alternative to balanced trees[J].Communications of the ACM,1990,33(6):668-676.
[17]DGRAPH.BadgerDB[EB/OL].https://github.com/dgraph-io/badger.
[18]PINGCAP.Titan[EB/OL].https://github.com/tikv/titan.
[19]GALAKATOS A,MARKOVITCH M,BINNIG C,et al.Fiting-tree:A data-aware index structure[C]//Proceedings of the 2019 International Conference on Management of Data.2019:1189-1206.
[20]DING J,MINHAS U F,YU J,et al.ALEX:an updatable adaptive learned index[C]//Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data.2020:969-984.
[21]XIE Q,PANG C,ZHOU X,et al.Maximum error-boundedpiecewise linear representation for online stream approximation[J].The VLDB Journal,2014,23:915-937.
[22]OPENSTREETMAP.OpenStreetMap[EB/OL].https://www.openstreetmap.org/.
[23]TENG D,GUO L,LEE R,et al.LSbM-tree:Re-enabling buffer caching in data management for mixed reads and writes[C]//2017 IEEE 37th International Conference on Distributed Computing Systems(ICDCS).IEEE,2017:68-79.
[24]WU F,YANG M H,ZHANG B,et al.{AC-Key}:Adaptive Ca-ching for {LSM-based}{Key-Value} Stores[C]//2020 USENIX Annual Technical Conference(USENIX ATC 20).2020:603-615.
[25]YANG L,WU H,ZHANG T,et al.Leaper:A learned prefe-tcher for cache invalidation in LSM-tree based storage engines[J].Proceedings of the VLDB Endowment,2020,13(12):1976-1989.
[26]GALAKATOS A,MARKOVITCH M,BINNIG C,et al.Fiting-tree:A data-aware index structure[C]//Proceedings of the 2019 International Conference on Management of Da Data.2019:1189-1206.
[27]FERRAGINA P,VINCIGUERRA G.The PGM-index:a fully-dynamic compressed learned index with provable worst-case bounds[J].Proceedings of the VLDB Endowment,2020,13(8):1162-1175.
[28]KIPF A,MARCUS R,VAN RENEN A,et al.RadixSpline:asingle-pass learned index[C]//Proceedings of the Third International Workshop on Exploiting Artificial Intelligence Techniques for Data Management.2020:1-5.
[29]DAYAN N,ATHANASSOULIS M,IDREOS S.Monkey:Optimal navigable key-value store[C]//Proceedings of the 2017 ACM International Conference on Management of Da Data.2017:79-94.
[30]LI Y,TIAN C,GUO F,et al.{ElasticBF}:Elastic Bloom Filter with Hotness Awareness for Boosting Read Performance in Large {Key-Value} Stores[C]//2019 USENIX Annual Technical Conference(USENIX ATC 19).2019:739-752.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!