计算机科学 ›› 2023, Vol. 50 ›› Issue (8): 1-15.doi: 10.11896/jsjkx.220900178

• 数据库&大数据&数据科学 • 上一篇    下一篇

基于LSM树的键值存储系统技术研究综述

吕萌, 华文镝, 谢平   

  1. 1 青海师范大学计算机学院 西宁 810016
    2 青海师范大学网络信息管理中心 西宁 810016
    3 青海省物联网重点实验室 西宁 810008
    4 省部共建藏语智能信息处理及应用国家重点实验室 西宁 810008
    5 高原科学与可持续发展研究院 西宁 810016
  • 收稿日期:2022-09-19 修回日期:2023-02-26 出版日期:2023-08-15 发布日期:2023-08-02
  • 通讯作者: 谢平(xieping@qhnu.edu.cn)
  • 作者简介:(2256526997@qq.com)
  • 基金资助:
    国家自然科学基金(61762075);青海省科技厅重点研发与转化计划项目(2021-GX-112)

Key Value Storage Technology Based on LSM-tree:A Survey

LYU Meng, HUA Wendi, XIE Ping   

  1. 1 School of Computer Science,Qinghai Normal University,Xining 810016,China
    2 Network Information Management Center,Qinghai Normal University,Xining 810016,China
    3 Key Laboratory of Internet of Things of Qinghai Province,Xining 810008,China
    4 State Key Laboratory of Tibetan Intelligent Information Processing and Application,Xining 810008,China
    5 Academy of Plateau Science and Sustainability,Xining 810016,China
  • Received:2022-09-19 Revised:2023-02-26 Online:2023-08-15 Published:2023-08-02
  • About author:LYU Meng,born in 1998,postgra-duate,is a member of China Computer Federation.Her main research interests include network storage and so on.
    XIE Ping,born in 1979.Ph.D,professor,is a member of China Computer Federation.His main research interests include distributed file system,network storage,fault-tolerant storage and new storage device.
  • Supported by:
    National Natural Science Foundation of China(61762075) and Key R & D and Transformation Project of Qinghai Province(2021-GX-112).

摘要: 键值存储是数据库最简单的组织形式。在数据密集型的应用场景中,键值存储系统发挥着关键的作用。随着对及时数据分析需求的增加,良好的系统性能变得越来越重要。目前大多数键值存储系统的存储引擎都是日志结构合并树(Log-Structured Merge Tree,LSM树)。因具有卓越的写性能,LSM树被广泛应用于写密集型的场景和现代NoSQL系统的存储层。与传统的B树相比,LSM树采用顺序写入的访问模式,并使用内存缓冲区来批处理新的写入线程,因此LSM树具有更大的写优势。然而,数据的重复读写和不必要的压缩操作导致了LSM树的读写放大问题,从而严重影响了系统的性能,尤其在数据密集型的应用场景。如今,研究人员做了大量工作来缓解这些问题,文中研究了影响LSM树性能的各个因素,搜集了大量提升基于LSM树的键值系统性能的文献,并对其加以整理和分类,讨论它们的优势和权衡,使读者可以了解基于LSM树的存储技术及其优化策略,最后调查了几个具有代表性的基于LSM树的键值存储技术并讨论了潜在的未来研究方向。

关键词: LSM树, NoSQL, 存储管理, 键值系统, 数据检索

Abstract: Key-value storage is the simplest form of database organization and it plays a key role in data-intensive application scenarios.With the increasing demand for timely data analysis,good system performance becomes more and more important.At present,the storage engine of most key-value storage systems is the log-structured merge tree(LSM-tree).Because of its excellent write performance,the LSM-tree is widely used in the write intensive scenes and the storage layer of modern NoSQL system.Compared to the traditional B-tree,LSM-tree adopts sequential write access mode.At the same time,it uses memory buffer to batch new write threads,so it has greater write advantages.Nevertheless,repeated reading and writing of data and unnecessary compression operations lead to the problem of read and write amplification of the LSM-tree.Finally,these problems seriously affect the performance of the system,especially in the data-intensive application scenarios.Nowadays,researches have make great efforts to solve the problems.Firstly,this paper investigates various factors that affect the performance of the LSM-tree,collects a lot of literature on improving the performance of LSM tree-based key-value systems,organizes and categorizes them.Then it discusses their advantages and tradeoffs to enable readers to understand LSM tree-based storage technologies and their optimization strategies.Finally,several representative LSM tree-based key-value storage technologies are surveyed and some potential future research directions are discussed.

Key words: LSM-tree, NoSQL, Storage management, Key-value system, Data retrieval

中图分类号: 

  • TP393
[1]DECANDIA G,HASTORUN D,JAMPANI M,et al.Dynamo:amazon's highly available key-value store[C]//Symposium on Operating Systems Principles.2007:205-220.
[2]LAKSHMAN A,MALIK P.CASSANDRA-A DecentralizedStructured Storage System[J].Operating Systems Review,2010,44(2):35-40.
[3]CHODOROW K,DIROLF M.MongoDB-The Definitive Gui-de:Powerful and Scalable Data Storage[M]//DBLP.2010.
[4]O'NEIL P,CHENG E,GAWLICK D,et al.The log-structured merge-tree(LSM-tree)[J].Acta Informatica,1996,33(4):351-385.
[5]HUA W D,GAO Y,LYU M,et al.Research on Bloom Filter:a survey [J].Journal of Computer,2022,42(6):1729-1747.
[6]DEBNATH B,HAGHDOOST A,KADAV A,et al.Revisiting hash table design for phase change memory[J].ACM SIGOPS Operating Systems Review,2015,49(2):18-26.
[7]LI J Y,YUE Y L,WANG W P:GHStore:A High Performance Global Hash Based Key-Value Store[C]//27th DASFAA 2022:Virtual Event-Part I.2022:493-508.
[8]KE Z M,LI Y Z,CHANG D W.Dual-KV:Improving Perfor-mance of Key-value Caches on Multilevel Cell Non-volatile Memory[C]//50th International Conference on Parallel Processing Workshop.2021:1-9.
[9]XIA F,JIANG D J,XIONG J.Analysis of factors affecting the performance of nonvolatile memory system [J].Computer Research and Development,2014(S1):25-31.
[10]DAYAN N,IDREOS S.Dostoevsky:Better space-time trade-offs for LSM-tree based key-value stores via adaptive removal of superfluous merging[C]//Proceedings of the 2018 International Conference on Management of Data.2018:505-520.
[11]PUGH W.Skip lists:a probabilistic alternative to balanced trees[J].Communications of the ACM,1990,33(6):668-676.
[12]CHEN L,CAREY M J.LSM-based storage techniques:a survey[J].The VLDB Journal,2020,29(1):393-418.
[13]AMUR H,ANDERSEN D G,KAMINSKY M,et al.Design of a Write-Optimized Data Store[J/OL].https://repository.gatech.edu/server/api/core/bitstreams/79429d60-38b7-4885-9a88-c8211c7acba8/content.
[14]TING Y,WAN J G,PING H,et al.Building Efficient Key-Value Stores via a Lightweight Compaction Tree[J].ACM Transactions on Storage,2017,13(4):29:1-29:28.
[15]RAJU P,KADEKODI R,CHIDAMBARAM V,et al.Pebblesdb:Building key-value stores using fragmented log-structured merge trees[C]//Proceedings of the 26th Symposium on Ope-rating Systems Principles.2017:497-514.
[16]BLOOM B H.Space/time trade-offs in hash coding with allowable errors[J].Communications of ACM,1970,13(7):422-426.
[17]PAN F F,YUE Y L,XIONG J.dCompaction:Speeding up compaction of the lsm-tree via delayed compaction[J].Journal of Computer Science and Technology,2017,32(1):41-54.
[18]MEI F,CAO Q,JIANG H,et al.SifrDB:A Unified Solution for Write-Optimized Key-Value Stores in Large Datacenter[C]//ACM Symposium on Cloud Computing.2018:477-489.
[19]YUE Y L,HE B S,LI Y Z,et al.Building an Efficient Put-Intensive Key-Value Store with Skip-Tree[J].IEEE Transactions on Parallel and Distributed Systems,2017,28(4):961-973.
[20]BALMAU O,DIDONA D,GUERRAOUI R,et al.TRIAD:Crea-ting Synergies Between Memory,Disk and Log in Log Structured Key-Value Stores[C]//2017 USENIX Annual Technical Conference(USENIX ATC 17).2017:363-375.
[21]SEARS R,RAMAKRISHNAN R.bLSM:A general purpose log structured merge tree[C]//Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data.2012:217-228.
[22]TENG D,GUO L,LEE R,et al.A Low-cost Disk Solution Enabling LSM-tree to Achieve High Performance for Mixed Read/Write Workloads[J].ACM Transactions on Storage,2018,14(2):15:1-15:26.
[23]PRADEEP S,RICHARD P.SPILLANE,RAVIKANT M,et al.Building workload-independent storage with VT-trees[C]//FAST.2013:17-30.
[24]BENDER M A,FARACH-C M,JOHNSON R,et al.Don'tThrash:How to Cache Your Hash on Flash[C]//Proceedings of the VLDB Endowment.2012:1627-1637.
[25]VARDOULAKISM,SALOUSTROS G,PILAR G F,et al.Tebis:index shipping for efficient replication in LSM key-value stores[C]//EuroSys.2022:85-98.
[26]CHAI Y P,CHAI Y F,WANG X,et al.Adaptive Lower-Level Driven Compaction to Optimize LSM-Tree Key-Value Stores[J].IEEE Transactions on Knowledge and Data Engineering,2022,34(6):2595-2609.
[27]YANG L,WU H,ZHANG T,et al.Leaper:A learned prefe-tcher for cache invalidation in LSM-tree based storage engines[C]//Proceedings of the VLDB Endowment.2020:1976-1989.
[28]CHANDRAMOULI B,PRASAAD G,KOSSMANN D,et al.FASTER:A concurrent key-value store with in-place updates[C]//The 2018 International Conference.Houston:ACM,2018:275-290.
[29]LU L Y,PILLAI T S,GOPALAKRISHNAN H,et al.Wisc-Key:Separating Keys from Values in SSD-Conscious Storage[J].ACM Trans. on Storage,2017,13(1):5:1-5:28.
[30]LI Y,CHAN H H W,LEE P P C,et al.Enabling Efficient Updates in KV Storage via Hashing:Design and Performance Eva-luation[J].ACM Transactions on Storage,2019,15(3):20:1-20:29.
[31]HSIEH J W,KUO T W,CHANG L P.Efficient identification ofhot data for flash memory storage systems[J].ACM Transactions on Storage,2006,2(1):22-40.
[32]WANG P,SUN G Y,JIANG S,et al.An efficient design and implementation of LSM-tree based key-value store on open-channel SSD[C]//European Conference on Computer Systems.2014.16:1-16:14.
[33]CHEN H,RUAN C Y,LI C,et al.SpanDB:A Fast,Cost-Effective LSM-tree Based KV Store on Hybrid Storage[C]//USENIX Conference on File and Storage Technologies.2021:17-32.
[34]WANG Y Y,WEI H C,CHAI Y P.Performance optimization of LSM tree key value storage system based on ssd-smr hybrid storage [J].Computer Science,2018,45(7):61-65.
[35]WU H Y.Research on persistent key value storage systembased on dram-nvm hybrid memory [D].Wuhan:Huazhong University of Science and Technology,2019.
[36]ROB D.pmemKV[OL].[2022-05-27].https://github.com/pmem/pmemkv.
[37]ISMAIL O,JOHAN L,ANISOARA N,et al.FPTree:A Hybrid SCM-DRAM Persistent and Concurrent B-Tree for Storage Class Memory[C]//ACM SIGMOD Conference.2016:371-386.
[38]CHEN S,JIN Q.Persistent b+-trees in non-volatile main me-mory[C]//Proceedings of the VLDB Endowment.2015:786-797.
[39]KANNAN S,BHAT N,GAVRILOVSKA A,et al.Redesigning LSMs for Nonvolatile Memory with NoveLSM[C]//USENIX Annual Technical Conference.2018:993-1005.
[40]KAIYRAKHMET O,LEE S,NAM B,et al.SLM-DB:single-level key-value store with persistent memory[C]//2019 Confe-rence on File and Storage Technologies.2019:191-205.
[41]ZHAN L,ZHANG Y,YU K.Design and Implementation ofSCM and SSD based Hybrid Key-Value Store[C]//Proceedings of the 2019 International Conference on Artificial Intelligence and Computer Science.2019:566-572.
[42]YAO T,ZHANG Y,WAN J,et al.MatrixKV:Reducing Write Stalls and Write Amplification in LSM-tree Based KV Stores with Matrix Container in NVM[C]//2020 USENIX Annual Technical Conference (USENIX ATC 20).2020:17-31.
[43]WU X B,XU Y H,SHAO Z L,et al.LSM-trie:An LSM-tree-based Ultra-Large Key-Value Store for Small Data Items[C]//USENIX Annual Technical Conference.2015:71-82.
[44]MEI F.Research on optimization method of large-scale key va-lue storage system based on log structure merging tree [D].Wuhan:Huazhong University of Science and Technology,2019.
[45]REN K,ZHENG Q,ARULRAJ J,et al.SlimDB:A space-effi-cient key-value storage engine for semi-sorted data[C]//Proceedings of the VLDB Endowment.2017:2037-2048.
[46]FAN B,ANDERSEN D G,KAMINSKY M,et al.Cuckoo filter:Practically better than bloom[C]//Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies.2014:75-88.
[47]SARKAR S,PAPON T I,STARATZIS D,et al.Lethe:A tunable delete-aware LSM engine[C]//Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data.2020:893-908.
[48]CHANG F.Bigtable:A Distributed Storage System for Struc-tured Data[C]//7th USENIX Symposium on Operating Systems Design and Implementation(OSDI).2006.
[49]CAULFIELD A M,MOLLOV T I,EISNER L A,et al.Providing Safe,User Space Access to Fast,Solid State Disks[C]//International Conference on Architectural Support for Programming Languages and Operating Systems.2012:387-400.
[50]ZHU Q.Research on hybrid storage system [D].Shanghai:Shanghai Jiaotong University,2012.
[51]OUSTERHOUT J K,AGRAWAL P,ERICKSON D,et al.The case for RAMClouds:Scalable high-performance storage entirely in DRAM[J].ACM SIGOPS Operating Systems Review,2009,43(4):92-105.
[52]LIM H,FAN B,ANDERSEN D G,et al.SILT:A memory-efficient,high-performance key-value store[C]//Proceedings of the 23rd ACM Symposium on Operating Systems Principles.2011.
[53]YANG J,YUE Y,VINAYAK R.Segcache:a memory-efficient and scalable in-memory key-value cache for small objects[C]//18th USENIX Symposium on Networked Systems Design and Implementation(NSDI 21).2021:503-518.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!