Computer Science ›› 2025, Vol. 52 ›› Issue (1): 151-159.doi: 10.11896/jsjkx.231200073

• Database & Big Data & Data Science • Previous Articles     Next Articles

Application-aware Disaggregated Storage Design for Remote Memory Graph Database

LI Chunyu1, DENG Long1, LI Yongkun1,2, XU Yinlong1,2   

  1. 1 School of Computer Science and Technology,University of Science and Technology of China,Hefei 230026,China
    2 Anhui Province Key Laboratory of High Performance Computing,Hefei 230026,China
  • Received:2023-12-11 Revised:2024-05-13 Online:2025-01-15 Published:2025-01-09
  • About author:LI Chunyu,born in 1999,postgraduate.Her main research interests include graph database and disaggregated me-mory architecture.
    LI Yongkun,born in 1986,professor,is a member of CCF(No.33184M).His main research interests include I/O optimization for data-intensive computing and storage systems,parallel and distributed systems,key-value systems,and cloud computing,etc.
  • Supported by:
    National Natural Science Foundation of China(62172382).

Abstract: Graph data is becoming more widespread across various applications due to its features to represent multiple entity types and rich associated relationships.For users of graph databases,efficient graph query service is crucial to ensure system performance.As the amount of data grows,the single-machine graph database is difficult to meet the demand of storing all the data in memory,while distributed graph databases,which spread the data across the memory of multiple machines,face challenges in scalability and resource utilization rate.A new solution to these challenges is the introduction of RDMA-based remote memory systems.These systems separate the tasks of computing and storing graph data,offering a more flexible way to use memory.Yet,a big challenge in current solutions is how to ensure the performance of graph query when using remote memory.This study takes a close look at the challenges that come up when building remote memory graph databases on general-purpose far memory platforms,which use remote memory transparently.It suggests a new approach,where the design of the remote memory graph database is aware of how it’s being used.This design method creates a storage model that understands the different types of property graph data and how it’s accessed.Specifically,the study explains how to make sure the data is arranged and accessed in the best way.Experimental results show that when the local memory is limited,the awareness method outperforms the transparent approach,giving a 12x improvement in graph query performance.

Key words: Graph query, Graph database, Graph storage, Remote memory, Property graph model

CLC Number: 

  • TP311
[1]Neo4J,Inc.Neo4J[OL].2007[2023-11-21].https://neo4j.com/.
[2]JanusGraph.JanusGraph[OL].2017[2023-11-21].https://janusgraph.org/
[3]DEUTSCH A,XU Y,WU M,et al.Tigergraph:A native MPP graph database[J].arXiv:1901.08248,2019.
[4]WU M,YI X,YU H,et al.Nebula Graph:An open source distributed graph database[J].arXiv:2206.07278,2022.
[5]LIU Q,LI Y,DUAN H,et al.Knowledge Graph Construction Techniques[J].Journal of Computer Research and Development,2016,53(3):582-600.
[6]SU L,QIN X,ZHANG Z,et al.Banyan:a scoped dataflow engine for graph query service[J].arXiv:2202.12530,2022.
[7]SAHU S,MHEDHBI A,SALIHOGLU S,et al.The ubiquity of large graphs and surprising challenges of graph processing:extended survey[J].The VLDB journal,2020,29:595-618.
[8]CHEN H,WU B,DENG S,et al.High performance distributed OLAP on property graphs with grasper[C]//Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data.2020:2705-2708.
[9]BURAGOHAIN C,RISVIK K M,BRETT P,et al.A1:A distributed in-memory graph database[C]//Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data.2020:329-344.
[10]EWAIS M,CHOW P.Disaggregated Memory in the Datacenter:A Survey[J].IEEE Access,2023,11:20688-20712.
[11]WANG C,QIAO Y,MA H,et al.Canvas:Isolated and Adaptive Swapping for {Multi-Applications} on Remote Memory[C]//20th USENIX Symposium on Networked Systems Design and Implementation(NSDI 23).2023:161-179.
[12]RUAN Z,SCHWARZKOPF M,AGUILERA M K,et al.{AIFM}:{High-Performance},{Application-Integrated} far memory[C]//14th USENIX Symposium on Operating Systems Design and Implementation(OSDI 20).2020:315-332.
[13]QIAO Y,WANG C,RUAN Z,et al.Hermit:{Low-Latency},{High-Throughput},and Transparent Remote Memory via {Feedback-Directed} Asynchrony[C]//20th USENIX Symposium on Networked Systems Design and Implementation(NSDI 23).2023:181-198.
[14]AMARO E,BRANNER-AUGMON C,LUO Z,et al.Can farmemory improve job throughput?[C]//Proceedings of the Fifteenth European Conference on Computer Systems.2020:1-16.
[15]GU J,LEE Y,ZHANG Y,et al.Efficient memory disaggregation with infiniswap[C]//14th USENIX Symposium on Networked Systems Design and Implementation(NSDI 17).2017:649-667.
[16]ANGLES R.The Property Graph Database Model[C]//AMW.2018.
[17]CHEN H,LI C,FANG J,et al.Grasper:A high performance distributed system for OLAP on property graphs[C]//Procee-dings of the ACM Symposium on Cloud Computing.2019:87-100.
[18]BESTA M,GERSTENBERGER R,PETER E,et al.Demystifying graph databases:Analysis and taxonomy of data organization,system designs,and graph queries[J].ACM Computing Surveys,2023,56(2):1-40.
[19]ANGLES R,ANTAL J B,AVERBUCH A,et al.The LDBC social network benchmark[J].arXiv:2001.02299,2020.
[20]AGUILERA M K,KEETON K,NOVAKOVIC S,et al.Designing far memory data structures:Think outside the box[C]//Proceedings of the Workshop on Hot Topics in Operating Systems.2019:120-126.
[21]LI C,CHEN H,ZHANG S,et al.ByteGraph:a high-perfor-mance distributed graph database in ByteDance[J].Proceedings of the VLDB Endowment,2022,15(12):3306-3318.
[22]CHEN H,LI C,ZHENG C,et al.G-tran:a high performance distributed graph database with a decentralized architecture[J].Proceedings of the VLDB Endowment,2022,15(11):2545-2558.
[23]WEI X,SHI J,CHEN Y,et al.Fast in-memory transaction processing using RDMA and HTM[C]//Proceedings of the 25th Symposium on Operating Systems Principles.2015:87-104.
[24]RODRIGUEZ M A.The gremlin graph traversal machine and language(invited talk)[C]//Proceedings of the 15th Symposium on Database Programming Languages.2015:1-10.
[25]KALIA A,KAMINSKY M,ANDERSEN D G.Design guide-lines for high performance {RDMA} systems[C]//2016 USENIX Annual Technical Conference(USENIX ATC 16).2016:437-450.
[26]DRAGOJEVIĆ A,NARAYANAN D,CASTRO M,et al.{FaRM}:Fast remote memory[C]//11th USENIX Symposium on Networked Systems Design and Implementation(NSDI 14).2014:401-414.
[27]BESTA M,GERSTENBERGER R,FISCHER M,et al.The Graph Database Interface:Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thousands of Cores[C]//Proceedings of the International Conference for High Performance Computing,Networking,Storage and Analysis.2023:1-18.
[28]WANG J,LI C,WANG T,et al.Excavating the potential of graph workload on rdma-based far memory architecture[C]//2022 IEEE International Parallel and Distributed Processing Symposium(IPDPS).IEEE,2022:1029-1039.
[29]WANG J,LI C,LIU Y,et al.Fargraph+:Excavating the paral-lelism of graph processing workload on RDMA-based far memory system[J].Journal of Parallel and Distributed Computing,2023,177:144-159.
[30]ZAHKA D,GAVRILOVSKA A.FAM-Graph:Graph analytics on disaggregated memory[C]//2022 IEEE International Parallel and Distributed Processing Symposium(IPDPS).IEEE,2022:81-92.
[1] MAO Zhixiong, LIU Zhinan, GAO Xuning, WANG Mengxiang, GONG Shufeng, ZHANG Yanfeng. Power-PCSR:An Efficient Dynamic Graph Storage Structure for Power-law Graphs [J]. Computer Science, 2024, 51(8): 56-62.
[2] ZOU Chunling, ZHU Zhengzhou. Fusion Model of Housekeeping Service Course Recommendation Based on Knowledge Graph [J]. Computer Science, 2024, 51(2): 47-54.
[3] LIANG Jing-ru, E Hai-hong, Song Mei-na. Method of Domain Knowledge Graph Construction Based on Property Graph Model [J]. Computer Science, 2022, 49(2): 174-181.
[4] HUANG Mei-gen, LIU Chuan, DU Huan, LIU Jia-le. Research on Cognitive Diagnosis Model Based on Knowledge Graph and Its Application in Teaching Assistant [J]. Computer Science, 2021, 48(6A): 644-648.
[5] E Hai-hong, HAN Peng-hao, SONG Mei-na. Conversion Method from Relational Database to Graph Database [J]. Computer Science, 2021, 48(10): 140-144.
[6] XU Wen, SONG Wen-ai, FU Li-zhen, LV Wei. Distributed Subgraph Matching Algorithm for Large Scale Graph Data [J]. Computer Science, 2019, 46(4): 28-35.
[7] ZHAO Ping, SHOU Li-dan, CHEN Ke, CHEN Gang, WU Xiao-fan. Storage and Query Model for Localized Search on Temporal Graph Data [J]. Computer Science, 2019, 46(10): 186-194.
[8] JIANG Ren-he, ZHENG Xiao-mei, ZHU Xiao-qian, PAN Min-xue and ZHANG Tian. Method of Java Code Repository Construction Based on UML Relationship [J]. Computer Science, 2017, 44(11): 69-79.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!