Computer Science ›› 2021, Vol. 48 ›› Issue (10): 140-144. doi: 10.11896/jsjkx.201100073

• Database & Big Data & Data Science •

Conversion Method from Relational Database to Graph Database

E Hai-hong, HAN Peng-hao, SONG Mei-na

  1. School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2020-11-09 Revised: 2021-01-04 Online: 2021-10-15 Published: 2021-10-18
  • Corresponding author: E Hai-hong (ehaihong@bupt.edu.cn)
  • About author: E Hai-hong, born in 1982, Ph.D, associate professor, is a member of China Computer Federation. Her main research interests include big data platform, cloud computing and microservice architecture.
  • Supported by:
    National Key Research and Development Program of China (2018YFB1403501).

Abstract: Relational databases and graph databases store data in inherently different ways, so moving data from a relational database into a graph database requires solving three main problems: defining edges, guaranteeing vertex uniqueness, and retaining the constraint information of the original database. To address these problems, a method for converting a relational database to a graph database is proposed. First, vertex uniqueness is ensured by using either a custom primary key or the existing one, combined with the uniqueness of the table name, and different configuration schemes retain as much of the original relational database's constraint information as possible. Then, the Edge Definition Method based on Configuration and Intermediate Table (EDCIT) is proposed, which provides mapping schemes for different kinds of relationships across multiple types of databases and so solves the problem of defining edges during conversion. Finally, experiments on multiple data sets, with Gremlin statements used to test the converted data, verify the integrity and reliability of the converted data.

Key words: Cross-database data exchange, Graph database, Gremlin, Hugegraph, Relational database
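
The abstract names two ideas that a small example may make more concrete: vertex IDs built from the table name plus the primary key, and edges derived from configuration and from intermediate (join) tables. The following Python sketch illustrates those ideas only under assumptions. The configuration keys (`vertex_tables`, `foreign_key_edges`, `intermediate_edges`), the `table:pk` ID format, and the sample tables are all hypothetical and are not taken from the paper or from HugeGraph.

```python
from typing import Any

Row = dict[str, Any]


def vertex_id(table: str, pk_value: Any) -> str:
    """Prefix the primary-key value with the table name (assumed ID format)."""
    return f"{table}:{pk_value}"


def convert(tables: dict[str, list[Row]], config: dict[str, Any]):
    """Turn in-memory relational rows into vertices and (source, label, target) edges."""
    vertices: dict[str, dict[str, Any]] = {}
    edges: list[tuple[str, str, str]] = []

    # 1. Each row of a configured vertex table becomes one vertex.
    for table, spec in config["vertex_tables"].items():
        pk = spec["primary_key"]
        for row in tables[table]:
            vertices[vertex_id(table, row[pk])] = {
                "label": table,
                "properties": dict(row),
            }

    # 2. A foreign-key column becomes an edge from the row's vertex
    #    to the referenced vertex; NULL references produce no edge.
    for fk in config["foreign_key_edges"]:
        src_pk = config["vertex_tables"][fk["table"]]["primary_key"]
        for row in tables[fk["table"]]:
            ref = row.get(fk["column"])
            if ref is None:
                continue
            edges.append((
                vertex_id(fk["table"], row[src_pk]),
                fk["label"],
                vertex_id(fk["ref_table"], ref),
            ))

    # 3. Each row of an intermediate (join) table becomes an edge,
    #    not a vertex.
    for it in config["intermediate_edges"]:
        for row in tables[it["table"]]:
            edges.append((
                vertex_id(it["source_table"], row[it["source_column"]]),
                it["label"],
                vertex_id(it["target_table"], row[it["target_column"]]),
            ))

    return vertices, edges


# Hypothetical sample data: person 1 and course 1 share the same key value.
TABLES = {
    "person": [
        {"id": 1, "name": "Ann", "city_id": 10},
        {"id": 2, "name": "Bob", "city_id": None},
    ],
    "city": [{"id": 10, "name": "Beijing"}],
    "course": [{"id": 1, "title": "Databases"}],
    "enrollment": [
        {"person_id": 1, "course_id": 1},
        {"person_id": 2, "course_id": 1},
    ],
}

CONFIG = {
    "vertex_tables": {
        "person": {"primary_key": "id"},
        "city": {"primary_key": "id"},
        "course": {"primary_key": "id"},
    },
    "foreign_key_edges": [
        {"table": "person", "column": "city_id",
         "ref_table": "city", "label": "lives_in"},
    ],
    "intermediate_edges": [
        {"table": "enrollment", "label": "enrolled_in",
         "source_table": "person", "source_column": "person_id",
         "target_table": "course", "target_column": "course_id"},
    ],
}

vertices, edges = convert(TABLES, CONFIG)
```

In this sample, the `person` row with key 1 and the `course` row with key 1 stay distinct vertices because the table name is part of the ID. Keeping the original row as vertex properties is one simple way to carry column values across. How the paper preserves constraint information is not detailed in the abstract, so the sketch does not attempt it.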

CLC Number:

  • TP392