Computer Science, 2022, Vol. 49, Issue (2): 174-181. doi: 10.11896/jsjkx.210500076

• Database & Big Data & Data Science •

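Method of Domain Knowledge Graph Construction Based on Property Graph Model

LIANG Jing-ru (梁静茹), E Hai-hong (鄂海红), SONG Mei-na (宋美娜)

  1. School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876
  • Received: 2021-05-12 Revised: 2021-07-14 Online: 2022-02-15 Published: 2022-02-23
  • Corresponding author: E Hai-hong (ehaihong@bupt.edu.cn)
  • About the author: liangjingru@bupt.edu.cn
  • Supported by:
    National Key R&D Program of China (2018YFB1403501)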

Method of Domain Knowledge Graph Construction Based on Property Graph Model

LIANG Jing-ru, E Hai-hong, SONG Mei-na

  1. School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2021-05-12 Revised: 2021-07-14 Online: 2022-02-15 Published: 2022-02-23
  • About author: LIANG Jing-ru, born in 1997, postgraduate, is a student member of the China Computer Federation. Her main research interests include knowledge graphs and graph databases.
    E Hai-hong, born in 1982, Ph.D., associate professor, is a member of the China Computer Federation. Her main research interests include big data platforms, cloud computing and microservice architecture.
  • Supported by:
    National Key R&D Program of China (2018YFB1403501).

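Abstract (translated from the Chinese): With the arrival of the big data era, the number of relationships among the data that each industry must process is growing geometrically, creating an urgent need for a data model that can represent massive, complex data relationships, namely the domain knowledge graph. Although domain knowledge graphs have shown great potential, mature construction technologies and platforms are still lacking, and building a domain knowledge graph quickly remains an important challenge. After a systematic study of domain knowledge graphs, a construction method based on the property graph model is proposed. For structured and semi-structured data stored in a variety of original business databases, the method builds a high-quality, complete graph schema by agreeing on a data communication protocol for the graph database and by offering multiple configuration schemes for graph entity schemas and relation schemas. The instance data in the original databases is then extracted, transformed, and loaded into the property graph database HugeGraph, completing the construction of the domain knowledge graph. Finally, experiments on multiple datasets, together with tests of the knowledge graph data using Gremlin statements, verify that the proposed method is complete and reliable.

Keywords (translated from the Chinese): HugeGraph, Domain knowledge graph, Graph database, Knowledge graph construction, Property graph model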

Abstract: With the arrival of the big data era, the number of relationships that must be processed in various industries has increased exponentially, and there is an urgent need for a data model that can express massive, complex relationships, namely the domain knowledge graph. Although domain knowledge graphs have shown great potential, mature construction technologies and platforms are still lacking, and constructing a domain knowledge graph rapidly remains an important challenge. After a systematic study of domain knowledge graphs, a method is proposed for constructing them based on the property graph model. Concretely, for structured and semi-structured data stored in a variety of databases, the method builds a high-quality graph schema by means of a graph database data communication protocol and multiple configuration methods for entity and relation schemas. The data from the original databases is then extracted, transformed, and loaded into the property graph database HugeGraph, completing the construction of the domain knowledge graph. Finally, experiments on multiple datasets and tests with Gremlin statements show that the proposed method is complete and reliable.

Key words: Domain knowledge graph, Graph database, HugeGraph, Knowledge graph construction, Property graph model

CLC Number: TP392
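
The extract, transform, and load step summarized in the abstract can be pictured with a minimal sketch. It is not the authors' protocol and does not call the HugeGraph API: the `VertexRule` and `EdgeRule` configuration classes, the `build_graph` helper, and the medical tables are all hypothetical, and the result is only a set of in-memory dictionaries shaped like property-graph vertices and edges.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class VertexRule:
    """Hypothetical entity-schema rule: one table row becomes one vertex."""
    table: str
    label: str
    id_column: str
    property_columns: List[str]


@dataclass
class EdgeRule:
    """Hypothetical relation-schema rule: one join-table row becomes one edge."""
    table: str
    label: str
    source_label: str
    source_column: str
    target_label: str
    target_column: str


def build_graph(
    tables: Dict[str, List[dict]],
    vertex_rules: List[VertexRule],
    edge_rules: List[EdgeRule],
) -> Tuple[Dict[str, dict], List[dict]]:
    """Map relational rows to property-graph vertices and edges."""
    vertices: Dict[str, dict] = {}
    for rule in vertex_rules:
        for row in tables[rule.table]:
            vertex_id = f"{rule.label}:{row[rule.id_column]}"
            vertices[vertex_id] = {
                "label": rule.label,
                "properties": {col: row[col] for col in rule.property_columns},
            }

    edges: List[dict] = []
    for rule in edge_rules:
        for row in tables[rule.table]:
            source = f"{rule.source_label}:{row[rule.source_column]}"
            target = f"{rule.target_label}:{row[rule.target_column]}"
            # Skip rows that point at a vertex that was never extracted.
            if source in vertices and target in vertices:
                edges.append({"label": rule.label, "source": source, "target": target})
    return vertices, edges


# Invented sample data standing in for an original business database.
tables = {
    "disease": [{"id": 1, "name": "influenza"}, {"id": 2, "name": "asthma"}],
    "symptom": [{"id": 10, "name": "fever"}],
    "disease_symptom": [
        {"disease_id": 1, "symptom_id": 10},
        {"disease_id": 3, "symptom_id": 10},  # dangling reference, dropped
    ],
}
vertex_rules = [
    VertexRule("disease", "Disease", "id", ["name"]),
    VertexRule("symptom", "Symptom", "id", ["name"]),
]
edge_rules = [
    EdgeRule("disease_symptom", "has_symptom", "Disease", "disease_id", "Symptom", "symptom_id"),
]

vertices, edges = build_graph(tables, vertex_rules, edge_rules)
```

Keeping the schema rules separate from the loading loop mirrors the abstract's split between schema configuration and instance-data loading; a real pipeline would hand the resulting vertices and edges to the graph database's own loader.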