Computer Science ›› 2023, Vol. 50 ›› Issue (1): 18-24. doi: 10.11896/jsjkx.220500205

• Database & Big Data & Data Science •

Ontology-Schema Mapping Based Incremental Entity Model Construction and Evolution Approach of Knowledge Graph

SHAN Zhongyuan1,2, YANG Kai1,2, ZHAO Junfeng1,2,3, WANG Yasha1,2,3, XU Yongxin1,2

  1 School of Computer Science, Peking University, Beijing 100871, China
  2 Key Laboratory of High Confidence Software Technologies, Ministry of Education, Beijing 100871, China
  3 Peking University Information Technology Institute (Binhai, Tianjin), Tianjin 300450, China
  • Received: 2021-10-22 Revised: 2022-05-16 Online: 2023-01-15 Published: 2023-01-09
  • Corresponding author: ZHAO Junfeng (zhaojf@pku.edu.cn)
  • About author: 1901213329@pku.edu.cn
    SHAN Zhongyuan, born in 1997, postgraduate. His main research interests include knowledge graphs.
    ZHAO Junfeng, born in 1974, Ph.D., research professor, is a member of the China Computer Federation. Her main research interests include big data analysis, knowledge graphs, and urban computing.
  • Supported by:
    National Natural Science Foundation of China (62172011).

Abstract: In the smart city domain, as information technology spreads, information systems generate ever-growing volumes of data, and semantic interoperability among these multi-source heterogeneous data has become one of the key problems in developing urban intelligent applications. Building a knowledge graph is a common way to achieve such interoperability. Once the ontology model of a knowledge graph is established, constructing and evolving its instance (entity) model becomes the key technology supporting graph-based applications, so the primary problem of graph construction is how to extend the knowledge graph, as automatically as possible, with knowledge instances from continuously updated data sources. Some existing instance generation tools offer insufficient support for data import: users must carry out complex preprocessing to convert source data into a format the platform can import. The preprocessing workload is therefore heavy, and the tools cannot respond quickly as data keep being updated and growing. Because most data produced by information systems in the smart city domain are structured or semi-structured, this paper proposes an incremental approach to constructing and evolving the instance model of a knowledge graph, based on mapping between the ontology model and data schemas. The approach generates instances from structured or semi-structured data and lets the instance model grow and evolve as the data are updated. Combining machine recommendation with human-machine collaborative interaction, it extracts knowledge according to the characteristics of each data source and maps it correctly to the concept entities in the ontology model, achieving incremental expansion of the instance model of a domain knowledge graph. Entity alignment and relation completion support the continuous evolution of the instance model. The approach is verified in a knowledge graph construction scenario in the enterprise information domain, where machine recommendation and prohibiting duplicate checking yield efficient and accurate instance generation, which demonstrates the effectiveness of the approach.

Key words: Knowledge graph, Ontology model, Data schema, Human-machine interaction

CLC Number: 

  • TP311
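
As a minimal sketch of the kind of schema-to-ontology mapping the abstract describes, the Python example below ingests structured rows into a toy instance model in successive batches. It is an illustration under stated assumptions, not the paper's implementation: the `InstanceGraph` class, the `ingest` function, the mapping layout, and the sample columns are all hypothetical, and a simple key-based merge stands in for entity alignment.

```python
from dataclasses import dataclass, field


@dataclass
class InstanceGraph:
    """Toy instance model: entities keyed by (concept, key value)."""
    entities: dict = field(default_factory=dict)

    def upsert(self, concept, key, properties):
        """Insert a new entity, or merge properties into an existing one."""
        node = self.entities.setdefault((concept, key), {})
        # Skip missing values so a later batch never erases known properties.
        node.update({k: v for k, v in properties.items() if v is not None})
        return node


def ingest(graph, rows, mapping):
    """Map each source row onto an ontology concept and add it incrementally.

    mapping (hypothetical layout):
      {"concept": ontology concept name,
       "key": source column identifying the entity,
       "properties": {source column: ontology property}}
    """
    for row in rows:
        props = {prop: row.get(col) for col, prop in mapping["properties"].items()}
        graph.upsert(mapping["concept"], row[mapping["key"]], props)


# Hypothetical enterprise-domain mapping, e.g. recommended by a machine
# and then confirmed by a user.
mapping = {
    "concept": "Company",
    "key": "reg_no",
    "properties": {"corp_name": "name", "addr": "address"},
}

graph = InstanceGraph()
# First batch: the address is not yet known.
ingest(graph, [{"reg_no": "A1", "corp_name": "Acme", "addr": None}], mapping)
# Later batch: A1 is completed in place, B2 is added as a new entity.
ingest(
    graph,
    [
        {"reg_no": "A1", "corp_name": "Acme", "addr": "Beijing"},
        {"reg_no": "B2", "corp_name": "Bolt", "addr": "Tianjin"},
    ],
    mapping,
)
```

Keeping the mapping as plain data, separate from the ingestion logic, means a new data source would only need a new mapping entry rather than new preprocessing code.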