Computer Science ›› 2023, Vol. 50 ›› Issue (6): 142-150. doi: 10.11896/jsjkx.230300071
ZHANG Yaqing (张雅晴)1,2, SHAN Zhongyuan (单中原)1,2, ZHAO Junfeng (赵俊峰)1,2,3, WANG Yasha (王亚沙)1,2,3
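Abstract: With the continued development of big data technology, every domain now produces massive amounts of heterogeneous data, and building knowledge graphs is an important way to achieve semantic interoperability across such data. A common approach to building the instance layer of a graph is to generate the instance model by mapping structured data onto an ontology model. For complex, heterogeneous domain data, however, most existing mapping-based instance construction methods require users to complete every mapping by hand; the mapping process is tedious, offers no intelligent matching, and is time-consuming and error-prone. Existing methods also provide insufficient support for incremental updates after instances have been imported. To address the tedious mapping operations in existing schema matching and instance construction methods, this paper proposes an instance construction and evolution method based on intelligent mapping recommendation. Its intelligent mapping reuse recommendation mechanism performs data schema matching before the user maps anything manually: it combines element-level similarity, table-level similarity, and inter-table propagation similarity into a multi-level similarity score, then arbitrates and ranks candidates by schema matching degree to generate recommended mappings. In addition, an incremental discovery mechanism automatically detects redundant and conflicting instances and generates background system tasks to handle them, enabling efficient, duplicate-free instance import. Experiments were conducted on the Shandong municipal government open dataset and the Shenzhen medical emergency dataset. With the mapping reuse recommendation module, interaction time fell to about 26% of that of the traditional mode, and the accuracy of recommended field matches reached 98.1%. In the incremental discovery experiment, the time needed to import 13.94 million instance nodes and 21.58 million relation edges fell from 31.21 h to 2.23 h. These results verify the usability and matching accuracy of intelligent mapping reuse recommendation and show improved efficiency in instance-layer construction and evolution.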
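The multi-level scoring and ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measures or how they are combined, so the code assumes a plain string ratio from Python's `difflib` for element-level similarity, an average of best matches for table-level similarity, a caller-supplied `propagated` score per ontology class standing in for inter-table propagation, and arbitrary weights. All function and parameter names are hypothetical.

```python
from difflib import SequenceMatcher


def element_similarity(column: str, prop: str) -> float:
    """Element-level score: string similarity between a column name and a property name."""
    return SequenceMatcher(None, column.lower(), prop.lower()).ratio()


def table_similarity(columns: list[str], properties: list[str]) -> float:
    """Table-level score: mean of each column's best element-level match in the class."""
    if not columns or not properties:
        return 0.0
    best = [max(element_similarity(c, p) for p in properties) for c in columns]
    return sum(best) / len(best)


def recommend_mappings(columns, ontology, propagated=None, weights=(0.6, 0.3, 0.1)):
    """Rank (class, property) candidates for every column by a weighted combined score.

    ontology   : dict mapping class name -> list of property names
    propagated : assumed input, dict mapping class name -> score in [0, 1] contributed
                 by related tables that are already mapped (defaults to 0 everywhere)
    weights    : assumed weights for element, table, and propagation scores
    """
    propagated = propagated or {}
    w_elem, w_table, w_prop = weights
    # The table-level score depends only on the class, so compute it once per class.
    table_scores = {cls: table_similarity(columns, props) for cls, props in ontology.items()}

    recommendations = {}
    for col in columns:
        candidates = []
        for cls, props in ontology.items():
            for prop in props:
                score = (w_elem * element_similarity(col, prop)
                         + w_table * table_scores[cls]
                         + w_prop * propagated.get(cls, 0.0))
                candidates.append((score, cls, prop))
        # Highest combined score first; the head of the list is the recommended mapping.
        candidates.sort(reverse=True)
        recommendations[col] = candidates
    return recommendations


if __name__ == "__main__":
    columns = ["patient_name", "age", "hospital_id"]
    ontology = {
        "Patient": ["name", "age", "gender"],
        "Hospital": ["hospital_id", "address"],
    }
    for col, cands in recommend_mappings(columns, ontology).items():
        score, cls, prop = cands[0]
        print(f"{col} -> {cls}.{prop} ({score:.2f})")
```

In a workflow of this kind, the user would confirm or correct the top-ranked candidate for each field instead of building every mapping from scratch, which is the kind of saving in interaction time that the abstract reports.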