Computer Science ›› 2021, Vol. 48 ›› Issue (4): 63-69. doi: 10.11896/jsjkx.200600084

• Database & Big Data & Data Science •

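Indexing Method Based on the Bi-temporal RDF Model

WANG Yin-di, ZHANG Zhe-qing, YAN Li

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210000
  • Received: 2020-06-24 Revised: 2020-08-11 Online: 2021-04-15 Published: 2021-04-09
  • Corresponding author: YAN Li (yanli@nuaa.edu.cn)
  • Supported by:
    Natural Science Foundation of Jiangsu Province (BK20191274); National Natural Science Foundation of China (61772269)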

Indexing Bi-temporal RDF Model

WANG Yin-di, ZHANG Zhe-qing, YAN Li   

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210000, China
  • Received: 2020-06-24 Revised: 2020-08-11 Online: 2021-04-15 Published: 2021-04-09
  • About author: WANG Yin-di, born in 1996, postgraduate. Her main research interests include RDF data and the semantic web. (2515551281@qq.com)
    YAN Li, born in 1964, professor. Her main research interests include big data, knowledge graphs, spatiotemporal information processing and NoSQL databases.
  • Supported by:
    Natural Science Foundation of Jiangsu Province (BK20191274) and National Natural Science Foundation of China (61772269).

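Abstract: RDF (Resource Description Framework) has been widely used for the semantic representation and processing of big data. Traditional RDF can represent only static semantics and cannot meet the need to process semantics dynamically over time in time-sensitive scenarios. For this reason, several temporal RDF models have been proposed, including temporal RDF models that support transaction time or valid time, and bi-temporal RDF models that support both. To support efficient processing of large-scale temporal RDF, this paper proposes a three-level temporal RDF index structure based on the bi-temporal model. The first level divides the bi-temporal RDF data into different subsets according to the maximum number of updates; the second level builds a quadtree on each subset to index the time information; the third level constructs a composite bitmap containing three kinds of composite keys to index the subject, predicate, and object of the RDF triples. Experiments validate the proposed index structure in three respects: index construction time, index storage space, and query time. The results show that the proposed indexing scheme effectively shortens query time and improves query efficiency.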

Key words: RDF, Three-level index, Temporal information, Quadtree, Bitmap index

Abstract: RDF (Resource Description Framework) has been widely used for the semantic representation and processing of big data. Traditional RDF can only represent static semantics and cannot meet the need to process semantics dynamically over time in time-sensitive scenarios. Several temporal RDF models have therefore been proposed, including RDF models that support transaction time, RDF models that support valid time, and bi-temporal RDF models that support both transaction time and valid time. To support efficient processing of large-scale temporal RDF data, this paper proposes a three-level index structure based on the bi-temporal RDF model. In the first level, the dataset is divided into subsets according to the maximum number of updates of the bi-temporal RDF data. In the second level, a quadtree is built on each subset to index the time information. In the third level, a composite bitmap with three kinds of composite keys indexes the subject, predicate, and object of the RDF triples. Experiments evaluate the index in three respects: index construction time, index size, and query time. The results show that the proposed indexing scheme effectively reduces query time and improves query performance.

Key words: Bitmap index, Quadtree, RDF, Temporal data, Three-level index
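
The three-level structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how time is mapped into the quadtree or which composite keys are used, so the sketch assumes that each triple is indexed by the point (valid_start, tx_start), that the three composite keys are (subject, predicate), (predicate, object) and (subject, object), and that each record carries its own update count. All class, field and key names are hypothetical, and Python integers stand in for bitmaps.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BiTemporalTriple:
    s: str
    p: str
    o: str
    valid_start: int  # assumed integer timestamp in [0, time_span)
    tx_start: int     # assumed integer timestamp in [0, time_span)
    updates: int      # assumed per-record update count (level-1 key)


class QuadTree:
    """Point quadtree over the square [x0, x0+size) x [y0, y0+size)."""
    CAPACITY = 4

    def __init__(self, x0, y0, size):
        self.x0, self.y0, self.size = x0, y0, size  # size: a power of two
        self.points = []      # (x, y, triple_id) stored in a leaf
        self.children = None  # four sub-quadrants after a split

    def _child_for(self, x, y):
        half = self.size // 2
        ix = 1 if x >= self.x0 + half else 0
        iy = 1 if y >= self.y0 + half else 0
        return self.children[2 * iy + ix]

    def insert(self, x, y, tid):
        if self.children is not None:
            self._child_for(x, y).insert(x, y, tid)
            return
        self.points.append((x, y, tid))
        if len(self.points) > self.CAPACITY and self.size > 1:
            half = self.size // 2
            self.children = [QuadTree(self.x0 + dx * half, self.y0 + dy * half, half)
                             for dy in (0, 1) for dx in (0, 1)]
            pts, self.points = self.points, []
            for px, py, ptid in pts:
                self._child_for(px, py).insert(px, py, ptid)

    def query(self, x_lo, x_hi, y_lo, y_hi):
        """Return a bitmask of triple ids inside the inclusive range."""
        if (x_hi < self.x0 or x_lo >= self.x0 + self.size or
                y_hi < self.y0 or y_lo >= self.y0 + self.size):
            return 0  # this quadrant cannot contain a match
        mask = 0
        if self.children is None:
            for x, y, tid in self.points:
                if x_lo <= x <= x_hi and y_lo <= y <= y_hi:
                    mask |= 1 << tid
            return mask
        for child in self.children:
            mask |= child.query(x_lo, x_hi, y_lo, y_hi)
        return mask


class ThreeLevelIndex:
    def __init__(self, time_span=1024):
        self.time_span = time_span
        self.triples = []
        self.partitions = {}  # level 1: update count -> (quadtree, bitmaps)

    def add(self, t):
        tid = len(self.triples)
        self.triples.append(t)
        if t.updates not in self.partitions:
            self.partitions[t.updates] = (QuadTree(0, 0, self.time_span),
                                          defaultdict(int))
        tree, bitmaps = self.partitions[t.updates]
        tree.insert(t.valid_start, t.tx_start, tid)  # level 2: time
        bit = 1 << tid                               # level 3: bitmaps
        bitmaps[("sp", t.s, t.p)] |= bit
        bitmaps[("po", t.p, t.o)] |= bit
        bitmaps[("so", t.s, t.o)] |= bit

    def query(self, key, valid_range, tx_range):
        hits = 0
        for tree, bitmaps in self.partitions.values():
            time_mask = tree.query(*valid_range, *tx_range)
            hits |= time_mask & bitmaps.get(key, 0)
        return [t for i, t in enumerate(self.triples) if hits >> i & 1]


idx = ThreeLevelIndex()
idx.add(BiTemporalTriple("alice", "worksAt", "NUAA", 10, 12, 0))
idx.add(BiTemporalTriple("alice", "worksAt", "ACME", 20, 25, 1))
idx.add(BiTemporalTriple("bob", "worksAt", "NUAA", 15, 15, 0))
result = idx.query(("sp", "alice", "worksAt"), (0, 15), (0, 20))
```

In this sketch a query intersects the time bitmask returned by each partition's quadtree with the bitmap for the requested composite key, so only triples that satisfy both the temporal range and the subject/predicate/object condition are returned. A real system would likely use compressed bitmaps and interval-aware time indexing rather than start points alone.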

CLC Number:

  • TP399