Computer Science ›› 2021, Vol. 48 ›› Issue (10): 167-176. doi: 10.11896/jsjkx.200900114

• Database & Big Data & Data Science •

Temporal RDF Model and Index Method Based on Neighborhood Structure

CHEN Yuan-yuan, YAN Li, ZHANG Zhe-qing, MA Zong-min

  1. College of Computer Science and Technology/College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  • Received: 2020-09-14 Revised: 2020-12-31 Online: 2021-10-15 Published: 2021-10-18
  • Corresponding author: MA Zong-min (zongminma@nuaa.edu.cn)
  • About author: CHEN Yuan-yuan (chenyuanyuan@nuaa.edu.cn), born in 1996, postgraduate. Her main research interests include RDF data and the semantic web.
    MA Zong-min, born in 1965, Ph.D., professor. His main research interests include databases, the semantic web, knowledge representation and reasoning, and information uncertainty.

Abstract: The Resource Description Framework (RDF) is a metadata model and information description specification recommended by the W3C, and it is widely used in various fields. To track changes in RDF data over time, temporal information is introduced into the RDF framework. With the rapid growth of temporal RDF data, managing it effectively has become necessary, and a well-designed index mechanism enables efficient storage and querying. This paper first presents a temporal RDF data model with a specific one-dimensional encoding scheme, which represents temporal information simply and extends the existing RDF data model at low overhead. On this basis, a two-level index based on neighborhood structure is proposed: the first level uses a dynamic counting filter to index the neighborhood information of each node, and the second level builds a B+ tree to index all temporal RDF data related to each node. The index also supports updates to large-scale temporal RDF data. Experimental results show that the proposed method performs about 35% better than the comparison method in most cases, and that it is scalable and effective.

Key words: Dynamic counting filter, Encoding, Index structure, RDF, Temporal RDF
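
The abstract names the building blocks but does not spell out their details, so the following is only a minimal sketch of how such a two-level index might fit together, not the authors' implementation. Everything in it is an assumption made for illustration: the interval encoding `start * domain + end`, the domain size `T`, the class names `CountingFilter` and `TemporalIndex`, a plain counting Bloom filter standing in for the dynamic counting filter, and a sorted list with `bisect` standing in for the B+ tree.

```python
import hashlib
from bisect import bisect_left, insort
from collections import defaultdict

T = 10_000  # hypothetical size of the discrete time domain (assumption)


def encode_interval(start, end, domain=T):
    """Map a valid-time interval [start, end] to one integer (assumed scheme)."""
    if not 0 <= start <= end < domain:
        raise ValueError("interval outside the time domain")
    return start * domain + end


def decode_interval(code, domain=T):
    """Inverse of encode_interval: returns (start, end)."""
    return divmod(code, domain)


class CountingFilter:
    """Counting Bloom filter: counters instead of bits, so deletions are possible.
    A simplified stand-in for a dynamic counting filter."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.counters = [0] * size

    def _slots(self, item):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for slot in self._slots(item):
            self.counters[slot] += 1

    def remove(self, item):
        # Only valid for items that were previously added.
        for slot in self._slots(item):
            self.counters[slot] -= 1

    def might_contain(self, item):
        # False means "definitely absent"; True may be a false positive.
        return all(self.counters[slot] > 0 for slot in self._slots(item))


class TemporalIndex:
    """Level 1: one counting filter per node over its (predicate, object) neighbors.
    Level 2: per-node entries kept sorted by encoded interval (B+ tree stand-in)."""

    def __init__(self):
        self.filters = defaultdict(CountingFilter)
        self.entries = defaultdict(list)

    def insert(self, s, p, o, start, end):
        self.filters[s].add((p, o))
        insort(self.entries[s], (encode_interval(start, end), p, o))

    def delete(self, s, p, o, start, end):
        key = (encode_interval(start, end), p, o)
        rows = self.entries[s]
        i = bisect_left(rows, key)
        if i < len(rows) and rows[i] == key:
            rows.pop(i)
            self.filters[s].remove((p, o))

    def query(self, s, p, o, at):
        """Return the intervals in which (s, p, o) is valid at time point `at`."""
        if s not in self.filters or not self.filters[s].might_contain((p, o)):
            return []  # pruned by the neighborhood filter
        hits = []
        for code, p2, o2 in self.entries[s]:
            start, end = decode_interval(code)
            if start > at:
                break  # entries are ordered by start, so later ones cannot match
            if (p2, o2) == (p, o) and end >= at:
                hits.append((start, end))
        return hits


index = TemporalIndex()
index.insert("ex:alice", "ex:worksFor", "ex:NUAA", 2015, 2019)
index.insert("ex:alice", "ex:worksFor", "ex:NUAA", 2021, 2023)
print(index.query("ex:alice", "ex:worksFor", "ex:NUAA", at=2016))  # [(2015, 2019)]
```

In this sketch the filter is consulted first so that nodes whose neighborhood cannot match are skipped before the ordered entries are scanned, and because counters can be decremented, deletions keep both levels in sync. A real B+ tree would replace the linear scan with a range lookup.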

CLC Number:

  • TP399