Computer Science, 2016, Vol. 43, Issue (9): 197-202. doi: 10.11896/j.issn.1002-137X.2016.09.039

• Software and Database Technology •

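Distributed Parallel Semantic Coding Algorithm for RDF Data

ZHENG Cui-chun and WANG Jing-bin

  1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108
  • Online: 2018-12-01  Published: 2018-12-01
  • Supported by:
    This work was supported by the National Youth Fund Project (61300104), the Fujian Provincial Science and Technology Support-the-Army Fund Project (JG2014001), the Natural Science Foundation of Fujian Province (2012J01168), and the Science and Technology Development Fund of Fuzhou University.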

Abstract: Existing distributed parallel compression coding algorithms for RDF data do not take the ontology file into account, so the encoded RDF data carries no semantic information, which hinders distributed query and reasoning. To address this, the SCOM (Semantic Coding with Ontology on MapReduce) algorithm is proposed to perform parallel semantic coding of RDF data on distributed MapReduce. The algorithm first combines the ontology of the RDF data to build class relationship and property relationship models. After the triple terms are classified and filtered, they are encoded and a dictionary table is generated, which yields a regular encoding of the RDF data that carries semantic information. In addition, SCOM can easily restore the encoded RDF data file to the original file. Experimental results show that SCOM can efficiently perform distributed parallel coding of large-scale data.

Key words: RDF, Ontology, Semantic coding, MapReduce
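
The general idea of ontology-aware dictionary encoding can be illustrated with a small single-machine sketch. It is not the SCOM algorithm itself: the abstract does not specify the code layout, so the dotted hierarchy codes, the `C`/`T` prefixes, and all function names below are assumptions chosen for illustration, and the MapReduce distribution is omitted entirely. The sketch only shows how a class hierarchy can be reflected in the codes while the dictionary keeps the encoding reversible.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def build_class_codes(subclass_of: Dict[str, str]) -> Dict[str, str]:
    """Give each class a dotted code that extends its parent's code.

    `subclass_of` maps a class to its single direct superclass
    (a simplifying assumption; real ontologies allow several).
    """
    classes = sorted(set(subclass_of) | set(subclass_of.values()))
    children: Dict[str, List[str]] = {}
    for child, parent in subclass_of.items():
        children.setdefault(parent, []).append(child)

    codes: Dict[str, str] = {}

    def visit(cls: str, code: str) -> None:
        codes[cls] = code
        for i, child in enumerate(sorted(children.get(cls, [])), start=1):
            visit(child, f"{code}.{i}")

    roots = [c for c in classes if c not in subclass_of]
    for i, root in enumerate(roots, start=1):
        visit(root, str(i))
    return codes


def encode(triples: List[Triple], class_codes: Dict[str, str]):
    """Encode triples; classes get hierarchy codes, other terms get T<n>."""
    term_to_code = {cls: "C" + code for cls, code in class_codes.items()}
    next_id = 1
    encoded = []
    for triple in triples:
        row = []
        for term in triple:
            if term not in term_to_code:
                term_to_code[term] = f"T{next_id}"
                next_id += 1
            row.append(term_to_code[term])
        encoded.append(tuple(row))
    # The dictionary table maps each code back to its original term.
    dictionary = {code: term for term, code in term_to_code.items()}
    return encoded, dictionary


def decode(encoded, dictionary: Dict[str, str]) -> List[Triple]:
    """Restore the original triples from the codes and the dictionary."""
    return [tuple(dictionary[code] for code in row) for row in encoded]


def is_subclass(code_a: str, code_b: str) -> bool:
    """True if class code_a equals code_b or lies below it in the hierarchy."""
    return code_a == code_b or code_a.startswith(code_b + ".")


# Hypothetical example data, not taken from the paper.
subclass_of = {"ex:Student": "ex:Person", "ex:GraduateStudent": "ex:Student"}
triples = [
    ("ex:alice", "rdf:type", "ex:GraduateStudent"),
    ("ex:alice", "ex:advisor", "ex:bob"),
]
class_codes = build_class_codes(subclass_of)
encoded, dictionary = encode(triples, class_codes)
restored = decode(encoded, dictionary)
```

In this sketch a subclass test reduces to a prefix comparison on the codes, so a query for all instances of `ex:Person` could match `C1`, `C1.1`, and `C1.1.1` without consulting the ontology again. That is one possible reading of an encoding that carries semantic information.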

