Computer Science (计算机科学), 2014, Vol. 41, Issue (7): 227-231. doi: 10.11896/j.issn.1002-137X.2014.07.047
WANG Jing-bin (汪璟玢), FANG Zhi-li (方知立) and ZHANG Yan-qin (张燕琴)
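Abstract: Executing SPARQL (SPARQL Protocol and RDF Query Language) queries in a distributed fashion is a new approach to querying massive RDF (Resource Description Framework) data. Existing Hadoop-based RDF query implementations all need to launch multiple MapReduce jobs to complete a query, which wastes time. To overcome this drawback, the MRQJ (using MapReduce to query and join) algorithm is proposed for distributed SPARQL querying. The algorithm has two parts, join plan generation and SPARQL query execution: join plan generation uses a greedy strategy to produce the optimal join plan, and SPARQL query execution needs only a single MapReduce job to obtain the query results. Experiments on the LUBM dataset show that MRQJ has a clear advantage in query efficiency when the query is relatively complex.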
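The abstract does not spell out how the greedy join plan is built, so the following is only a minimal Python sketch of one generic greedy heuristic, not the authors' MRQJ procedure. It assumes triple patterns are 3-tuples of strings, that variables start with `?`, and that each step joins on the variable shared by the most remaining patterns. The names `variables` and `greedy_join_plan` and the LUBM-style example query are illustrative assumptions.

```python
from collections import defaultdict


def variables(pattern):
    """Return the set of variables (terms starting with '?') in a triple pattern."""
    return {term for term in pattern if term.startswith("?")}


def greedy_join_plan(patterns):
    """Group triple patterns into join steps (hypothetical helper).

    At each step, pick the variable that occurs in the most patterns not yet
    assigned to a step, and join all of those patterns on it. Ties are broken
    alphabetically. This is a generic greedy heuristic, assumed for illustration.
    """
    remaining = list(patterns)
    plan = []
    while remaining:
        # Map each variable to the remaining patterns that mention it.
        by_variable = defaultdict(list)
        for pattern in remaining:
            for var in variables(pattern):
                by_variable[var].append(pattern)
        if not by_variable:
            # Only constant patterns are left; no join variable applies.
            plan.append((None, remaining))
            break
        # max() returns the first maximal item, so sorting gives a stable tie-break.
        best = max(sorted(by_variable), key=lambda var: len(by_variable[var]))
        group = by_variable[best]
        plan.append((best, group))
        remaining = [p for p in remaining if p not in group]
    return plan


# Illustrative basic graph pattern in the style of a LUBM query.
example_patterns = [
    ("?x", "rdf:type", "ub:GraduateStudent"),
    ("?x", "ub:takesCourse", "?y"),
    ("?y", "rdf:type", "ub:Course"),
    ("?z", "ub:teacherOf", "?y"),
]

for join_variable, group in greedy_join_plan(example_patterns):
    print(join_variable, group)
```

In this sketch each plan entry is a join variable with the patterns joined on it. How MRQJ maps such a plan onto a single MapReduce job is not described in the abstract and is left out here.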