计算机科学 ›› 2019, Vol. 46 ›› Issue (6A): 402-406.
徐哲, 刘亮, 秦小麟, 秦伟萌
XU Zhe, LIU Liang, QIN Xiao-lin, QIN Wei-meng
摘要: 移动互联网、物联网的快速发展产生了大量带关系属性的空间文本对象数据。面向网页文本数据的搜索引擎仅支持文本关键词查询,无法处理包含地理位置信息、文本信息、关系属性的混合数据。现有面向空间关键字的查询处理技术未将关系属性作为过滤条件,且是基于单机实现的,无法满足查询性能的要求。为解决上述问题,提出了一种新颖的将关系属性、空间和关键字3种属性映射成文本数据的Baseline算法(Baseline Algorithm of Distributed Keywords and Location-aware with Relational Attributes Query,BADKLRQ),利用分布式倒排文本索引对转换后的文本数据进行并行索引。针对带关系属性、空间和关键字的查询请求,将查询请求转换成映射空间中的多个文本关键字,对转换后的文本数据进行查询,并提出基于Baseline算法的改进算法MGDKLRQ,以改进空间属性转换成文本关键字的算法。实验结果表明,在索引时间和查询时间上,BADKLRQ算法比现有算法提升了10%~15%,MGDKLRQ算法比现有算法提升了20%~30%。
中图分类号:
[1]WU D,CONG G,JENSEN C S.A framework for efficient spatial web object retrieval[M].Springer-Verlag New York,Inc.2012. [2]CONG G,JENSEN C S,WU D.Efficient retrieval of the top-k most relevant spatial web objects[C]∥Proceedings of the VLDB Endowment.2009:337-348. [3]AHN J,JO B,JUNG S.Multiple Domain-Based Spatial Keyword Query Processing Method Using Collaboration of Multiple IR-Trees[C]∥Proceedings of the 7th International Conference on Emerging Databases.Springer,Singapore,2018:183-192. [4]ZHENG K,SU H,ZHENG B,et al.Interactive top-k spatial keyword queries[C]∥2015 IEEE 31st International Conference on Data Engineering (ICDE).IEEE,2015:423-434. [5]ZHANG D,CHAN C Y,TAN K L.Processing spatial keyword query as a top-k aggregation query[C]∥Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval.ACM,2014:355-364. [6]LIU X,CHEN L,WAN C.LINQ:A framework for location-aware indexing and query processing[J].IEEE Transactions on Knowledge and Data Engineering,2015,27(5):1288-1300. [7]GUTTMAN A.R-trees:a dynamic index structure for spatial searching[J].ACM SIGMOD Record,2016,14(2):47-57. [8]ZANG X,HAO P,GAO X,et al.QDR-Tree:An Efcient Index Scheme for Complex Spatial Keyword Query[J].arXiv preprint arXiv:1804.10726,2018. [9]JUNG H R,YONG S K,CHUNG Y D.QR-tree:An efficient and scalable method for evaluation of continuous range queries[J].Information Sciences,2014,274(8):156-176. [10]NAIR S H,SINHA A,VACHHANI L.Hilbert’s space-filling curve for regions with holes[C]∥2017 IEEE 56th Annual Conference on Decision and Control (CDC).IEEE,2017:313-319. [11]CHEN Y Y,SUEL T,MARKOWETZ A.Efficient query processing in geographic web search engines[C]∥Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data.ACM,2006:277-288. [12]KHODAEI A,SHAHABI C,LI C.Hybrid indexing and seamless ranking of spatial and textual features of web documents[C]∥International Conference on Database and Expert Systems Applications.Springer-Verlag,2010:450-466. [13]SANTOKI K.Indexing and Searching on a Hadoop Distributed File System[EB/OL].http://www.drdobbs.com/parallel/indexing-and_searching-on-a-hadoop-distr/226300241?pgno=3. [14]CHRISTOFORAKI M,HE J,DIMOPOULOS C,et al.Text vs.space:efficient geo-search query processing[C]∥Proceedings of the 20th ACM International Conference on Information and Knowledge Management.ACM,2011:423-432. [15]GÖBEL R,HENRICH A,NIEMANN R,et al.A hybrid index structure for geo-textual searches[C]∥Proceedings of the 18th ACM conference on Information and Knowledge Management.ACM,2009:1625-1628. [16]WU D,MAN L Y,JENSEN C S,et al.Efficient continuously moving top-k spatial keyword query processing[C]∥IEEE,International Conference on Data Engineering.IEEE,2011:541-552. |
[1] | 郭帅,刘亮,秦小麟. 用户偏好约束的空间关键词范围查询处理方法 Spatial Keyword Range Query with User Preferences Constraint 计算机科学, 2018, 45(4): 182-189. https://doi.org/10.11896/j.issn.1002-137X.2018.04.031 |
[2] | 董天阳, 尚跃辉, 程强. 方向感知的路网移动对象范围查询算法 Direction-aware Moving Object Range Query Algorithm in Road Network 计算机科学, 2018, 45(11): 210-219. https://doi.org/10.11896/j.issn.1002-137X.2018.11.033 |
[3] | 袁鑫攀,汪灿飞,龙军,彭成. CS-Chord:基于聚类分离的分布式高维向量索引 CS-Chord:Distributed High Dimensional Vector Index Based on Clustering Separation 计算机科学, 2017, 44(Z11): 494-497. https://doi.org/10.11896/j.issn.1002-137X.2017.11A.105 |
[4] | 刘怀进,陈永红,田晖,王田,蔡奕侨. 两层无线传感网中节能的安全范围查询方法 Privacy and Integrity Protection Range Query Processing in Two-tiered Wireless Sensor Networks 计算机科学, 2016, 43(Z11): 393-397. https://doi.org/10.11896/j.issn.1002-137X.2016.11A.090 |
[5] | 李艳红,黄群,蒋宏,李国徽. 路网中空间关键字连续范围查询算法研究 Research on Processing Continuous Spatial Keyword Range Queries in Road Networks 计算机科学, 2014, 41(7): 232-235. https://doi.org/10.11896/j.issn.1002-137X.2014.07.048 |
[6] | 唐李洋,倪志伟,李应. 基于Cassandra的可扩展分布式反向索引的构建 Scalable Distributed Inverted Index Built on Cassandra 计算机科学, 2011, 38(6): 187-190. |
[7] | 吴炜,苏永红,李瑞轩,卢正鼎. 基于DHT的分布式索引技术研究与实现 Research and Implementation of Distributed Index Based on DHT 计算机科学, 2010, 37(2): 65-70. |
[8] | 吴凌坤 汤庸 王鹏 舒然. SA:一种有利于多属性范围查询的多维聚簇方法 计算机科学, 2009, 36(6): 133-137. |
[9] | 傅向华 彭小刚 王志强 明仲. 基于分布式范围树的结构化P2P多维范围查询 计算机科学, 2007, 34(8): 69-71. |
[10] | 师智斌 黄厚宽. 数据立方体聚集范围查询分块方法研究 计算机科学, 2007, 34(12): 93-96. |
|