计算机科学 ›› 2022, Vol. 49 ›› Issue (12): 283-292.doi: 10.11896/jsjkx.211000077
杜明, 邢瑞萍, 周军锋, 谭玉婷
DU Ming, XING Rui-ping, ZHOU Jun-feng, TAN Yu-ting
摘要: 标签约束图上的k步可达性查询问题,回答了在一个标签约束图上两点之间是否存在一条长度不大于k的路径并且这条路径上的标签都在用户给定的标签集中的问题。标签约束图上的k步可达性查询问题在现实中有着广泛的应用,然而现有算法无法直接回答这个问题。因此,首先提出LK2H算法。LK2H算法主要包括构建索引和查询两个步骤。第一步是给图上的所有顶点构建一组包含k和标签信息的2-Hop索引,第二步是基于构建好的索引进行查询。在查询时,为了尽可能地为用户返回更多的信息,LK2H算法优化了一类不可达查询的返回结果:当用户无法明确所有的标签类型,不能给出完整的标签约束,进而导致查询结果为不可达时,将完整的标签集返回给用户。其次,提出优化算法LK2H+。LK2H+算法通过构建部分顶点的2-Hop索引进一步缩减索引大小和索引的构建时间,并基于构建好的索引进行查询。查询时,需要对顶点按照是否构建了索引进行分类讨论。最后,基于15个真实数据集进行测试。实验结果表明,LK2H算法和LK2H+算法都可以高效地解决标签约束图上的k步可达性查询问题。
中图分类号:
[1]KUMAR R,ALEXANDER T,FALOUTSOS C,et al.SocialNetworks:Looking ahead [C]//Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Las Vegas,USA:ACM,2008:1060-1060. [2]STANN F,HEIDEMANN J.RMST:Reliable data transport in sensor networks[C]//Proceedings of the First IEEE International Workshop on Sensor Network Protocols and Applications.Anchorage,AK,USA:IEEE,2003:102-112. [3]WOOD P T.Query languages for graph databases[J].ACM Sigmod,2012,41(1):50-60. [4]CHEN X,LAI L,QIN L,et al.Structsim:Querying structuralnode similarity at billion scale[C]//Proceedings of the IEEE 36th International Conference on Data Engineering(ICDE).Dallas,TX,USA:IEEE,2020:1950-1953. [5]CHEN Y J,CHEN Y B.An efficient algorithm for answeringgraph reachability queries[C]//Proceedings of the IEEE 24th International Conference on Data Engineering.Cancun,Mexico:IEEE,2008:893-902. [6]CHENG J,HUANG S L,WU H H,et al.TF-Label:a topological-folding labeling scheme for reachability querying in a large graph[C]//Proceedings of the 2013 ACM SIGMOD Interna-tional Conference on Management of Data.New York,USA:ACM,2013:193-204. [7]FANG Y,YANG Y,ZHANG W,et al.Effective and efficient community search over large heterogeneous information networks[J].PVLDB,2020,13(6):2093-2107. [8]WANG K,LIN X,QIN L,et al.Efficient bitruss decomposition for large-scale bipartite graphs[C]//Proceedings of the IEEE 36th International Conference on Data Engineering(ICDE).Dallas,TX,USA:IEEE,2020:661-672. [9]SCHAIK S J,MOOR O D.A memory efficient reachability data structure through bit vector compression[C]//Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data.Athens,Greece:ACM,2011:913-924. [10]IMANE H,YAHIAOUI S,BENDJOUDI A,et al.Reachability in Big Graphs:A Distributed Indexing and Querying Approach[J].Information Sciences,2021,573:541-561. [11]WEI H,YU J X,LU C,et al.Reachability querying:An independent permutation labeling approach[J].VLDB,2018,27(1):1-26. [12]YU J X,CHENG J F.Graph reachability queries:A survey[M]//Managing and Mining Graph Data.Boston,MA:Sprin-ger,2010:181-215. [13]CHOUDHARY P.A survey on social network analysis forcounterterrorism[J].International Journal of Computer Applications,2015,112(9):24-29. [14]PENG Y,ZHANG Y,LIN X,et al.Answering Billion-Scale Label-Constrained Reachability Queries within Microsecond[J].Proceedings of the VLDB Endowment,2020,13(6):812-825. [15]VALSTAR L D J,FLETCHER G H L,YOSHIDA Y.Land-mark indexing for evaluation of label-constrained reachability queries[C]//Proceedings of the 2017 ACM International Conference on Management of Data.Chicago,Illinois,USA:ACM,2017:345-358. [16]XIAN T,CHEN Z Y,LI K,et al.Efficient computation of the transitive closure size[J].Cluster Computing,2019,22(3):6517-6527. [17]YANO Y,AKIBA T,IWATA Y,et al.Fast and scalable rea-chability queries on graphs by pruned labeling with landmarks and paths[C]//Proceedings of the 22nd ACM International Conference on Information & Knowledge Management.San Francisco,CA,USA:ACM,2013:1601-1606. [18]BOSTJAN B,FRANTISEK K,JAN K,et al.Minimum k-pathvertex cover[J].Discrete Applied Mathematics,2011,159(12):1189-1195. [19]VELOSO R,JUNIOR W M,CERF L P,et al.Reachability queries in very large graphs:A fast refined online search approach[C]//Proceeding of the 17th International Conference on Extending Database Technology.Athens,Greece:EDBT,2014:511-522. [20]CHENG J,SHANG Z,CHENG H,et al.K-reach:who is in your small world[J].Proceedings of the VLDB Endowment,2012,5(11):1292-1303. [21]JIN R M,HONG H,WANG H X,et al.Computing label-constraint reachability in graph databases [C]//Proceedings of the 2010 ACM SIGMOD International Conference on Management of data.Indianapolis,Indiana,USA:ACM,2010:123-134. [22]HASSAN M S,AREF W G,ALY A M.Graph indexing forshortest-path finding over dynamic sub-graphs[C]//Procee-dings of the 2016 International Conference on Management of Data.San Francisco,CA,USA:ACM,2016:1183-1197. [23]SARISHT W,PRASAD A,RANU S,et al.Efficiently answe-ring regular simple path queries on large labeled network[C]//Proceedings of the 2019 International Conference on Management of Data.2019:1463-1480. [24]CHEN Y J,SINGH G.Graph Indexing for Efficient Evaluation of Label-constrained Reachability Queries[J].ACM Transactions on Database Systems,2021,46(2):1-50. [25]BEAMER S,ASANOVIC K,PATTERSON D.Direction-opti-mizing Breadth-first search[J].Scientific Programming,2013,21(3/4):137-148. [26]STEIER D M,ANDERSON A P.Depth-First Search [M]//Algorithm Synthesis:A Comparative Study.US:Springer,1989. [27]WANG X,LIN K,DU M,et al.Efficient k-Hop Reachability Queries Processing on Directed Graphs[D].Shanghai:Donghua University,2020. |
[1] | 程富豪, 徐泰华, 陈建军, 宋晶晶, 杨习贝. 基于顶点粒k步搜索和粗糙集的强连通分量挖掘算法 Strongly Connected Components Mining Algorithm Based on k-step Search of Vertex Granule and Rough Set Theory 计算机科学, 2022, 49(8): 97-107. https://doi.org/10.11896/jsjkx.210700202 |
[2] | 高琳, 段国林, 姚涛. 基于图论的组织互操作性建模与评估研究 Research on Organizational Interoperability Modeling and Evaluation Based on Graph Theory 计算机科学, 2020, 47(6A): 572-576. https://doi.org/10.11896/JsJkx.190900114 |
[3] | 李金泽,徐喜荣,潘子琦,李晓杰. 改进的自适应谱聚类NJW算法 Improved Adaptive Spectral Clustering NJW Algorithm 计算机科学, 2017, 44(Z6): 424-427. https://doi.org/10.11896/j.issn.1002-137X.2017.6A.095 |
[4] | 王振朝,赵云,薛文玲. 下含D2D蜂窝网基于有向加权二部图的资源分配 Resource Allocation for D2D Communications Underlaying Cellular Networks Using Directed Weighted Bipartite 计算机科学, 2017, 44(9): 120-124. https://doi.org/10.11896/j.issn.1002-137X.2017.09.024 |
[5] | 王振朝,赵云,薛文玲. 蜂窝下含D2D系统基于二部超图的资源分配 Resource Allocation for D2D Communication Underlaid Cellular Networks Using Bipartite Hypergraph 计算机科学, 2017, 44(8): 82-85. https://doi.org/10.11896/j.issn.1002-137X.2017.08.015 |
[6] | 李丽萍,赵传荣,孔德仁,王芳. 基于图论的无监督区域遥感图像检索算法研究 Research on Unsupervised Regional Remote Sensing Image Retrieval Algorithm Based on Graph Theory 计算机科学, 2017, 44(7): 315-317. https://doi.org/10.11896/j.issn.1002-137X.2017.07.057 |
[7] | 沈洪,李晓光. 图像显著估计的并行算法研究 Research on Parallel Algorithm of Image Saliency Estimation 计算机科学, 2017, 44(12): 266-273. https://doi.org/10.11896/j.issn.1002-137X.2017.12.048 |
[8] | 罗卫东,王建新,冯启龙. 最大圈分解问题的研究进展 Survey of Cycle Packing Problem 计算机科学, 2017, 44(1): 1-6. https://doi.org/10.11896/j.issn.1002-137X.2017.01.001 |
[9] | 吴从中,李俊. 基于SUSAN边缘信息的阈值分割算法 Image Threshold Segmentation Algorithm Based on SUSAN Edge Information 计算机科学, 2015, 42(Z11): 119-122. |
[10] | 黄治国,李娜. 基于分辨函数的极大团搜索算法 Discernibility Function-based Algorithm for Finding All Maximal Cliques 计算机科学, 2014, 41(4): 248-251. |
[11] | 刘雅坤,于双元,罗四维. 基于最小最大割算法的阈值分割算法 Threshold Image Segmentation Based on Min-max Cut Algorithm 计算机科学, 2014, 41(1): 95-99. |
[12] | 李岳洪,万频,王永华,邓钦,杨健. 改进的细菌觅食算法求解认知无线网络频谱分配问题 Cognitive Radio Spectrum Assignment Based on Binary Bacterial Foraging Optimization Algorithm 计算机科学, 2013, 40(8): 49-52. |
[13] | 任晓玲,白 雪,刘希玉. 粘贴模型在两类特殊问题中的改进算法研究 Improved Algorithm of Sticker Model in Two Special Problem 计算机科学, 2012, 39(Z11): 252-255. |
[14] | 张宏烈,张国印,丛万锁,胡海燕. 一种应用图论方法管理可重构资源的策略 One Management Strategy of Reconfigurable Resource Using Graph Theory 计算机科学, 2010, 37(12): 270-274. |
[15] | . 一种新的基于图论聚类的分割算法 计算机科学, 2008, 35(9): 245-247. |
|