计算机科学 ›› 2022, Vol. 49 ›› Issue (12): 283-292.doi: 10.11896/jsjkx.211000077

• 人工智能 • 上一篇    下一篇

标签约束图上的k步可达性查询

杜明, 邢瑞萍, 周军锋, 谭玉婷   

  1. 东华大学计算机科学与技术学院 上海201620
  • 收稿日期:2021-10-11 修回日期:2022-06-07 发布日期:2022-12-14
  • 通讯作者: 邢瑞萍(1017033346@qq.com)
  • 作者简介:(duming@dhu.edu.cn)
  • 基金资助:
    上海市自然科学基金(20ZR1402700);国家自然科学基金(61472339,61572421,61272124)

k-step Reachability Query Processing on Label-constrained Graph

DU Ming, XING Rui-ping, ZHOU Jun-feng, TAN Yu-ting   

  1. School of Computer Science and Technology,Donghua University,Shanghai 201620,China
  • Received:2021-10-11 Revised:2022-06-07 Published:2022-12-14
  • About author:DU Ming,born in 1975,Ph.D,associate professor.His main research interests include natural language processing,information service,and data analysis.XING Rui-ping,born in 1997,postgra-duate.Her main research interests include graph reachability and so on.
  • Supported by:
    Natural Science Foundation of Shanghai,China(20ZR1402700) and National Natural Science Foundation of China(61472339,61572421,61272124).

摘要: 标签约束图上的k步可达性查询问题,回答了在一个标签约束图上两点之间是否存在一条长度不大于k的路径并且这条路径上的标签都在用户给定的标签集中的问题。标签约束图上的k步可达性查询问题在现实中有着广泛的应用,然而现有算法无法直接回答这个问题。因此,首先提出LK2H算法。LK2H算法主要包括构建索引和查询两个步骤。第一步是给图上的所有顶点构建一组包含k和标签信息的2-Hop索引,第二步是基于构建好的索引进行查询。在查询时,为了尽可能地为用户返回更多的信息,LK2H算法优化了一类不可达查询的返回结果:当用户无法明确所有的标签类型,不能给出完整的标签约束,进而导致查询结果为不可达时,将完整的标签集返回给用户。其次,提出优化算法LK2H+。LK2H+算法通过构建部分顶点的2-Hop索引进一步缩减索引大小和索引的构建时间,并基于构建好的索引进行查询。查询时,需要对顶点按照是否构建了索引进行分类讨论。最后,基于15个真实数据集进行测试。实验结果表明,LK2H算法和LK2H+算法都可以高效地解决标签约束图上的k步可达性查询问题。

关键词: 标签约束图, k步可达性查询, 2-Hop索引, 顶点覆盖, 图论

Abstract: The k-step reachability query processing on label-constrained graph is used to answer whether there is a path with a length not greater than k between two points and the labels on this path are in the specified label set.The k-step reachability query processing on label-constrained graph is widely used in reality,but there is no relevant algorithm to answer it.Therefore,the LK2H algorithm is proposed firstly.LK2H algorithm mainly consists of two steps.The first step is to build a pair of 2-Hop in-dexes containing k and label information for all vertices on the graph,and the second step is querying based on the built index.In order to return information as much as possible to the user,LK2H optimizes the results of a type of unreachable query:when the user cannot specify all the label types,and cannot give full label constraints resulting in unreachable query results,the complete label set will return to the user.Secondly,an optimization algorithm,LK2H+,is proposed.LK2H+ algorithm further reduces the index size and time of construction by building a 2-Hop index for part of vertices,and queries based on the built index.Queries require a discussion of whether query vertices are indexed or not.Finally,the test is conducted based on 15 real-world datasets.Experiment results show that both LK2H and LK2H+ algorithms can solve k-step reachability query processing on label constraint graphs efficiently and quickly.

Key words: Label-constrained graph, k-step reachability query, 2-hop index, Vertex cover, Graph theory

中图分类号: 

  • TP301
[1]KUMAR R,ALEXANDER T,FALOUTSOS C,et al.SocialNetworks:Looking ahead [C]//Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Las Vegas,USA:ACM,2008:1060-1060.
[2]STANN F,HEIDEMANN J.RMST:Reliable data transport in sensor networks[C]//Proceedings of the First IEEE International Workshop on Sensor Network Protocols and Applications.Anchorage,AK,USA:IEEE,2003:102-112.
[3]WOOD P T.Query languages for graph databases[J].ACM Sigmod,2012,41(1):50-60.
[4]CHEN X,LAI L,QIN L,et al.Structsim:Querying structuralnode similarity at billion scale[C]//Proceedings of the IEEE 36th International Conference on Data Engineering(ICDE).Dallas,TX,USA:IEEE,2020:1950-1953.
[5]CHEN Y J,CHEN Y B.An efficient algorithm for answeringgraph reachability queries[C]//Proceedings of the IEEE 24th International Conference on Data Engineering.Cancun,Mexico:IEEE,2008:893-902.
[6]CHENG J,HUANG S L,WU H H,et al.TF-Label:a topological-folding labeling scheme for reachability querying in a large graph[C]//Proceedings of the 2013 ACM SIGMOD Interna-tional Conference on Management of Data.New York,USA:ACM,2013:193-204.
[7]FANG Y,YANG Y,ZHANG W,et al.Effective and efficient community search over large heterogeneous information networks[J].PVLDB,2020,13(6):2093-2107.
[8]WANG K,LIN X,QIN L,et al.Efficient bitruss decomposition for large-scale bipartite graphs[C]//Proceedings of the IEEE 36th International Conference on Data Engineering(ICDE).Dallas,TX,USA:IEEE,2020:661-672.
[9]SCHAIK S J,MOOR O D.A memory efficient reachability data structure through bit vector compression[C]//Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data.Athens,Greece:ACM,2011:913-924.
[10]IMANE H,YAHIAOUI S,BENDJOUDI A,et al.Reachability in Big Graphs:A Distributed Indexing and Querying Approach[J].Information Sciences,2021,573:541-561.
[11]WEI H,YU J X,LU C,et al.Reachability querying:An independent permutation labeling approach[J].VLDB,2018,27(1):1-26.
[12]YU J X,CHENG J F.Graph reachability queries:A survey[M]//Managing and Mining Graph Data.Boston,MA:Sprin-ger,2010:181-215.
[13]CHOUDHARY P.A survey on social network analysis forcounterterrorism[J].International Journal of Computer Applications,2015,112(9):24-29.
[14]PENG Y,ZHANG Y,LIN X,et al.Answering Billion-Scale Label-Constrained Reachability Queries within Microsecond[J].Proceedings of the VLDB Endowment,2020,13(6):812-825.
[15]VALSTAR L D J,FLETCHER G H L,YOSHIDA Y.Land-mark indexing for evaluation of label-constrained reachability queries[C]//Proceedings of the 2017 ACM International Conference on Management of Data.Chicago,Illinois,USA:ACM,2017:345-358.
[16]XIAN T,CHEN Z Y,LI K,et al.Efficient computation of the transitive closure size[J].Cluster Computing,2019,22(3):6517-6527.
[17]YANO Y,AKIBA T,IWATA Y,et al.Fast and scalable rea-chability queries on graphs by pruned labeling with landmarks and paths[C]//Proceedings of the 22nd ACM International Conference on Information & Knowledge Management.San Francisco,CA,USA:ACM,2013:1601-1606.
[18]BOSTJAN B,FRANTISEK K,JAN K,et al.Minimum k-pathvertex cover[J].Discrete Applied Mathematics,2011,159(12):1189-1195.
[19]VELOSO R,JUNIOR W M,CERF L P,et al.Reachability queries in very large graphs:A fast refined online search approach[C]//Proceeding of the 17th International Conference on Extending Database Technology.Athens,Greece:EDBT,2014:511-522.
[20]CHENG J,SHANG Z,CHENG H,et al.K-reach:who is in your small world[J].Proceedings of the VLDB Endowment,2012,5(11):1292-1303.
[21]JIN R M,HONG H,WANG H X,et al.Computing label-constraint reachability in graph databases [C]//Proceedings of the 2010 ACM SIGMOD International Conference on Management of data.Indianapolis,Indiana,USA:ACM,2010:123-134.
[22]HASSAN M S,AREF W G,ALY A M.Graph indexing forshortest-path finding over dynamic sub-graphs[C]//Procee-dings of the 2016 International Conference on Management of Data.San Francisco,CA,USA:ACM,2016:1183-1197.
[23]SARISHT W,PRASAD A,RANU S,et al.Efficiently answe-ring regular simple path queries on large labeled network[C]//Proceedings of the 2019 International Conference on Management of Data.2019:1463-1480.
[24]CHEN Y J,SINGH G.Graph Indexing for Efficient Evaluation of Label-constrained Reachability Queries[J].ACM Transactions on Database Systems,2021,46(2):1-50.
[25]BEAMER S,ASANOVIC K,PATTERSON D.Direction-opti-mizing Breadth-first search[J].Scientific Programming,2013,21(3/4):137-148.
[26]STEIER D M,ANDERSON A P.Depth-First Search [M]//Algorithm Synthesis:A Comparative Study.US:Springer,1989.
[27]WANG X,LIN K,DU M,et al.Efficient k-Hop Reachability Queries Processing on Directed Graphs[D].Shanghai:Donghua University,2020.
[1] 程富豪, 徐泰华, 陈建军, 宋晶晶, 杨习贝.
基于顶点粒k步搜索和粗糙集的强连通分量挖掘算法
Strongly Connected Components Mining Algorithm Based on k-step Search of Vertex Granule and Rough Set Theory
计算机科学, 2022, 49(8): 97-107. https://doi.org/10.11896/jsjkx.210700202
[2] 高琳, 段国林, 姚涛.
基于图论的组织互操作性建模与评估研究
Research on Organizational Interoperability Modeling and Evaluation Based on Graph Theory
计算机科学, 2020, 47(6A): 572-576. https://doi.org/10.11896/JsJkx.190900114
[3] 李金泽,徐喜荣,潘子琦,李晓杰.
改进的自适应谱聚类NJW算法
Improved Adaptive Spectral Clustering NJW Algorithm
计算机科学, 2017, 44(Z6): 424-427. https://doi.org/10.11896/j.issn.1002-137X.2017.6A.095
[4] 王振朝,赵云,薛文玲.
下含D2D蜂窝网基于有向加权二部图的资源分配
Resource Allocation for D2D Communications Underlaying Cellular Networks Using Directed Weighted Bipartite
计算机科学, 2017, 44(9): 120-124. https://doi.org/10.11896/j.issn.1002-137X.2017.09.024
[5] 王振朝,赵云,薛文玲.
蜂窝下含D2D系统基于二部超图的资源分配
Resource Allocation for D2D Communication Underlaid Cellular Networks Using Bipartite Hypergraph
计算机科学, 2017, 44(8): 82-85. https://doi.org/10.11896/j.issn.1002-137X.2017.08.015
[6] 李丽萍,赵传荣,孔德仁,王芳.
基于图论的无监督区域遥感图像检索算法研究
Research on Unsupervised Regional Remote Sensing Image Retrieval Algorithm Based on Graph Theory
计算机科学, 2017, 44(7): 315-317. https://doi.org/10.11896/j.issn.1002-137X.2017.07.057
[7] 沈洪,李晓光.
图像显著估计的并行算法研究
Research on Parallel Algorithm of Image Saliency Estimation
计算机科学, 2017, 44(12): 266-273. https://doi.org/10.11896/j.issn.1002-137X.2017.12.048
[8] 罗卫东,王建新,冯启龙.
最大圈分解问题的研究进展
Survey of Cycle Packing Problem
计算机科学, 2017, 44(1): 1-6. https://doi.org/10.11896/j.issn.1002-137X.2017.01.001
[9] 吴从中,李俊.
基于SUSAN边缘信息的阈值分割算法
Image Threshold Segmentation Algorithm Based on SUSAN Edge Information
计算机科学, 2015, 42(Z11): 119-122.
[10] 黄治国,李娜.
基于分辨函数的极大团搜索算法
Discernibility Function-based Algorithm for Finding All Maximal Cliques
计算机科学, 2014, 41(4): 248-251.
[11] 刘雅坤,于双元,罗四维.
基于最小最大割算法的阈值分割算法
Threshold Image Segmentation Based on Min-max Cut Algorithm
计算机科学, 2014, 41(1): 95-99.
[12] 李岳洪,万频,王永华,邓钦,杨健.
改进的细菌觅食算法求解认知无线网络频谱分配问题
Cognitive Radio Spectrum Assignment Based on Binary Bacterial Foraging Optimization Algorithm
计算机科学, 2013, 40(8): 49-52.
[13] 任晓玲,白 雪,刘希玉.
粘贴模型在两类特殊问题中的改进算法研究
Improved Algorithm of Sticker Model in Two Special Problem
计算机科学, 2012, 39(Z11): 252-255.
[14] 张宏烈,张国印,丛万锁,胡海燕.
一种应用图论方法管理可重构资源的策略
One Management Strategy of Reconfigurable Resource Using Graph Theory
计算机科学, 2010, 37(12): 270-274.
[15] .
一种新的基于图论聚类的分割算法

计算机科学, 2008, 35(9): 245-247.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!