计算机科学 ›› 2025, Vol. 52 ›› Issue (4): 169-176.doi: 10.11896/jsjkx.240200105

• 数据库&大数据&数据科学 • 上一篇    下一篇

基于强连通分量的最短环计数索引

杨迎, 周军锋, 杜明   

  1. 东华大学计算机科学与技术学院 上海 201600
  • 收稿日期:2024-02-27 修回日期:2024-06-26 出版日期:2025-04-15 发布日期:2025-04-14
  • 通讯作者: 周军锋(zhoujf@dhu.edu.cn)
  • 作者简介:(2212623@mail.dhu.edu.cn)
  • 基金资助:
    国家自然科学基金(62372101,61873337,62272097)

Index for Counting the Shortest Cycle Based on Strongly Connected Components

YANG Ying, ZHOU Junfeng, DU Ming   

  1. School of Computer Science and Technology,Donghua University,Shanghai 201600,China
  • Received:2024-02-27 Revised:2024-06-26 Online:2025-04-15 Published:2025-04-14
  • About author:YANG Ying,born in 1999,postgra-duate.Her main research interests include reachability and shortest paths and so on.
    ZHOU Junfeng,born in 1977,Ph.D,professor,Ph.D supervisor,is a member of CCF(No.16083M).His main research interests include information retrieval technology,semi-structured data,query processing and optimization of graphic data.
  • Supported by:
    National Natural Science Foundation of China(62372101,61873337,62272097).

摘要: 最短环计数是图分析的一种基本模式。经过某个顶点的最短环计数指经过该顶点且长度最短的环的数目。在现实生活中,最短环计数应用十分广泛,如欺诈交易检测、罪犯预筛选以及文件共享优化等。针对现有方法索引空间较大、查询效率较低等问题,研究如何在原始图上构建最短环计数索引,提出了一种针对最短环计数且无需进行图转换操作的STC索引(Trough Shortest Cycle Counting Index)。该索引根据最短环的特征对其进行分类,针对不同类型的最短环分别构建不同的索引信息,能够直接基于原始图构造索引,并且在保证索引规模不扩大、索引构造时间不增加的前提下,进一步提升查询效率。此外,根据环与强连通分量的特殊关系,提出了基于强连通分量的索引策略,通过在强连通分量内部构造最短环计数索引,可以进一步提升索引构造效率,有效减小索引规模,提升查询效率。在10个真实数据集上进行了实验。实验结果验证了所提出的STC索引的高效性,以及基于强连通分量的策略可以有效减小索引空间,提升索引构造以及查询效率。

关键词: 图分析, 最短环, 最短环计数, 2-hop索引, 强连通分量

Abstract: The shortest cycle counting is a basic pattern of graph analysis.The shortest cycle counting through a certain vertex refers to the number of shortest cycles passing through the vertex.In real life,the shortest cycle counting has a wide range of applications,such as fraudulent transaction detection,criminal pre-screening,and file sharing optimization.In response to the problems of large indexing space and low query efficiency of existing methods,this paper studies the construction of shortest cycle counting indexing on the original graph,and proposes a trough shortest cycle counting index(STC index) that does not require graph transformation operations.The index classifies the shortest cycles according to their characteristics,constructs different indexing information for different types of shortest cycles,and can directly construct indexes based on the original graph,and further improve query efficiency while ensuring that the indexing size does not expand and the indexing construction time does not increase.In addition,based on the special relationship between cycles and strongly connected components,this paper proposes an indexing strategy based on strongly connected components.By constructing shortest cycle counting indexing within strongly connected components,it can further improve the efficiency of indexing construction,effectively reduce the index size,and improve query efficiency.Experiments on 10 real datasetsverify the efficiency of the proposed STC index and the strategy based on strongly connected components,which can effectively reduce the indexing space and improve the efficiency of indexing construction and querying.

Key words: Graph analysis, Shortest cycle, Shortest cycle counting, 2-hop index, Strongly connected components

中图分类号: 

  • TP301
[1]FENG Q,PENG Y,ZHANG W,et al.Towards Real-TimeCounting Shortest Cycles on Dynamic Graphs:A Hub Labeling Approach[C]//2022 IEEE 38th International Conference on Data Engineering(ICDE).IEEE,2022:512-524.
[2]PENG Y,ZHANG Y,LIN X,et al.Answering billion-scale label-constrained reachability queries within microsecond[J].Proceedings of the VLDB Endowment,2020,13(6):812-825.
[3]LAI Z,PENG Y,YANG S,et al.PEFP:Efficient k-hop Constrained s-t Simple Path Enumeration on FPGA[C]//2021 IEEE 37th International Conference on Data Engineering(ICDE).IEEE,2021:1320-1331.
[4]PENG Y,LIN X,ZHANG Y,et al.Efficient Hop-constrained s-t Simple Path Enumeration[J].The VLDB Journal,2021,30(5):799-823.
[5]PENG Y,ZHAO W,ZHANG W,et al.DLQ:A System for Label-Constrained Reachability Queries on Dynamic Graphs[C]//Proceedings of the 30th ACM International Conference on Information & Knowledge Management.2021:4764-4768.
[6]LI C,KYNG R,YANG P.Maximilian Probst Gutenberg:Al-most-Linear Time Algorithms for Incremental Graphs:Cycle Detection,SCCs,s-t Shortest Path,and Minimum-Cost Flow[J].arXiv:2311.18295,2023.
[7]QIU X,CEN W,QIAN Z,et al.Real-time constrained cycle detection in large dynamic graphs[J].Proceedings of the VLDB Endowment,2018,11(12):1876-1888.
[8]PENG Y,ZHANG Y,LIN X,et al.Towards bridging theory and practice:hop-constrained s-t simple path enumeration[J].Proceedings of the VLDB Endowment,2019,13(4):463-476.
[9]BODAGHI A,TEIMOURPOUR B.Automobile Insurance FraudDetection Using Social Network Analysis[M]//Applications of Data Management and Analysis,Lecture Notes in Social Networks.2018:11-16.
[10]HAJDU L,KRÉSZ M.Temporal Network Analytics for Fraud Detection in the Banking Sector[C]//International Conference on Theory and Practice of Digital Libraries.2020:145-157.
[11]YUE D,WU X,WANG Y,et al.A Review of Data Mining-Based Financial Fraud Detection Research[C]//2007 International Conference on Wireless Communications,Networking and Mobile Computing.IEEE,2007:5519-5522.
[12]SCHLOSSER M,CONDIE T,KAMVAR S.Simulating a file-sharing p2p network[C]//1st Workshop on Semantics in Grid and P2P Networks.Stanford InfoLab,2003.
[13]OHTA C,GE Z,GUO Y,et al.Index-server optimization for P2P file sharing in mobile ad hoc networks[J].Ite Technical Report,2004,28(493):960-966.
[14]ZHANG Y,YU J X.Hub Labeling for Shortest Path Counting[C]//Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data.2020:1813-1828.
[15]WANG Y Q,YUAN L,CHEN Z,et al.Towards Efficient Shortest Path Counting on Billion-Scale Graphs[C]//2023 IEEE 39th International Conference on Data Engineering(ICDE).IEEE,2023:2579-2592.
[16]PENG Y,YU J,WANG S B.PSPC:Efficient Parallel ShortestPath Counting on Large-Scale Graphs[C]//2023 IEEE 39th International Conference on Data Engineering(ICDE).IEEE,2023:896-908.
[17]FENG Q S,PENG Y,ZHANG W J,et al.DSPC:Efficiently Answering Shortest Path Counting on Dynamic Graphs[J].arXiv:2307.05918,2023.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!