计算机科学 ›› 2025, Vol. 52 ›› Issue (4): 169-176.doi: 10.11896/jsjkx.240200105
杨迎, 周军锋, 杜明
YANG Ying, ZHOU Junfeng, DU Ming
摘要: 最短环计数是图分析的一种基本模式。经过某个顶点的最短环计数指经过该顶点且长度最短的环的数目。在现实生活中,最短环计数应用十分广泛,如欺诈交易检测、罪犯预筛选以及文件共享优化等。针对现有方法索引空间较大、查询效率较低等问题,研究如何在原始图上构建最短环计数索引,提出了一种针对最短环计数且无需进行图转换操作的STC索引(Trough Shortest Cycle Counting Index)。该索引根据最短环的特征对其进行分类,针对不同类型的最短环分别构建不同的索引信息,能够直接基于原始图构造索引,并且在保证索引规模不扩大、索引构造时间不增加的前提下,进一步提升查询效率。此外,根据环与强连通分量的特殊关系,提出了基于强连通分量的索引策略,通过在强连通分量内部构造最短环计数索引,可以进一步提升索引构造效率,有效减小索引规模,提升查询效率。在10个真实数据集上进行了实验。实验结果验证了所提出的STC索引的高效性,以及基于强连通分量的策略可以有效减小索引空间,提升索引构造以及查询效率。
中图分类号:
[1]FENG Q,PENG Y,ZHANG W,et al.Towards Real-TimeCounting Shortest Cycles on Dynamic Graphs:A Hub Labeling Approach[C]//2022 IEEE 38th International Conference on Data Engineering(ICDE).IEEE,2022:512-524. [2]PENG Y,ZHANG Y,LIN X,et al.Answering billion-scale label-constrained reachability queries within microsecond[J].Proceedings of the VLDB Endowment,2020,13(6):812-825. [3]LAI Z,PENG Y,YANG S,et al.PEFP:Efficient k-hop Constrained s-t Simple Path Enumeration on FPGA[C]//2021 IEEE 37th International Conference on Data Engineering(ICDE).IEEE,2021:1320-1331. [4]PENG Y,LIN X,ZHANG Y,et al.Efficient Hop-constrained s-t Simple Path Enumeration[J].The VLDB Journal,2021,30(5):799-823. [5]PENG Y,ZHAO W,ZHANG W,et al.DLQ:A System for Label-Constrained Reachability Queries on Dynamic Graphs[C]//Proceedings of the 30th ACM International Conference on Information & Knowledge Management.2021:4764-4768. [6]LI C,KYNG R,YANG P.Maximilian Probst Gutenberg:Al-most-Linear Time Algorithms for Incremental Graphs:Cycle Detection,SCCs,s-t Shortest Path,and Minimum-Cost Flow[J].arXiv:2311.18295,2023. [7]QIU X,CEN W,QIAN Z,et al.Real-time constrained cycle detection in large dynamic graphs[J].Proceedings of the VLDB Endowment,2018,11(12):1876-1888. [8]PENG Y,ZHANG Y,LIN X,et al.Towards bridging theory and practice:hop-constrained s-t simple path enumeration[J].Proceedings of the VLDB Endowment,2019,13(4):463-476. [9]BODAGHI A,TEIMOURPOUR B.Automobile Insurance FraudDetection Using Social Network Analysis[M]//Applications of Data Management and Analysis,Lecture Notes in Social Networks.2018:11-16. [10]HAJDU L,KRÉSZ M.Temporal Network Analytics for Fraud Detection in the Banking Sector[C]//International Conference on Theory and Practice of Digital Libraries.2020:145-157. [11]YUE D,WU X,WANG Y,et al.A Review of Data Mining-Based Financial Fraud Detection Research[C]//2007 International Conference on Wireless Communications,Networking and Mobile Computing.IEEE,2007:5519-5522. [12]SCHLOSSER M,CONDIE T,KAMVAR S.Simulating a file-sharing p2p network[C]//1st Workshop on Semantics in Grid and P2P Networks.Stanford InfoLab,2003. [13]OHTA C,GE Z,GUO Y,et al.Index-server optimization for P2P file sharing in mobile ad hoc networks[J].Ite Technical Report,2004,28(493):960-966. [14]ZHANG Y,YU J X.Hub Labeling for Shortest Path Counting[C]//Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data.2020:1813-1828. [15]WANG Y Q,YUAN L,CHEN Z,et al.Towards Efficient Shortest Path Counting on Billion-Scale Graphs[C]//2023 IEEE 39th International Conference on Data Engineering(ICDE).IEEE,2023:2579-2592. [16]PENG Y,YU J,WANG S B.PSPC:Efficient Parallel ShortestPath Counting on Large-Scale Graphs[C]//2023 IEEE 39th International Conference on Data Engineering(ICDE).IEEE,2023:896-908. [17]FENG Q S,PENG Y,ZHANG W J,et al.DSPC:Efficiently Answering Shortest Path Counting on Dynamic Graphs[J].arXiv:2307.05918,2023. |
|