计算机科学 ›› 2023, Vol. 50 ›› Issue (11A): 230100041-12.doi: 10.11896/jsjkx.230100041
沙雨济, 王欣, 何艳潇, 钟学燕, 方宇
SHA Yuji, WANG Xin, HE Yanxiao, ZHONG Xueyan, FANG Yu
摘要: 频繁模式挖掘(FPM)是图数据研究领域的一个经典问题,单一大图上的FPM问题近年来受到了更加广泛的关注。该问题被定义为根据用户给定的频率阈值查找在大图(Graph)中频繁出现的所有模式图(Pattern)。近年来,人们见证了FPM在多个领域的广泛应用,例如社交网络分析、欺诈检测等。然而,面对新兴的应用需求,人们需要更具语义表达力的模式图及其挖掘技术。为此,在传统模式图的基础上,首先提出了量化模式图(Quantified Graph Patterns,QGPs)——一类具有计数量词约束的模式图,实现了模式图语义的扩展;其次设计了一种在分布式场景下挖掘QGPs的算法,提出了量化图模式关联规则(Quantified Graph Pattern Association Rules,QGPARs)及其挖掘技术,用于预测(社交)网络中实体之间的潜在联系,然后利用真实图和合成图数据,通过翔实的实验验证了QGPs挖掘算法的计算效率,通过与经典链接预测方法进行对比,发现QGPARs可以取得更高的链接预测准确性;最后通过与传统图模式关联规则(Graph Pattern Association Rules,GPARs)的链接预测结果进行对比,验证了QGPARs与GPARs之间在链接预测结果方面存在显著差异,也进一步验证了QGPARs在链接预测中的有效性。
中图分类号:
[1]FAN W F.Graph pattern matching revised for social networkanalysis[C]//Proceedings of the 15th International Conference on Database Theory.Berlin:ACM,2012:8-21. [2]FAN W F,WANG X,WU Y H,et al.Association rules with graph patterns[J].Proceedings of the VLDB Endowment,2015,8(12):1502-1513. [3]BAPNA R,UMYAROVA.Do your online friends make youpay? A randomized field experiment on peer influence in online social networks[J].Management Science,2015,61(8):1902-1920. [4]Nielsen global online consumer survey[OL].https://www.nielsen.com/wp-content/uploads/sites/2/2019/04/pr_global-study_07709.pdf. [5]LIAQAT M,KHAN S,YOUNIS M S,et al.Applying uncertain frequent pattern mining to improve ranking of retrieved images[J].Applied Intelligence,2019,49(8):2982-3001. [6]SONG Q,WU Y,LIN P,et al.Mining summaries for knowledge graph search[J].IEEE Transactions on Knowledge and Data Engineering,2018,30(10):1887-1900. [7]ELSEIDY M,ABDELHAMID E,SKIADOPOULOS S,et al.Grami:Frequent subgraph and pattern mining in a single large graph[J].Proceedings of the VLDB Endowment,2014,7(7):517-528. [8]TALUKDER N,ZAKI M J.A distributed approach for graphmining in massive networks[J].Data Mining and Knowledge Discovery,2016,30(5):1024-1052. [9]KAVITHA D,HARITHA D,PADMA Y.Optimized Candidate Generation for Frequent Subgraph Mining in a Single Graph[C]//Proceedings of International Conference on Computational Intelligence and Data Engineering.Springer,Singapore,2021:259-272. [10]LI L,DING P,CHEN H,et al.Frequent pattern mining in big social graphs[J].IEEE Transactions on Emerging Topics in Computational Intelligence,2021,6(3):638-648. [11]UR REHMAN S,LIU K,ALI T,et al.A graph mining ap-proach for ranking and discovering the interesting frequent subgraph patterns[J].International Journal of Computational Intelligence Systems,2021,14:1-17. [12]ASHRAF N,HAQUE R R,ISLAM M,et al.WeFreS:weighted frequent subgraph mining in a single large graph[C]//19th Industrial Conference Advances in Data Mining-Applications and Theoretical Aspects(ICDC).2019:201-215. [13]LE N T,VO B,NGUYEN L B Q,et al.Mining weighted subgraphs in a single large graph[J].Information Sciences,2020,514:149-165. [14]RAY A,HOLDER L,CHOUDHURY S.Frequent subgraphdiscovery in large attributed streaming graphs[C]//Proceedings of the 3rd International Workshop on big data,streams and he-terogeneous source mining:algorithms,systems,programming models and applications.New York:JMLR,2014:166-181. [15]ABDELHAMID E,CANIM M,SADOGHI M,et al.Incremental frequent subgraph mining on large evolving graphs[J].IEEE Transactions on Knowledge and Data Engineering,2017,29(12):2710-2723. [16]BRY F,FURCHE T,MARNETTE B,et al.SPARQLog:SPARQL with rules and quantification[M].Berlin:Springer,2009:341-370. [17]AMER-YAHIA S,LAKSHMANAN L,YU C.Socialscope:Enabling information discovery on social content sites[C]//Fourth Biennial Conference on Innovative Data Systems Research.Asilomar,CA:CIDR,2009. [18]SAN MARTIN M,GUTIERREZ C,WOOD P T.SNQL:A social networks query and transformation language[C]//Procee-dings of the 5th Alberto Mendelzon International Workshop on Foundations of Data Management.Santiago:CEUR,2011. [19]LiN W Y,ALVAREZ S A,RUIZ C.Collaborative recommendation via adaptive association rule mining[J].Data Mining and Knowledge Discovery,2000,6(1):83-105. [20]MAHFOUD H.Expressive top-k matching for conditional graph patterns[J].Neural Computing and Applications,2021:1-17. [21]BLAU H,IMMERMAN N,JENSEN D.A visual language forquerying and updating graphs[J].University of Massachusetts Amherst Computer Science Technical Report,2002,37:2002. [22]FAN W F,WU Y H,XU J B.Adding counting quantifiers tograph patterns[C]//Proceedings of the 2016 International Conference on Management of Data.San Francisco:ACM,2016:1215-1230. [23]FAN W F.Graph pattern matching revised for social networkanalysis[C]//Proceedings of the 15th International Conference on Database Theory.Cambridge,MA:IEEE,2012:8-21. [24]QIAO F,ZHANG X,LI P,et al.A parallel approach for fre-quent subgraph mining in a single large graph using spark[J].Applied Sciences,2018,8(2):230. [25]SENTHILSELVAN N,SUBRAMANIYASWAMY V,VIJAY-AKUMAR V,Et al.Distributed frequent subgraph mining on evolving graph using SPARK[J].Intelligent Data Analysis,2020,24(3):495-513. [26]WANG X,XIANG M,ZHAN H,et al.Distributed Top-k Pattern Mining[C]//Asia-Pacific Web(APWeb) and Web-Age Information Management(WAIM) Joint International Conference on Web and Big Data.Cham:Springer,2021:203-220. [27]HUA G,ZHANG J,CUI L,et al.D-colSimulation:A Distribu-ted Approach for Frequent Graph Pattern Mining based on colSimulation in a Single Large Graph[C]//2020 IEEE International Conference on Services Computing(SCC).IEEE,2020:76-83. [28]SAHOO A,SENAPATI R.A Novel Approach for DistributedFrequent Pattern Mining Algorithm using Load-Matrix[C]//2021 International Conference on Intelligent Technologies(CONIT).IEEE,2021:1-5. [29]YAN D,QU W,GUO G,et al.PrefixFPM:a parallel framework for general-purpose frequent pattern mining[C]//2020 IEEE 36th International Conference on Data Engineering(ICDE).IEEE,2020:1938-1941. [30]FARHAN HUSAIN M,DOSHI P,KHAN L,et al.Storage and retrieval of large rdf graph using hadoop and mapreduce[C]//IEEE International Conference on Cloud Computing.Springer,Berlin,Heidelberg,2009:680-686. [31]BRINGMANN B,NIJSSEN S.What is frequent in a singlegraph?[C]//Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining.Berlin:Springer,2008:858-863. [32]WANG X,XU Y,ZHAN H Y.Extending association rules with graph patterns[J].Expert Systems with Applications,2020,141:112897. [33]RAHIMIAN F,PAYBERAH A H,GIRDZIJAUSKAS S,et al.Ja-be-ja:A distributed algorithm for balanced graph partitioning[C]//2013 IEEE 7th International Conference on Self-Adaptive and Self-Organizing Systems.Philadelphia:IEEE,2013:51-60. [34]KUMAR A,SINGH S S,SINGH K,et al.Link prediction techniques,applications,and performance:A survey[J].Physica A:Statistical Mechanics and its Applications,2020,553:124289. [35]ROBERTO J,BAYARDO JR,AGRAWAL R.Mining the most interesting rules[C]//Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mi-ning.San Diego:ACM,1999:145-154. [36]VO B,BAC L E.Interestingness measures for association rules:combination between lattice and hash tables[J].Expert Systems With Applications,2011,38(9):11630-11640. |
|