Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 230100041-12.doi: 10.11896/jsjkx.230100041

• Big Data & Data Science • Previous Articles     Next Articles

Mining and Application of Frequent Patterns with Counting Quantifiers

SHA Yuji, WANG Xin, HE Yanxiao, ZHONG Xueyan, FANG Yu   

  1. School of Computer Science,Southwest Petroleum University,Chengdu 610500,China
  • Published:2023-11-09
  • About author:SHA Yuji,born in 1993,postgraduate.His main research interests include data mining and machine learning.
    WANG Xin,born in 1981,Ph.D,professor,Ph.D supervisor,is a member ofChina Computer Federation.His main research interests include knowledge discovery in database,artificial intelligence,machine learning and data mining.

Abstract: Frequent pattern mining(FPM) is a classical problem in graph theory,more attention has been paid on FPM on single large graphs,which is defined as discovering all the pattern graphs Q with occurrence frequencies above a user defined threshold,in a single large graph G.In recent years,people have witnessed wide applications of FPM,such as social network analysis and fraud detection.However,emerging applications keep calling for more expressive pattern graphs along with their mining techniques to capture more complex structures in a large graph.In light of this,we incorporate counting quantifiers in pattern graphs and introduce quantified pattern graphs(QGPs) which are able to express richer semantics.We then develop a distributive algorithm to mine QGPs in parallel.Furthermore,we introduce quantified graph pattern association rules(QGPARs) for linking prediction on large graphs.We conduct experimental studies to validate the computational efficiency of the QGPs mining algorithm by using real-world and synthetic graph data.By comparing with prior link prediction methods,we find that prediction with QGPARs achieves even higher accuracy.Finally,by comparing with the link prediction results of traditional graph pattern association rules,we verify that there is a significant difference between QGPARs and GPARs in terms of link prediction results,and further verify the effectiveness of QGPARs in link prediction.

Key words: Quantified pattern graph, Frequent pattern mining, Distribute mining, Quantified graph pattern association rules, Link prediction

CLC Number: 

  • TP311
[1]FAN W F.Graph pattern matching revised for social networkanalysis[C]//Proceedings of the 15th International Conference on Database Theory.Berlin:ACM,2012:8-21.
[2]FAN W F,WANG X,WU Y H,et al.Association rules with graph patterns[J].Proceedings of the VLDB Endowment,2015,8(12):1502-1513.
[3]BAPNA R,UMYAROVA.Do your online friends make youpay? A randomized field experiment on peer influence in online social networks[J].Management Science,2015,61(8):1902-1920.
[4]Nielsen global online consumer survey[OL].https://www.nielsen.com/wp-content/uploads/sites/2/2019/04/pr_global-study_07709.pdf.
[5]LIAQAT M,KHAN S,YOUNIS M S,et al.Applying uncertain frequent pattern mining to improve ranking of retrieved images[J].Applied Intelligence,2019,49(8):2982-3001.
[6]SONG Q,WU Y,LIN P,et al.Mining summaries for knowledge graph search[J].IEEE Transactions on Knowledge and Data Engineering,2018,30(10):1887-1900.
[7]ELSEIDY M,ABDELHAMID E,SKIADOPOULOS S,et al.Grami:Frequent subgraph and pattern mining in a single large graph[J].Proceedings of the VLDB Endowment,2014,7(7):517-528.
[8]TALUKDER N,ZAKI M J.A distributed approach for graphmining in massive networks[J].Data Mining and Knowledge Discovery,2016,30(5):1024-1052.
[9]KAVITHA D,HARITHA D,PADMA Y.Optimized Candidate Generation for Frequent Subgraph Mining in a Single Graph[C]//Proceedings of International Conference on Computational Intelligence and Data Engineering.Springer,Singapore,2021:259-272.
[10]LI L,DING P,CHEN H,et al.Frequent pattern mining in big social graphs[J].IEEE Transactions on Emerging Topics in Computational Intelligence,2021,6(3):638-648.
[11]UR REHMAN S,LIU K,ALI T,et al.A graph mining ap-proach for ranking and discovering the interesting frequent subgraph patterns[J].International Journal of Computational Intelligence Systems,2021,14:1-17.
[12]ASHRAF N,HAQUE R R,ISLAM M,et al.WeFreS:weighted frequent subgraph mining in a single large graph[C]//19th Industrial Conference Advances in Data Mining-Applications and Theoretical Aspects(ICDC).2019:201-215.
[13]LE N T,VO B,NGUYEN L B Q,et al.Mining weighted subgraphs in a single large graph[J].Information Sciences,2020,514:149-165.
[14]RAY A,HOLDER L,CHOUDHURY S.Frequent subgraphdiscovery in large attributed streaming graphs[C]//Proceedings of the 3rd International Workshop on big data,streams and he-terogeneous source mining:algorithms,systems,programming models and applications.New York:JMLR,2014:166-181.
[15]ABDELHAMID E,CANIM M,SADOGHI M,et al.Incremental frequent subgraph mining on large evolving graphs[J].IEEE Transactions on Knowledge and Data Engineering,2017,29(12):2710-2723.
[16]BRY F,FURCHE T,MARNETTE B,et al.SPARQLog:SPARQL with rules and quantification[M].Berlin:Springer,2009:341-370.
[17]AMER-YAHIA S,LAKSHMANAN L,YU C.Socialscope:Enabling information discovery on social content sites[C]//Fourth Biennial Conference on Innovative Data Systems Research.Asilomar,CA:CIDR,2009.
[18]SAN MARTIN M,GUTIERREZ C,WOOD P T.SNQL:A social networks query and transformation language[C]//Procee-dings of the 5th Alberto Mendelzon International Workshop on Foundations of Data Management.Santiago:CEUR,2011.
[19]LiN W Y,ALVAREZ S A,RUIZ C.Collaborative recommendation via adaptive association rule mining[J].Data Mining and Knowledge Discovery,2000,6(1):83-105.
[20]MAHFOUD H.Expressive top-k matching for conditional graph patterns[J].Neural Computing and Applications,2021:1-17.
[21]BLAU H,IMMERMAN N,JENSEN D.A visual language forquerying and updating graphs[J].University of Massachusetts Amherst Computer Science Technical Report,2002,37:2002.
[22]FAN W F,WU Y H,XU J B.Adding counting quantifiers tograph patterns[C]//Proceedings of the 2016 International Conference on Management of Data.San Francisco:ACM,2016:1215-1230.
[23]FAN W F.Graph pattern matching revised for social networkanalysis[C]//Proceedings of the 15th International Conference on Database Theory.Cambridge,MA:IEEE,2012:8-21.
[24]QIAO F,ZHANG X,LI P,et al.A parallel approach for fre-quent subgraph mining in a single large graph using spark[J].Applied Sciences,2018,8(2):230.
[25]SENTHILSELVAN N,SUBRAMANIYASWAMY V,VIJAY-AKUMAR V,Et al.Distributed frequent subgraph mining on evolving graph using SPARK[J].Intelligent Data Analysis,2020,24(3):495-513.
[26]WANG X,XIANG M,ZHAN H,et al.Distributed Top-k Pattern Mining[C]//Asia-Pacific Web(APWeb) and Web-Age Information Management(WAIM) Joint International Conference on Web and Big Data.Cham:Springer,2021:203-220.
[27]HUA G,ZHANG J,CUI L,et al.D-colSimulation:A Distribu-ted Approach for Frequent Graph Pattern Mining based on colSimulation in a Single Large Graph[C]//2020 IEEE International Conference on Services Computing(SCC).IEEE,2020:76-83.
[28]SAHOO A,SENAPATI R.A Novel Approach for DistributedFrequent Pattern Mining Algorithm using Load-Matrix[C]//2021 International Conference on Intelligent Technologies(CONIT).IEEE,2021:1-5.
[29]YAN D,QU W,GUO G,et al.PrefixFPM:a parallel framework for general-purpose frequent pattern mining[C]//2020 IEEE 36th International Conference on Data Engineering(ICDE).IEEE,2020:1938-1941.
[30]FARHAN HUSAIN M,DOSHI P,KHAN L,et al.Storage and retrieval of large rdf graph using hadoop and mapreduce[C]//IEEE International Conference on Cloud Computing.Springer,Berlin,Heidelberg,2009:680-686.
[31]BRINGMANN B,NIJSSEN S.What is frequent in a singlegraph?[C]//Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining.Berlin:Springer,2008:858-863.
[32]WANG X,XU Y,ZHAN H Y.Extending association rules with graph patterns[J].Expert Systems with Applications,2020,141:112897.
[33]RAHIMIAN F,PAYBERAH A H,GIRDZIJAUSKAS S,et al.Ja-be-ja:A distributed algorithm for balanced graph partitioning[C]//2013 IEEE 7th International Conference on Self-Adaptive and Self-Organizing Systems.Philadelphia:IEEE,2013:51-60.
[34]KUMAR A,SINGH S S,SINGH K,et al.Link prediction techniques,applications,and performance:A survey[J].Physica A:Statistical Mechanics and its Applications,2020,553:124289.
[35]ROBERTO J,BAYARDO JR,AGRAWAL R.Mining the most interesting rules[C]//Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mi-ning.San Diego:ACM,1999:145-154.
[36]VO B,BAC L E.Interestingness measures for association rules:combination between lattice and hash tables[J].Expert Systems With Applications,2011,38(9):11630-11640.
[1] TANG Shaosai, SHEN Derong, KOU Yue, NIE Tiezheng. Link Prediction Model on Temporal Knowledge Graph Based on Bidirectionally Aggregating Neighborhoods and Global Aware [J]. Computer Science, 2023, 50(8): 177-183.
[2] JIANG Linpu, CHEN Kejia. Self-supervised Dynamic Graph Representation Learning Approach Based on Contrastive Prediction [J]. Computer Science, 2023, 50(7): 207-212.
[3] LI Shujing, HUANG Zengfeng. Mixed-curve for Link Completion of Multi-relational Heterogeneous Knowledge Graphs [J]. Computer Science, 2023, 50(4): 172-180.
[4] WANG Jingbin, LAI Xiaolian, LIN Xinyu, YANG Xinyi. Context-aware Temporal Knowledge Graph Completion Based on Relation Constraints [J]. Computer Science, 2023, 50(3): 23-33.
[5] WANG Jiaqi, LI Wengen, GUAN Jihong, XING Ting, WEI Xiaomin, SHAO Bingqing, FU Chongjie. Knowledge Enhanced Relationship Prediction Model for Enterprise Entities [J]. Computer Science, 2023, 50(10): 146-155.
[6] SONG Jie, LIANG Mei-yu, XUE Zhe, DU Jun-ping, KOU Fei-fei. Scientific Paper Heterogeneous Graph Node Representation Learning Method Based onUnsupervised Clustering Level [J]. Computer Science, 2022, 49(9): 64-69.
[7] HUANG Li, ZHU Yan, LI Chun-ping. Author’s Academic Behavior Prediction Based on Heterogeneous Network Representation Learning [J]. Computer Science, 2022, 49(9): 76-82.
[8] LI Yong, WU Jing-peng, ZHANG Zhong-ying, ZHANG Qiang. Link Prediction for Node Featureless Networks Based on Faster Attention Mechanism [J]. Computer Science, 2022, 49(4): 43-48.
[9] ZHAO Xue-lei, JI Xin-sheng, LIU Shu-xin, LI Ying-le, LI Hai-tao. Link Prediction Method for Directed Networks Based on Path Connection Strength [J]. Computer Science, 2022, 49(2): 216-222.
[10] WU Cheng-feng, CAI Li, LI Jin, LIANG Yu. Frequent Pattern Mining of Residents’ Travel Based on Multi-source Location Data [J]. Computer Science, 2021, 48(7): 155-163.
[11] HU Xin-tong, SHA Chao-feng, LIU Yan-jun. Post-processing Network Embedding Algorithm with Random Projection and Principal Component Analysis [J]. Computer Science, 2021, 48(5): 124-129.
[12] CHEN Heng, WANG Wei-mei, LI Guan-yu, SHI Yi-ming. Knowledge Graph Completion Model Using Quaternion as Relational Rotation [J]. Computer Science, 2021, 48(5): 225-231.
[13] GONG Zhui-fei, WEI Chuan-jia. Link Prediction of Complex Network Based on Improved AdaBoost Algorithm [J]. Computer Science, 2021, 48(3): 158-162.
[14] LI Xin-chao, LI Pei-feng, ZHU Qiao-ming. Directed Network Representation Method Based on Hierarchical Structure Information [J]. Computer Science, 2021, 48(2): 100-104.
[15] GONG Zhui-fei, WEI Chuan-jia. Complex Network Link Prediction Method Based on Topology Similarity and XGBoost [J]. Computer Science, 2021, 48(12): 226-230.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!