Computer Science, 2025, Vol. 52, Issue (2): 80-90. doi: 10.11896/jsjkx.240200005

• Database & Big Data & Data Science •

Graph Anomaly Detection Model Based on Personalized PageRank and Contrastive Learning

YUAN Ye1, CHEN Ming2, WU Anbiao2, WANG Yishu2   

  1 School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  2 College of Computer Science and Engineering, Northeastern University, Liaoning 110169, China
  • Received: 2024-02-01 Revised: 2024-06-30 Online: 2025-02-15 Published: 2025-02-17
  • About author: YUAN Ye, born in 1981, Ph.D, professor, Ph.D supervisor. His main research interests include big data management and analytics, artificial intelligence based on big data, and distributed big data computing.
  • Supported by:
    National Key R&D Program of China (2022YFB2702100), National Natural Science Foundation of China (61932004, 62225203, U21A20516, 62302084) and China Postdoctoral Science Foundation (2023M730518).

Abstract: Graph anomaly detection aims to detect abnormal nodes in attributed networks, and it is highly valued by researchers because of its practical significance in application fields such as finance, e-commerce, and spammer detection. Traditional non-deep-learning methods can capture only the shallow structure of a graph, so researchers have proposed anomaly detection models based on deep neural networks to address this issue. However, these models do not take into account the centrality differences among nodes in the graph, which can lead to information loss or introduce noise from distant nodes when the local information of a node is captured. In addition, they ignore the feature information in the attribute space, which can provide additional anomaly supervision signals. Therefore, this paper proposes PC-GAD (personalized PageRank and contrastive learning based graph anomaly detection), a novel unsupervised graph anomaly detection framework. Firstly, a dynamic sampling strategy is proposed, which calculates the personalized PageRank vector of each node in the graph to determine the size of that node's subgraph samples, avoiding both the loss of local information and the introduction of noise. Secondly, for each node, anomaly supervision signals are captured from both the topology and the attribute space, and a corresponding contrastive learning objective is designed to comprehensively learn potential abnormal patterns. Finally, after multiple rounds of contrast and prediction, the degree of abnormality of each node is evaluated according to its output anomaly score. To verify the effectiveness of the proposed model, extensive comparative experiments are conducted against benchmark models on six real-world datasets. The experimental results show that PC-GAD can comprehensively identify abnormal nodes in the graph, and that its AUC value increases by 1.42% compared with existing models.
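
The dynamic sampling idea in the abstract can be illustrated with a minimal sketch. It computes a personalized PageRank vector by power iteration and then lets that vector decide how many nodes enter a node's subgraph. The abstract does not state the actual sizing rule used by PC-GAD, so the function names, the restart probability `alpha`, the `threshold` cut-off, and the `max_size` cap below are all illustrative assumptions, not the authors' method.

```python
from typing import Dict, List

Graph = Dict[int, List[int]]  # adjacency list of an undirected graph


def personalized_pagerank(graph: Graph, seed: int, alpha: float = 0.15,
                          iters: int = 100) -> Dict[int, float]:
    """Approximate the personalized PageRank vector of `seed`.

    Plain power iteration; `alpha` is the probability of restarting
    the random walk at the seed (an assumed default).
    """
    scores = {v: 0.0 for v in graph}
    scores[seed] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in graph}
        for v, mass in scores.items():
            nbrs = graph[v]
            if not nbrs:
                # Dangling node: send its walk mass back to the seed.
                nxt[seed] += (1.0 - alpha) * mass
                continue
            share = (1.0 - alpha) * mass / len(nbrs)
            for u in nbrs:
                nxt[u] += share
        nxt[seed] += alpha  # restart mass; total mass stays 1
        scores = nxt
    return scores


def sample_subgraph(graph: Graph, seed: int, threshold: float = 0.05,
                    max_size: int = 8) -> List[int]:
    """Pick a seed-specific subgraph whose size depends on the PPR vector.

    Hypothetical rule: keep every node whose PPR score reaches
    `threshold`, ranked by score, up to `max_size` nodes.
    """
    ppr = personalized_pagerank(graph, seed)
    ranked = sorted((v for v in graph if v != seed),
                    key=lambda v: (-ppr[v], v))
    kept = [v for v in ranked if ppr[v] >= threshold][:max_size]
    return [seed] + kept
```

Under this assumed rule, a node whose random-walk mass is spread over many neighbors with comparable scores receives a larger subgraph than a node whose mass concentrates on a few neighbors, and nodes that the walk cannot reach are never sampled.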

Key words: Graph anomaly detection, Personalized PageRank, Graph neural network, Graph contrastive learning

CLC Number: 

  • TP391