Computer Science ›› 2025, Vol. 52 ›› Issue (8): 100-108. doi: 10.11896/jsjkx.240700112

• Database & Big Data & Data Science •

Deep Graph Contrastive Clustering Algorithm Based on Dynamic Threshold Pseudo-label Selection

WANG Pei, YANG Xihong, GUAN Renxiang, ZHU En

  1. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
  • Received: 2024-07-17 Revised: 2024-10-25 Online: 2025-08-15 Published: 2025-08-08
  • Corresponding author: ZHU En (enzhu@nudt.edu.cn)
  • About author: WANG Pei (wangpei@nudt.edu.cn), born in 2001, postgraduate. His main research interest is self-supervised graph representation learning.
    ZHU En, born in 1976, professor, Ph.D. supervisor, is a senior member of CCF (No.16689D). His main research interests include clustering, anomaly detection, computer vision, medical image analysis, etc.
  • Supported by:
    National Science and Technology Innovation 2030 Major Project (2022ZD0209103).

Abstract: In recent years, graph neural networks (GNNs) have performed well on complex structured data and have been widely applied to node classification, graph classification, link prediction, and other tasks. Deep graph clustering combines the strong representation ability of GNNs with the objective of clustering algorithms to discover hidden cluster structures in complex graph-structured data. However, existing pseudo-label-based graph clustering algorithms often use a fixed threshold during model optimization, filtering samples by category to obtain high-confidence samples that guide optimization. A fixed threshold can lead to class imbalance, which in turn degrades clustering performance. To address this problem, this paper proposes a deep graph contrastive clustering algorithm based on dynamic-threshold pseudo-labels. Specifically, two multilayer perceptrons (MLPs) that do not share parameters are used to capture the latent structural features of the graph data, and the K-Means algorithm is used to obtain the clustering results. On this basis, trust strength is introduced to dynamically adjust the threshold for obtaining pseudo-labels, so that the number of high-confidence samples in each category is adjusted dynamically during training, alleviating class imbalance. In addition, the contrastive learning strategy is optimized and the construction of sample pairs is improved, which enhances the discriminative ability of the model. Experimental results show that the proposed method performs well on six benchmark datasets and surpasses existing methods on multiple evaluation metrics, demonstrating its effectiveness.

Key words: Deep graph clustering, Pseudo-label, Graph contrastive clustering, Graph neural network, Dynamic threshold
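
The abstract contrasts a fixed pseudo-label threshold with a per-category dynamic one but does not state the actual update rule. The following is only a minimal sketch of the general idea, under an assumed rule: the base threshold `tau` is scaled for each category by that category's mean confidence relative to the most confident category. The function names, the scaling rule, and the toy confidences are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def select_fixed(conf, tau):
    """Keep every sample whose confidence reaches one global threshold."""
    return [i for i, c in enumerate(conf) if c >= tau]


def class_thresholds(conf, labels, tau):
    """Assumed rule (not from the paper): scale tau per category by the
    category's mean confidence relative to the most confident category,
    so that harder categories get a lower bar."""
    sums, counts = defaultdict(float), defaultdict(int)
    for c, y in zip(conf, labels):
        sums[y] += c
        counts[y] += 1
    means = {y: sums[y] / counts[y] for y in counts}
    best = max(means.values())
    return {y: tau * m / best for y, m in means.items()}


def select_dynamic(conf, labels, tau):
    """Keep samples that reach the threshold of their own pseudo-label."""
    thr = class_thresholds(conf, labels, tau)
    return [i for i, (c, y) in enumerate(zip(conf, labels)) if c >= thr[y]]


# Toy data: category 0 is assigned confidently, category 1 is not.
conf = [0.95, 0.92, 0.97, 0.60, 0.55, 0.62]
labels = [0, 0, 0, 1, 1, 1]

fixed = select_fixed(conf, tau=0.9)               # only category 0 survives
dynamic = select_dynamic(conf, labels, tau=0.9)   # both categories represented
print(fixed, dynamic)
```

With these toy values the fixed threshold selects samples from category 0 only, while the per-category threshold also admits the more confident samples of category 1. That is the kind of imbalance the abstract says a dynamic threshold is meant to alleviate.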

CLC Number:

  • TP391