Prompt-conditioned Representation Learning with Diffusion Models for Semi-supervised Clustering

doi:10.11896/jsjkx.250600063

Abstract

Current clustering methods improve performance by jointly learning cluster-friendly representation spaces and cluster assignments. However, they remain fundamentally constrained by static embedding spaces, primarily derived from pre-trained visual encoders, in which cluster assignments rely on rigid metrics (e.g., Euclidean distance or cosine similarity) within a fixed feature space. Inspired by the stable training dynamics and conditional control capabilities of diffusion models, this paper proposes a novel semi-supervised clustering framework. Methodologically, it encodes cluster centers as learnable conditional embedding vectors and constructs a generative metric function driven by noise-prediction error, moving beyond the linear separability constraints of traditional Euclidean metrics. A two-stage dynamic optimization strategy is designed, integrating supervised pre-training with semantic anchoring and unsupervised adaptation with matching losses to balance intra-cluster compactness and inter-class separability. Theoretically, based on Rademacher complexity and bounded noise-prediction assumptions, it derives an expected risk upper bound of $\mathcal{O}(k/n)$, proving the asymptotic consistency of the proposed method on large-scale data and guaranteeing its generalization capability. Furthermore, it demonstrates that supervised information, through strong convexity constraints and Lipschitz continuity of the denoising network, accelerates the decay rate of the dominant error term to $\mathcal{O}(1/\sqrt{n m_c})$, elucidating the compression effect of labeled data on hypothesis space complexity. Experimentally, the proposed framework achieves competitive results on benchmark datasets such as ImageNet-10, supported by ablation studies validating the efficacy of key components.

Key words: Semi-supervised learning, Prompt representation learning, Diffusion models, Generative metric methods, Clustering risk

CLC Number: TP181

WANG Yiming, JIAO Min, ZHAO Suyun, CHEN Hong, LI Cuiping. Prompt-conditioned Representation Learning with Diffusion Models for Semi-supervised Clustering[J]. Computer Science, 2026, 53(3): 158-165.
