Computer Science ›› 2023, Vol. 50 ›› Issue (9): 123-129. doi: 10.11896/jsjkx.220700288

• Database & Big Data & Data Science •

Contrastive Clustering with Consistent Structural Relations

XU Jie, WANG Lisong

  1. College of Computer Science and Technology/College of Artificial Intelligence/College of Software, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  • Received: 2022-07-29 Revised: 2022-12-05 Online: 2023-09-15 Published: 2023-09-01
  • Corresponding author: WANG Lisong (wangls@nuaa.edu.cn)
  • About the authors: XU Jie (xujie85@nuaa.edu.cn), born in 1998, master. Her main research interests are image clustering and retrieval.
    WANG Lisong, born in 1969, Ph.D., professor, is a member of the China Computer Federation. His main research interests include natural language processing and formal methods.
  • Supported by:
    Key Projects of Foundation Strengthening Plan (2019JCJQZD33800).

Abstract: As a fundamental unsupervised learning task, clustering aims to partition unlabeled, mixed image data into semantically similar classes. Some recent methods introduce data augmentation and use contrastive learning to learn feature representations and cluster assignments. Because these methods focus on the model's ability to discriminate between different semantic classes, the feature embeddings of samples from the same semantic class may be pushed apart. To address this problem, a contrastive clustering method with consistent structural relations (CCR) is proposed. It performs contrastive learning at the instance level and the cluster level and adds a consistency constraint at the relation level, so that the model learns more "positive data pair" information from structural relations and the impact of separated cluster embeddings is reduced. Experimental results show that CCR outperforms recent unsupervised clustering methods on image benchmark datasets. Its average accuracy on the CIFAR-10 and STL-10 datasets is 1.7% higher than that of the best method under the same experimental settings, and 1.9% higher on the CIFAR-100 dataset.

Key words: Unsupervised learning, Clustering, Contrastive learning, Data augmentation, Over-clustering
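
The abstract combines contrastive learning over augmented views with a relation-level consistency constraint. The following is a minimal sketch of how such terms could look, not the authors' implementation. The instance-level term uses the standard NT-Xent form. The relation-level term, written here as the mean squared difference between the two views' pairwise similarity matrices, is an assumption made for illustration, as are all function names, the temperature, and the weight. Embeddings are assumed to be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(view_a, view_b, temperature=0.5):
    """Instance-level contrastive loss in the standard NT-Xent form.

    view_a[i] and view_b[i] are embeddings of two augmentations of sample i
    and form the positive pair; every other embedding acts as a negative.
    """
    z = view_a + view_b  # 2N embeddings; index i pairs with index i + N
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)
        logits = [cosine(z[i], z[j]) / temperature
                  for j in range(2 * n) if j != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        total += math.log(sum(math.exp(x) for x in logits)) - pos_logit
    return total / (2 * n)


def relation_matrix(view):
    """Pairwise cosine similarities among the samples of one view."""
    return [[cosine(u, v) for v in view] for u in view]


def relation_consistency(view_a, view_b):
    """Hypothetical relation-level term: the two views should induce the
    same sample-to-sample similarity structure (mean squared difference)."""
    ra, rb = relation_matrix(view_a), relation_matrix(view_b)
    n = len(view_a)
    return sum((ra[i][j] - rb[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)


def combined_loss(view_a, view_b, weight=1.0):
    """Instance-level loss plus the weighted relation term (weight is assumed)."""
    return nt_xent(view_a, view_b) + weight * relation_consistency(view_a, view_b)
```

A cluster-level term could reuse `nt_xent` on the columns of the two views' soft cluster-assignment matrices, as in earlier contrastive clustering work; whether CCR does exactly this is not stated in the abstract.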

CLC Number: 

  • TP183