Computer Science ›› 2023, Vol. 50 ›› Issue (9): 192-201. doi: 10.11896/jsjkx.220900133

• Database & Big Data & Data Science •

Super Multi-class Deep Image Clustering Model Based on Contrastive Learning

HU Shen1,3, QIAN Yuhua1,2,3, WANG Jieting1,3, LI Feijiang1,3, LYU Wei1,3   

  1 School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  2 Shanxi University Key Laboratory of Computational Intelligence and Chinese Information Processing, Ministry of Education, Taiyuan 030006, China
  3 Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China
  • Received:2022-08-02 Revised:2022-10-10 Online:2023-09-15 Published:2023-09-01
  • About author: HU Shen, born in 1997, postgraduate, is a student member of China Computer Federation. His main research interests include self-supervised learning and super multi-class image clustering.
    QIAN Yuhua, born in 1976, Ph.D, professor, is a member of China Computer Federation. His main research interests include artificial intelligence, big data, machine learning and data mining.
  • Supported by:
    Key Program of the National Natural Science Foundation of China (62136005), National Key Research and Development Program of China (2021ZD0112400), Young Scientists Fund of the National Natural Science Foundation of China (62106132), Program for the San Jin Young Scholars of Shanxi and Shanxi Provincial Research Foundation for Basic Research, China (20210302124271, 202103021223026).

Abstract: Image clustering reduces the dimensionality of image data, extracts effective features through representation learning, and performs cluster analysis. When the image data contain many categories, the complexity of the data distribution and the density of the clusters seriously limit the practicability of existing methods. To address this, this paper proposes a super multi-class deep image clustering model based on contrastive learning, which consists of three stages: first, an improved contrastive learning method trains the feature model so that the clusters are uniformly distributed; second, the semantic nearest-neighbor information of each instance is mined based on the principle of semantic similarity; finally, each instance and its nearest neighbors are used as self-supervised information to train the clustering model. Two types of experiments are designed: ablation experiments and comparative experiments. The ablation experiments show that the proposed method distributes the clusters evenly in the mapping space and mines semantic nearest-neighbor information reliably. In the comparative experiments, the method is compared with advanced algorithms on 7 benchmark datasets. On the ImageNet-200 class dataset, its accuracy is 10.6% higher than that of the advanced method, and on the ImageNet-1000 class dataset its accuracy is 9.2% higher.
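
The second stage described in the abstract, mining the semantic nearest neighbors of each instance, can be illustrated with a minimal sketch. It assumes that semantic similarity is measured as cosine similarity between feature-model embeddings; the abstract does not state the exact measure, so this is an assumption rather than the authors' implementation. The names `l2_normalize`, `mine_nearest_neighbors`, and `toy_embeddings` are hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def mine_nearest_neighbors(embeddings, k):
    """Return, for each instance, the indices of its k most cosine-similar instances.

    Assumption: cosine similarity stands in for the paper's semantic similarity.
    The instance itself is excluded from its own neighbor list.
    """
    if not 0 < k < len(embeddings):
        raise ValueError("k must satisfy 0 < k < number of instances")
    unit = [l2_normalize(v) for v in embeddings]
    neighbors = []
    for i, anchor in enumerate(unit):
        scored = [
            (sum(a * b for a, b in zip(anchor, other)), j)
            for j, other in enumerate(unit)
            if j != i
        ]
        # Highest similarity first; ties are broken by the lower index.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        neighbors.append([j for _, j in scored[:k]])
    return neighbors


# Toy embeddings (hypothetical): two tight groups in a 2-D feature space.
toy_embeddings = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
]
toy_neighbors = mine_nearest_neighbors(toy_embeddings, k=1)
```

In a pipeline of the kind the abstract outlines, each instance paired with its mined neighbors would then serve as the self-supervised signal for training the clustering model in the third stage. The brute-force pairwise comparison here is quadratic in the number of instances and is meant only to show the idea on small inputs.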

Key words: Super multi-class clustering, Contrastive learning, Feature model, Semantic similarity, Image clustering

CLC Number: TP391