计算机科学 ›› 2024, Vol. 51 ›› Issue (6): 153-160.doi: 10.11896/jsjkx.230800200
何一帆1, 何玉林2, 崔来中1,2, 黄哲学1,2
HE Yifan1, HE Yulin2, CUI Laizhong1,2, HUANG Zhexue1,2
摘要: 高维数据的子空间聚类是无监督学习领域的热点研究问题,其难点在于寻找恰当的子空间以及其中的数据簇。大多数现有的子空间聚类算法均存在计算复杂度高和参数选择难的缺陷,这是因为在高维数据中子空间的组合数量很大,算法的执行时间非常长,且不同数据集和应用场景需要不同的参数设定。为此,提出了基于子空间的I-nice(简记为sub-I-nice)聚类算法用于识别高维数据中子空间内数据簇的个数。首先,该算法将原始数据维度随机划分成多个维度组,根据维度组生成子空间样本;接着,使用最新的I-niceMO算法对每个子空间数据进行聚类;最后,采用新设计的球模型对所有子空间的基聚类结果进行集成。在含有噪声的高维仿真数据集上对所提出的sub-I-nice算法进行了详细的性能验证,实验结果表明sub-I-nice算法相比其他3种代表性聚类算法有更好的准确性和鲁棒性,从而证实了其合理性和有效性。
中图分类号:
[1]PARSONS L,HAQUE E,LIU H.Subspace clustering for high dimensional data:a review[J].ACM SIGKDD Explorations Newsletter,2004,6(1):90-105. [2]XU R,WUNSCH D.Survey of clustering algorithms[J].IEEE Transactions on Neural Networks,2005,16(3):645-678. [3]ZHU Y,TING K M,CARMAN M J.Grouping points by shared subspaces for effective subspace clustering[J].Pattern Recognition,2018,83:230-244. [4]YU Z,LUO P,YOU J,et al.Incremental semi-supervised clustering ensemble for high dimensional data clustering[J].IEEE Transactions on Knowledge and Data Engineering,2015,28(3):701-714. [5]GOLALIPOUR K,AKBARI E,HAMIDI S S,et al.From clustering to clustering ensemble selection:A review[J].Enginee-ring Applications of Artificial Intelligence,2021,104:104388. [6]CHEN X,YE Y,XU X,et al.A feature group weighting method for subspace clustering of high-dimensional data[J].Pattern Recognition,2012,45(1):434-446. [7]PAUL D,SAHA S,MATHEW J.Improved subspace clustering algorithm using multi-objective framework and subspace optimization[J].Expert Systems with Applications,2020,158:113487. [8]GAN G,NG M K P.Subspace clustering with automatic feature grouping[J].Pattern Recognition,2015,48(11):3703-3713. [9]LIU Y,JIAO L C,SHANG F.Anefficient matrix factorizationbased low-rank representation for subspace clustering[J].Pattern Recognition,2013,46(1):284-292. [10]ZHU P,ZHU W,HU Q,et al.Subspace clustering guided unsupervised feature selection[J].Pattern Recognition,2017,66:364-374. [11]MASUD M A,HUANG J Z,WEI C,et al.I-nice:A new approach for identifying the number of clusters and initial cluster centres[J].Information Sciences,2018,466:129-151. [12]AGGARWAL C C,WOLF J L,YU P S,et al.Fast algorithms for projected clustering[J].ACM SIGMOD Record,1999,28(2):61-72. [13]AGGARWAL C C,YU P S.Finding generalized projected clusters in high dimensional spaces[C]//Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data.2000:70-81. [14]GOIL S,NAGESH H,CHOUDHARY A.Mafia:Efficient and scalable subspace clustering for very large data sets[C]//Proceedings of the fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.1999:443-452. [15]CHENG C H,FU A W,ZHANG Y.Entropy-based subspaceclustering for mining numerical data[C]//Proceedings of the fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.1999:84-93. [16]AGRAWAL R,GEHRKE J,GUNOPULOS D,et al.Automatic subspace clustering of high dimensional data for data mining applications[C]//Proceedings of the fourth ACM SIGMOD International Conference on Management of Data.1998:94-105. [17]KAILING K,KRIEGELH P,KRÖGER P.Density-connectedsubspace clustering for high-dimensional data[C]//Proceedings of the 2004 SIAM International Conference on Data Mining.Society for Industrial and Applied Mathematics,2004:246-256. [18]ESTER M,KRIEGEL H P,SANDER J,et al.A density-based algorithm for discovering clusters in large spatial databases with noise[C]//Proceedings of the 1996 ACM SIGKDD Conference on Knowledge Discovery And Data Mining.1996:226-231. [19]HAHSLER M,PIEKENBROCK M,DORAN D.DBSCAN:Fast density-based clustering with R[J].Journal of Statistical Software,2019,91:1-30. [20]HUANG X,YE Y,GUO H,et al.DSKmeans:a new kmeans-type approach to discriminative subspace clustering[J].Know-ledge-Based Systems,2014,70:293-300. [21]ZOGRAFOS V,ELLIS L,MESTER R.Discriminative subspace clustering[C]//Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition.2013:2107-2114. [22]YAMAGUCHI M,IRIE G,KAWANISHI T,et al.Subspacestructure-aware spectral clustering for robust subspace clustering[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision.2019:9875-9884. [23]PASSALIS N,TEFAS A.Discriminative clustering using regularized subspace learning[J].Pattern Recognition,2019,96:106982. [24]XIE J,GIRSHICK R,FARHADI A.Unsupervised deep embedding for clustering analysis[C]//Proceedings of the 2016 International Conference on Machine Learning.2016:478-487. [25]HERSHEY J R,CHEN Z,LE ROUX J,et al.Deep clustering:Discriminative embeddings for segmentation and separation[C]//Proceedings of the 2016 IEEE International Conference on Acoustics,Speech and Signal Processing.2016:31-35. [26]CHEN Y,XIAO X,ZHOU Y.Multi-view subspace clustering via simultaneously learning the representation tensor and affinity matrix[J].Pattern Recognition,2020,106:107441. [27]WANG C,CHEN X,NIE F,et al.Directly solving normalized cut for multi-view data[J].Pattern Recognition,2022,130:108809. [28]ELHAMIFAR E,VIDAL R.Sparse subspace clustering:Algorithm,theory,and applications[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2013,35(11):2765-2781. [29]LIU G,LIN Z,YAN S,et al.Robust recovery of subspace structures by low-rank representation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2012,35(1):171-184. [30]SHAO C L,SUN T F,DING S F.Ensemble clustering based on information entropy weighted[J].Journal of Nanjing University(Natural Sciences),2021,57(2):189-196. [31]DUH Y,ZHANG J,WANG W J.A deep self-supervised clustering ensemble algorithm[J].CAAI Transactions on Intelligent Systems,2020,15(6):1113-1120. [32]ZHOU P,WANG X,DU L,et al.Clustering ensemble via structured hypergraph learning[J].Information Fusion,2022,78:171-179. [33]THORNDIKE R.Who belongs in the family?[J].Psychometri-ka,1953,18(4):267-276. [34]ROUSSEEUW P J.Silhouettes:a graphical aid to the interpretation and validation of cluster analysis[J].Journal of Computational and Applied Mathematics,1987,20:53-65. [35]HUBERT L,ARABIE P.Comparing partitions[J].Journal of Classification,1985,2(1):193-218. |
|