Computer Science ›› 2018, Vol. 45 ›› Issue (11A): 474-479.

• Big Data & Data Mining • Previous Articles     Next Articles

Heuristically Determining Cluster Numbers Based NJW Spectral Clustering Algorithm

CHEN Jun-fen1, ZHANG Ming1, HE Qiang2   

  1. Hebei Key Laboratory of Machine Learning and Computational Intelligence,College of Mathematics and Information Sciences,Hebei University,Baoding 071002,China1
    School of Science,Beijing University of Civil Engineering and Architecture,Beijing,100044,China2
  • Online:2019-02-26 Published:2019-02-26

Abstract: The main idea of NJW is to project data points into feature space and then to cluster using K-means algorithm,while the clustering results is considered as the clustering of the original data points.However,the number of clusters C and scaling parameter σ of Gaussian kernel function greatly influence the clustering performance of NJW algorithm,whilst the fact that K-means algorithm is sensitive to initial cluster centers also influences the clustering result of NJW algorithm.To this end,an improved NJW algorithmis presented referring to DP-NJW algorithm which heuristically determines cluster-center points and the number of clusters according to density distribution of data points,and then implements NJW algorithm to cluster.Note that,the obtained cluster-center points and the number of clusters in the first stage are the initialization of K-means algorithm in the second stage of the proposed DP-NJW algorithm.In the next section,DP-NJW is compared with state-of-the-art clustering algorithms on the given seven public datasets,where DP-NJW provided higher clustering accuracy than NJW on five datasets and even on the other two datasets.DP-NJW algorithm provided better clustering performance than DPC on the given five datasets.In addition,DP-NJW consumed less computing time than the other two algorithms,and this is especially obvious on the larger aggregation dataset.Overall,DP-NJW algorithm is superior to the state-of-the-art clustering algorithms.

Key words: Clustering accuracy, Gaussian kernel function, NJW clustering, Scaling parameter, Spectral clustering

CLC Number: 

  • TP181
[1]MACQUEENJ B.Some Methods for Classification and Analysis of Multivariate Observations[C]∥Proceedings of Berkeley Symposium on Mathematical Statistics and Probability.1967:281-297.
[2]MACQUEENJ B.The Classification Problem[Z].Western Management Science Institute Working Paper,1962.
[3]COXD R.Note on Grouping[J].Journal of the American Statistical Association,1957,52(280):543-547.
[4]FISHERW D.On Grouping for Maximum Homogeneity[J]. Journal of the American Statistical Association,1958,53(284):789-798.
[5]SEBESTYENG S.Decision Making Process in Pattern Recognition[M].New York:Macmillan,1962.
[6]JAINA K.Data Clustering:50 Years Beyond K-means[J].Pattern Recognition Letters,2010,31(8):651-666.
[7]BOCK H H.Clustering Methods:A History of K-means Algorithms[M]∥Selected Contributions in Data Analysis and Classification.Springer Berlin Heidelberg,2007:161-172.
[8]王千,王成,冯振元,等.K-means聚类算法研究综述[J].电子设计工程,2012,20(7):21-24.
[9]PERONAP,FREEMAN W.A Factorization Approach to Grou-ping [C]∥Proceedings of European Conference on Computer Vision (ECCV).1998:655-670.
[10]NGA Y,JORDANM I,WEISS Y.On Spectral Clustering:Analysis and an Algorithm[C]∥Proceedings of the 14th Advances in Neural Information Processing System.2002:849-856.
[11]LI C G,YOU C,VIDAL R.Structured Sparse Subspace Clustering:A Joint Affinity Learning and Subspace Clustering Framework[J].IEEE Transactions on Image Processing,2017,26(6):2988-3001.
[12]LI C G,YOU C,VIDAL R.On Geometric Analysis of Affine Sparse Subspace Clustering[J].IEEE Journal on Selected Topics in Signal Processing,2018,12(6).
[13]LI C G,ZHANG J J,GUO J.Constrained Sparse Subspace Clustering with Side Information[C]∥Proceedings of the 24th International Conference on Pattern Recognition (ICPR).2018:2093-2099.
[14]LIANG J Q,HAN Y H,HU Q H.Semi-Supervised Image Clustering with Multimodal Information[J].Multimedia Systems,2016,22:149-160.
[15]LUXBURGU V.A Tutorial on Spectral Clustering[J].Statistics & Computing,2007,17(4):395-416.
[16]金建国.聚类方法综述[J].计算机科学,2014,41(S2):288-293.
[17]ZELNIK-MANORL,PERONA P.Self-Tuning Spectral Clustering[C]∥Proceedings of the 16th Advances in Neural Information Processing System.2004:1601-1608.
[18]WANG F,ZHANG C S.Robust Self-Tuning Semi-Supervised Learning[J].Neurocomputing,2007,70(16):2931-2939.
[19]YANG C,ZHANG X,JIAO L,et al.Self-Tuning Semi-Super-vised Spectral Clustering[C]∥International Conference on Computational Intelligence & Security.2008:1-5.
[20]KUMARV,HAHNJ,ZOUBIR A M.Band Selection for Hyperspectral Images Based on Self-Tuning Spectral Clustering[C]∥Proceedings of the 21st EuropeanSignal Processing Conference (EUSIPCO).2013.
[21]POLITOM,PERONAP.Grouping and Dimensionality Reduction by Locally Linear Embedding[C]∥Proceedings of International Conference on Neural Information Processing Systems:Natural &Synthetic.2001:1255-1262.
[22]ZENGC P,ZHOUA M,ZHANGG X.Self-adaptive Spectra Cluster Number Detecting with Particle Swarm Optimization Algorithm[C]∥Evolutionary Computation.IEEE,2016:4607-4611.
[23]CHAN P,SCHLAG M,ZIEN S.Spectral K-Way Ratio-Cut Partitioning and Clustering[C]∥Proceedings of Conference on Design Automation.1993:749-754.
[24]TENENBAUM J B,SILVAV D,LANGFORD J C.A Globa Geometric Framework for Nonlinear Dimensionality Reduction[J].Science,2000,290(5500):2319-2323.
[25]RODRIGUEZ A,LAIO A.Clustering by Fast Search and Find of Density Peaks[J].Science,2014,344(6191):1492-1496.
[26]李金泽,徐喜荣,潘子琦,等.改进的自适应谱聚类NJW 算法[J].计算机科学,2017,44(S1):424-427.
[27]周海松,黄德才.密度自适应的半监督谱聚类算法[J].计算机科学,2016,43(12):209-212.
[28]杨虎,付宇,范丹.噪音特征对聚类内部有效性的影响[J].计算机科学,2018,45(7):22-30.
[1] GUO Yi-shan, LIU Man-dan. Anomaly Detection Based on Spatial-temporal Trajectory Data [J]. Computer Science, 2021, 48(6A): 213-219.
[2] LI Peng, LIU Li-jun, HUANG Yong-dong. Landmark-based Spectral Clustering by Joint Spectral Embedding and Spectral Rotation [J]. Computer Science, 2021, 48(6A): 220-225.
[3] ZHANG Xiao-qin, AN Xiao-dan, CAO Fu-yuan. Detecting Community from Bipartite Network Based on Spectral Clustering [J]. Computer Science, 2019, 46(4): 216-221.
[4] LIU Shu-dong, WEI Jia-min. Multilayer Perceptron Classification Algorithm Based on Spectral Clusteringand Simultaneous Two Sample Representation [J]. Computer Science, 2019, 46(11A): 194-198.
[5] YANG Hu, FU Yu, FAN Dan. Influence of Noisy Features on Internal Validation of Clustering [J]. Computer Science, 2018, 45(7): 22-30.
[6] WANG Ying and YANG Yu-wang. KNN Similarity Graph Algorithm Based on Heap and Neighborhood Coexistence [J]. Computer Science, 2018, 45(5): 196-200.
[7] CHANG Jia-wei, DAI Mu-hong. Personalized Recommendation Algorithm Based on PageRank and Spectral Method [J]. Computer Science, 2018, 45(11A): 398-401.
[8] LI Peng-qing, LI Yang-ding, DENG Xue-lian, LI Yong-gang, FANG Yue. Spectral Clustering Algorithm Based on SimRank Score [J]. Computer Science, 2018, 45(11A): 458-461.
[9] WANG Ping-xin, LIU Qiang, YANG Xi-bei and MI Ju-sheng. Three-way Clustering Analysis Based on Dynamic Neighborhood [J]. Computer Science, 2018, 45(1): 62-66.
[10] LI Jin-ze, XU Xi-rong, PAN Zi-qi and LI Xiao-jie. Improved Adaptive Spectral Clustering NJW Algorithm [J]. Computer Science, 2017, 44(Z6): 424-427.
[11] LI Xiao-lun and DING Zhi-jun. Group Travel Trip Recommendation Method in LBSNs [J]. Computer Science, 2017, 44(6): 199-205.
[12] QIN Xiao, LIANG Wei, YUAN Chang-an and TANG Tao. Image Segmentation Algorithm of Spectral Clustering Optimized by Genetic [J]. Computer Science, 2017, 44(1): 100-102.
[13] FAN Jing, RUAN Ti-hong, WU Jia-min and DONG Tian-yang. Vehicle Behavior Recognition Method Based on Quadratic Spectral Clustering and HMM-RF Hybrid Model [J]. Computer Science, 2016, 43(5): 288-293.
[14] WANG You-hua and CHEN Xiao-rong. Improved Text Clustering Algorithm Based on Kolmogorov Complexity [J]. Computer Science, 2016, 43(5): 243-246.
[15] ZHOU Hai-song and HUANG De-cai. Density Self-adaption Semi-supervised Spectral Clustering Algorithm [J]. Computer Science, 2016, 43(12): 209-212.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!