Computer Science ›› 2020, Vol. 47 ›› Issue (9): 324-329.doi: 10.11896/jsjkx.200700092

• Information Security • Previous Articles     Next Articles

Extended Algorithm of Pairwise Constraints Based on Security

YANG Fan1, WANG Jun-bin1, BAI Liang1,2   

  1. 1 School of Computer and Information Technology,Shanxi University,Taiyuan 030006,China
    2 Key Laboratory Computation Intelligence and Chinese Information Processing of Ministry of Education,Shanxi University,Taiyuan 030006,China
  • Received:2020-04-30 Published:2020-09-10
  • About author:YANG Fan,born in 1995,postgraduate.Her main research interests include semi-supervised learning and so on.
    BAI Liang,born in 1982,Ph.D,professor,is a member of China Computer Fedaration.His main research interest include cluster analysis and so on.
  • Supported by:
    National Natural Science Foundation of China (61773247,61876103) and Technology Research Development Projects of Shanxi (201901D211192).

Abstract: Cluster analysis based on pairwise constraints is an important research direction of semi-supervised learning.The number of pairwise constraints has become an important factor affecting the effectiveness of this type of the algorithm.However,in practical applications,the acquisition of pairwise constraints requires a lot of costs.Therefore,the extended algorithm of pairwise constraints based on security (PCES) is proposed.This algorithm takes the maximum local connected distance in the transitive closures as the safe value.According to the safe value,the similarity between the different transitive closures is modified to reduce the risk of merging transitive closures.Finally,the method of graph clustering is used to merge similar transitive closures to extend the pairwise constraints.This algorithm can not only safely and effectively expand pairwise constraints,but also apply the extended pairwise constraints to different semi-supervised clustering algorithms.This paper compares the extended algorithm of pairwise constraints on eight benchmark data sets.The experimental results show that the proposed algorithm can extend pairwise constraints safely and effectively.

Key words: Effectiveness of supervision information, Expansion of supervision information, Pairwise constraints, Semi-supervised clustering

CLC Number: 

  • TP391
[1] WU X,KUMAR V,QUINLAN J R,et al.Top 10 algorithms in data mining[J].Knowledge andInformation Systems,2008,14(1):1-37.
[2] JAIN A K.Data clustering:50 years beyond K-means[J].Pattern Recognition Letters,2010,31(8):651-666.
[3] GALLEGO A J,CALVO-ZARAGOZA J,VALERO-MAS J J,et al.Clustering-based k-nearest neighbor classification for large-scale data with neural codes representation[J].Pattern Recognition,2018,74:531-543.
[4] PATERLINI S,KRINK T.Differential evolution and particleswarm optimisation in partitional clustering[J].ComputationalStatistics & Data Analysis,2006,50(5):1220-1247.
[5] LU Y,WAN Y.PHA:A fast potential-based hierarchical ag-glomerative clustering method[J].Pattern Recognition,2013,46(5):1227-1239.
[6] KUMAR K M,REDDY A R M.A fast DBSCAN clustering algorithm by accelerating neighbor searching using Groups me-thod[J].Pattern Recognition,2016,58:39-48.
[7] DINLER D,TURAL M K.Robust semi-supervised clusteringwith polyhedral and circular uncertainty[J].Neurocomputing,2017,265:4-27.
[8] ZHOU X,BELKIN M.Semi-supervised learning[J].Academic Press Library in Signal Processing,2014,1:1239-1269.
[9] ZHOU D,BOUSQUET O,LAL T N,et al.Learning with local and global consistency[J].Advances in Neural Information Processing Systems,2003,16(3):321-328.
[10] WANG H,LI T,LI T,et al.Constraint neighborhood projections for semi-supervised clustering[J].IEEETransactions on Cybernetics,2014,44(5):636-643.
[11] ZHOU Z H.Ensemble methods:foundations and algorithms[M].Boca Raton:CRC Press,2012:72-73.
[12] WAGSTAFF K,CARDIE C,ROGERS S,et al.Constrained k-means clustering with background knowledge[C]// Proceedings of the Eighteenth International Conference on Machine Lear-ning.San Francisco:Morgan Kaufmann,2001:577-584.
[13] YANG Y,TAN W,LI T,et al.Consensus clustering based on constrained self-organizing map and improved Cop-Kmeans ensemble in intelligent decision support systems[J].Knowledge-Based Systems,2012,32:101-115.
[14] WEI S,LI Z,ZHANG C.A semi-supervised clustering ensemble approach integrated constraint-based and metric-based[C]//Proceedings of the 7th International Conference on Internet Multimedia Computing and Service.New York:Association for Computing Machinery,2015:1-6.
[15] LI T,DING C,JORDAN M I.Solving consensus and semi-supervised clustering problems using nonnegative matrix factorization[C]//Seventh IEEE International Conference on Data Mi-ning (ICDM 2007).Piscataway:IEEE,2007:577-582.
[16] REN Y,HU X,SHI K,et al.Semi-supervised denpeak clustering with pairwise constraints[C]//Pacific Rim InternationalConfe-rence on Artificial Intelligence.Berlin:Springer,2018:837-850.
[17] MIYAMOTO S,TERAMI A.Constrained agglomerative hierarchical clustering algorithms with penalties [C]//2011 IEEE International Conference on Fuzzy Systems.Piscataway:IEEE,2011:422-427.
[18] KAMVAR K,SEPANDAR S,KLEIN K,et al.Spectral learning[C]//International Joint Conference of Artificial Intelligence.Stanford Info Lab,2003:561-566.
[19] XU Q,DESJARDINS M,WAGSTAFF K.Constrained spectral clustering under a local proximity structure assumption[C]//Proceedings of the Eighteenth International Florida Artificial Intelligence Research Society Conference.Clearwater Beach,Florida,USA:DBLP,2005:866-867.
[20] JI X,XU W.Document clustering with prior knowledge[C]//Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.New York:Association for Computing Machinery,2006:405-412.
[21] WANG F,DING C,LI T.Integrated KL (K-means-Laplacian) clustering:a new clustering approach by combining attribute data and pairwise relations[C]//Proceedings of the 2009 SIAM International Conference on Data Mining.Society for Industrial and Applied Mathematics.2009:38-48.
[22] KULIS B,BASU S,DHILLON I,et al.Semi-supervised graph clustering:a kernel approach[J].MachineLearning,2009,74(1):1-22.
[23] DING S,JIA H,DU M,et al.A semi-supervised approximate spectral clustering algorithm based on HMRF model[J].Information Sciences,2018,429:215-228.
[24] KLEIN D,KAMVAR S D,MANNING C D.From instance-level constraints to space-level constraints:Making the most of prior knowledge in data clustering[C]//International Conference on Machine Learning.Stanford:IMLS,2002:307-314.
[25] LU Z,PENG Y.Exhaustive and efficient constraint propaga-tion:A graph-based learning approach and its applications[J].International Journal of Computer Vision,2013,103(3):306-325.
[26] WANG L,BO L F,JIAO L C.Density-sensitive semi-supervised spectral clustering[J].Journal of Software,2007,18(10):2412-2422.
[27] WEI S,LI Z,ZHANG C.Combined constraint-based with metric-based in semi-supervised clustering ensemble[J].International Journal of Machine Learning and Cybernetics,2018,9(7):1085-1100.
[28] YU Z,KUANG Z,LIU J,et al.Adaptive ensembling of semi-supervised clustering solutions[J].IEEE Transactions onKnow-ledge and Data Engineering,2017,29(8):1577-1590.
[29] EISENBERG D,MARCOTTE E M,XENARIOS I,et al.Protein function in the post-genomic era[J].Nature,2000,405(6788):823-826.
[30] WAGSTAFF K,CARDIE C.Clustering with instance-level constraints[J].AAAI/IAAI,2000,1097:577-584.
[31] TENENBAUM J B,DE SILVA V,LANGFORD J C.A global geometric framework for nonlinear dimensionality reduction[J].Science,2000,290(5500):2319-2323.
[32] CLAUSET A,NEWMAN M E J,MOORE C.Finding community structure in very large networks[J].Physical Review E,2004,70(6):066111.
[33] DOM B E.An information-theoretic external cluster-validitymeasure[J].Uncertainty in Artificial Intelligence,2001,27(3):137-145.
[34] HUBERT L,ARABIE P.Comparing partitions[J].Journal of Classification,1985,2(1):193-218.
[1] ZHAO Min, LIU Jing-lei. Semi-supervised Clustering Based on Gaussian Fields and Adaptive Graph Regularization [J]. Computer Science, 2021, 48(7): 137-144.
[2] QIN Yue, DING Shi-fei. Survey of Semi-supervised Clustering [J]. Computer Science, 2019, 46(9): 15-21.
[3] WANG Nan, SUN Shan-wu. UAV Fault Recognition Based on Semi-supervised Clustering [J]. Computer Science, 2019, 46(6A): 192-195.
[4] JIA Hong-jie, WANG Liang-jun, SONG He-ping. HMRF Semi-supervised Approximate Kernel k-means Algorithm [J]. Computer Science, 2019, 46(12): 31-37.
[5] CHENG Xue-mei, YANG Qiu-hui, ZHAI Yu-peng and CHEN Wei. Test Case Selection Technique Based on Semi-supervised Clustering Method [J]. Computer Science, 2018, 45(1): 249-254.
[6] LIANG Chen and LI Cheng-hai. Novel Intrusion Detection Method Based on Semi-supervised Clustering [J]. Computer Science, 2016, 43(5): 87-90.
[7] WANG Zong-hu and LIU Su. Pairwise Constrained Semi-supervised Text Clustering Algorithm [J]. Computer Science, 2016, 43(12): 183-188.
[8] FENG Chen-fei, YANG Yan, WANG Hong-jun, XU Ying-ge and WANG Tao. Semi-supervised Fuzzy Clustering Ensemble Approach with Data Correlation [J]. Computer Science, 2015, 42(6): 41-45.
[9] SU Ying-bin, DU Xue-hui, XIA Chun-tao, CAO Li-feng and CHEN Hua-cheng. Sensitive Information Inference Method Based on Semi-supervised Document Clustering [J]. Computer Science, 2015, 42(10): 132-137.
[10] . Pairwise Constraint Projections Inosculating Sparsity Preserving [J]. Computer Science, 2012, 39(11): 212-215.
[11] CUI Peng,ZHANG Ru-bo. Novel Semi-supervised Clustering for High Dimensional Data [J]. Computer Science, 2010, 37(7): 205-207.
[12] LI Lin-na,CHEN Hai-rui,WANG Ying-long. Semi-supervised Clustering of Complex Structured Data Based on Higher-order Logic [J]. Computer Science, 2009, 36(9): 196-200.
[13] LU Wei-zhou ,YU Shun-zheng (Department of Electronic and Communication Engineering,Sun Yat-Sen University,Guangzhou 510275,China). [J]. Computer Science, 2009, 36(2): 90-94.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!