Computer Science (计算机科学), 2016, Vol. 43, Issue (Z11): 447-450. doi: 10.11896/j.issn.1002-137X.2016.11A.100

• Information Security •

W-PAM Restricted Clustering Algorithm in Data Mining

ZHANG Song and ZHANG Lin

  1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210003
  • Online: 2018-12-01 Published: 2018-12-01
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61402241, 61572260, 61373017, 61572261, 61472192) and the Jiangsu Province Science and Technology Support Program (BE2015702).

Abstract: In data mining, data objects do not contribute equally to knowledge discovery. To capture these differences, this paper assigns a weight to each object and, building on the PAM clustering algorithm, proposes W-PAM (Weight Partitioning Around Medoids), a clustering algorithm that weights the data objects in each cluster to improve accuracy. In addition, association restrictions among data objects can improve the quality of clustering. The paper therefore explores a restricted clustering algorithm that combines W-PAM with association restrictions and has the advantages of both. Experimental results show that the W-PAM restricted clustering algorithm makes more effective use of the given association restrictions, improving both the clustering result and the accuracy of the algorithm.

Key words: Data mining, W-PAM, Association restriction, Restricted clustering
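
As a rough illustration of the idea summarized in the abstract, the following Python sketch combines a weighted PAM objective with pairwise restrictions. It is not the authors' implementation: the abstract does not give the cost function, the weighting scheme, or the form of the association restrictions. The weighted cost (weight times distance to the assigned medoid), the must-link and cannot-link pairs, the greedy assignment order, the naive initialisation, and all function names are assumptions made for illustration only.

```python
import math


def violates(i, c, labels, must_link, cannot_link):
    """Return True if putting object i into cluster c breaks a restriction
    with an object that has already been assigned."""
    for a, b in must_link:
        if i in (a, b):
            j = b if a == i else a
            if labels[j] is not None and labels[j] != c:
                return True
    for a, b in cannot_link:
        if i in (a, b):
            j = b if a == i else a
            if labels[j] == c:
                return True
    return False


def constrained_assign(points, medoids, must_link, cannot_link):
    """Greedy assignment (assumed, in the style of constrained k-means):
    each object takes the nearest medoid that violates no restriction.
    Returns None when some object has no admissible cluster."""
    labels = [None] * len(points)
    for i, p in enumerate(points):
        order = sorted(range(len(medoids)),
                       key=lambda c: math.dist(p, points[medoids[c]]))
        for c in order:
            if not violates(i, c, labels, must_link, cannot_link):
                labels[i] = c
                break
        else:
            return None
    return labels


def weighted_cost(points, weights, medoids, labels):
    """Assumed objective: sum of weight * distance to the assigned medoid."""
    return sum(w * math.dist(p, points[medoids[c]])
               for p, w, c in zip(points, weights, labels))


def w_pam_restricted(points, weights, k, must_link=(), cannot_link=()):
    """PAM-style swap search over medoids under the weighted objective."""
    medoids = list(range(k))  # naive initialisation: the first k objects
    labels = constrained_assign(points, medoids, must_link, cannot_link)
    if labels is None:
        raise ValueError("initial medoids admit no feasible assignment")
    best = weighted_cost(points, weights, medoids, labels)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                # Try replacing one medoid with a non-medoid object.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                trial_labels = constrained_assign(
                    points, trial, must_link, cannot_link)
                if trial_labels is None:
                    continue
                cost = weighted_cost(points, weights, trial, trial_labels)
                if cost < best - 1e-12:  # accept strictly better swaps only
                    medoids, labels, best = trial, trial_labels, cost
                    improved = True
    return medoids, labels, best


# Toy data (hypothetical): two well-separated groups, one heavy object.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
weights = [1, 1, 1, 1, 1, 5]
medoids, labels, cost = w_pam_restricted(
    points, weights, k=2, must_link=[(0, 1)], cannot_link=[(0, 3)])
print(sorted(medoids), labels, round(cost, 3))
```

On this toy data the heavier weight on the last object should pull the second medoid toward it (index 5 rather than index 3, which an unweighted PAM would prefer), while the restrictions keep objects 0 and 1 together and objects 0 and 3 apart.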

