Computer Science ›› 2018, Vol. 45 ›› Issue (1): 62-66. doi: 10.11896/j.issn.1002-137X.2018.01.009

• CRSSC-CWI-CGrC-3WD 2017 •

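Three-way Clustering Analysis Based on Dynamic Neighborhood

WANG Ping-xin, LIU Qiang, YANG Xi-bei and MI Ju-sheng

  1. School of Science, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003; College of Mathematics and Information Science, Hebei Normal University, Shijiazhuang 050024, School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003, School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003, College of Mathematics and Information Science, Hebei Normal University, Shijiazhuang 050024
  • Online: 2018-01-15 Published: 2018-11-13
  • Supported by:
    This work was supported by the National Natural Science Foundation of China.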

Abstract: Most existing clustering methods are two-way: an object either belongs to a cluster or does not, so every cluster must be represented by a set with a crisp boundary. However, forcing uncertain objects into a cluster degrades both the structure and the accuracy of the clustering result. Three-way clustering is an overlapping clustering that describes each cluster by a core region and a fringe region, which handles objects with uncertainty better. This paper presents a strategy for converting a two-way clustering into a three-way clustering using the neighborhoods of the samples. Starting from the two-way result, each cluster is shrunk by keeping only those of its elements whose neighborhoods are entirely contained in the cluster, and it is stretched by adding the elements outside the cluster whose neighborhoods intersect it. The shrunk set is called the core region, and the difference between the stretched set and the core region is regarded as the fringe region. Experiments on UCI data sets show that this strategy is effective in improving the structure and F1 values of the clustering results.

Key words: Three-way clustering, Neighborhood, K-means clustering, Spectral clustering
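
The shrink-and-stretch step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the neighborhood is built, so the sketch assumes that the neighborhood of a sample is its q nearest points (including itself) under Euclidean distance. The function names `nearest_neighbors` and `three_way_from_two_way`, the parameter `q`, and the toy data are all hypothetical.

```python
import math


def nearest_neighbors(points, i, q):
    """Indices of the q points closest to points[i], including i itself.

    Assumption: the neighborhood is a q-nearest-neighbor set; the paper's
    dynamic neighborhood may be defined differently.
    """
    order = sorted(range(len(points)),
                   key=lambda j: math.dist(points[i], points[j]))
    return set(order[:q])


def three_way_from_two_way(points, labels, q=3):
    """Map each hard cluster label to a (core, fringe) pair of index sets."""
    n = len(points)
    neighborhoods = [nearest_neighbors(points, i, q) for i in range(n)]
    regions = {}
    for c in sorted(set(labels)):
        members = {i for i in range(n) if labels[i] == c}
        # Shrink: keep members whose whole neighborhood lies inside the cluster.
        core = {i for i in members if neighborhoods[i] <= members}
        # Stretch: add non-members whose neighborhood intersects the cluster.
        stretched = members | {i for i in range(n)
                               if i not in members and neighborhoods[i] & members}
        # Fringe region = stretched set minus core region.
        regions[c] = (core, stretched - core)
    return regions


# Toy data: two groups with two ambiguous points (indices 3 and 4) between them.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (2.4, 0.0),
          (2.6, 0.0), (4.0, 0.0), (5.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # a two-way result, e.g. from K-means
regions = three_way_from_two_way(points, labels, q=3)
for c, (core, fringe) in regions.items():
    print(c, sorted(core), sorted(fringe))
```

In this toy example, the two points near the boundary (indices 3 and 4) leave the core of their own cluster and end up in the fringe region of both clusters, which is the overlapping behavior the abstract describes.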

