Computer Science ›› 2023, Vol. 50 ›› Issue (6): 100-108.doi: 10.11896/jsjkx.220800074

• Granular Computing & Knowledge Discovery • Previous Articles     Next Articles

Three-way DBSCAN Algorithm Based on Local Eps

SHEN Qiuping1, ZHANG Qinghua1,2, GAO Man1, DAI Yongyang1   

  1. 1 Chongqing Key Laboratory of Computational Intelligence,Chongqing University of Posts and Telecommunications,Chongqing 400065,China
    2 Key Laboratory of Tourism Multisource Data Perception and Decision,Ministry of Culture and Tourism,Chongqing University of Posts and Telecommunitions,Chongqing 400065,China
  • Received:2022-08-08 Revised:2022-11-21 Online:2023-06-15 Published:2023-06-06
  • About author:SHEN Qiuping,born in 1997,postgra-duate.Her main research interests include three-way decisions,machine learning and uncertain information processing.ZHANG Qinghua,born in 1974.Ph.D,professor,Ph.D supervisor.His main research interests include rough sets,fuzzy sets,granular computing and uncertain information processing.
  • Supported by:
    National Key Research and Development Program of China(2020YFC2003502),National Natural Science Foundation of China(61876201) and Talented Masters and Teachers Project of Chongqing(CQYC20210202215).

Abstract: Density-based spatial clustering of applications with noise(DBSCAN) is a classical density-based clustering algorithm.It clusters dataset of arbitrary shape through two global parameters:radius Eps and minimum number of points MinPts,and automatically determines the number of clusters.However,DBSCAN with global Eps has poor clustering effect on datasets with uneven density,and cannot cluster overlapping datasets.Therefore,the clustering principle of decreasing density and local Eps are defined in this paper.The local Eps is automatically determined according to the k-nearest neighbor distance,and the DBSCAN algorithm based on local Eps(LE-DBSCAN) is proposed.Then,by considering the labels of the nearest neighbors,the border points and noises in the two-way clustering results are redivided,and the three-way DBSCAN algorithm based on local Eps(LE3W-DBSCAN) is proposed.LE-DBSCAN and LE3W-DBSCAN are compared with the related algorithms in this field on UCI datasets and artificial datasets.Experimental results show that the proposed algorithm has good performance both in common hard clustering indices and soft clustering indices.

Key words: Three-way clustering, DBSCAN, Local Eps, Multi-density, Overlapping datasets

CLC Number: 

  • TP391
[1]DAYANAM A,ELBA C T,MARCEL B,et al.Cluster analysis of urban ultrafine particles size distributions[J].Atmospheric Pollution Research,2019,10(1):45-52.
[2]WU X D,ZHU X Q,WU G Q,et al.Data mining with big data[J].IEEE Transactions on Knowledge and Data Engineering,2014,26(1):97-107.
[3]HUANG J J,CHEN W,LIU A,et al.Cluster query:a new query pattern on temporal knowledge graph[J].World Wide Web,2020,23(2):755-779.
[4]BROWN D,JAPA A,SHI Y.A fast density-grid based clustering method[C]//IEEE 9th Annual Computing and Communication Workshop and Conference.Piscataway,NJ:IEEE,2019:48-54.
[5]GONDEAU A,AOUABED Z,HIJRI M,et al.Object weigh-ting:A new clustering approach to deal with outliers and cluster overlap in computational biology[J].IEEE ACM Transactions on Computational Biology and Bioinformatics,2021,18(2):633-643.
[6]XU C Y,LIN R J,CAI J Y,et al.Deep image clustering by fusing contrastive learning and neighbor relation mining[J].Knowledge Based Systems,2022,238:107967.
[7]SINGH S,GANIE A H.Applications of picture fuzzy similaritymeasures in pattern recognition,clustering,and MADM[J].Expert Systems with Applications,2021,168:114264.
[8]XU Z Z,SHEN D R,NIE T Z,et al.A cluster-based oversampling algorithm combining SMOTE and k-means for imbalanced medical data[J].Information Sciences,2021,572:574-589.
[9]WANG X,WANG J,YU G X,et al.Network regularized bi-clustering for cancer subtype categorization[J].Chinese Journal of Computers,2019,42(6):1274-1288.
[10]RODRIGUEZ A,LAIO A.Clustering by fast search and find of density peaks[J].Science,2014,344(6191):1492-1496.
[11]LIU R,WANG H,YU X M.Shared-nearest-neighbor-basedclustering by fast search and find of density peaks[J].Information Sciences,2018,450:200-226.
[12]ESTER M,KRIEGEL H P,SANDER J,et al.A density-based algorithm for discovering clusters in large spatial databases with noise[C]//Proceedings of the Second International Conference on Knowledge Discovery and Data Mining.Portland,Oregon:AAAI Press,1996:226-231.
[13]JAIN A K.Data clustering:50 years beyond k-means[J].Pattern Recognition Letters,2010,31(8):651-666.
[14]D'URSO P.Informational Paradigm,management of uncertainty and theoretical formalisms in the clustering framework:A review[J].Information Sciences,2017,400:30-62.
[15]PETERS G,CRESPO F A,LINGRAS P,et al.Soft clustering-Fuzzy and rough approaches and their extensions and derivatives[J].International Journal of Approximate Reasoning,2013,54(2):307-322.
[16]YU H,JIAO P,YAO Y Y,et al.Detecting and refining overlapping regions in complex networks with three-way decisions[J].Information Sciences,2016,373:21-41.
[17]YU H,ZHANG C,WANG G Y.A tree-based incremental overlapping clustering method using the three-way decision theory[J].Knowledge Based Systems,2016,91:189-203.
[18]WANG P X,YAO Y Y.CE3:A three-way clustering methodbased on mathematical morphology[J].Knowledge Based Systems,2018,155:54-65.
[19]ZHANG K.A three-way c-means algorithm[J].Applied Soft Computing,2019,82:105536.
[20]MAJI P,PAL S K.RFCM:A hybrid clustering algorithm using rough and Fuzzy sets[J].Fundamenta Informaticae,2007,80(4):475-496.
[21]HU L H,LIU H K,ZHANG J F,et al.KR-DBSCAN:A density-based clustering algorithm based on reverse nearest neighbor and influence space[J].Expert Systems With Applications,2021,186:115763.
[22]CHEN F Y.An improved DBSCAN algorithm for adaptively determining parameters in multi-density environment[C]//2nd International Conference on Artificial Intelligence and Information Systems.New York:ACM,2021:1-4.
[23]YU H,CHEN L Y,YAO J T,et al.A three-way clusteringmethod based on an improved DBSCAN algorithm[J].Physica A:Statistical Mechanics and its Applications,2019,535:122289.
[24]GIONIS A,MANNILA H,TSAPARAS P.Clustering aggregation[J].ACM Transactions on Knowledge Discovery from Data,2007,1(1):1-30.
[25]ZAHN C T.Graph-theoretical methods for detecting and describing gestalt clusters[J].IEEE Transactions on Computers,1971,100(1):68-86.
[26]CHANG H,YEUNG D Y.Robust path-based spectral cluste-ring[J].Pattern Recognition,2008,41(1):191-203.
[27]JAIN A,LAW M.Data clustering:A user's dilemma[J].Lecture Notes in Computer Science,2005,3776:1-10.
[28]FU L M,MEDICO E.FLAME,a novel fuzzy clustering method for the analysis of DNA microarray data[J].BMC bioinforma-tics,2007,8(1):3-3.
[29]HOU J,ZHANG A H,QI N M.Density peak clustering based on relative density relationship[J].Pattern Recognition,2020,108:107554.
[30]XIE J Y,GAO H C,XIE W X,et al.Robust clustering by detecting density peaks and assigning points based on fuzzy weighted k-nearest neighbors[J].Information Sciences,2016,354:19-40.
[1] CHEN Jie. Study on Long Text Topic Clustering Based on Doc2Vec Enhanced Features [J]. Computer Science, 2023, 50(6A): 220800192-6.
[2] MA Xin-yu, JIANG Chun-mao, HUANG Chun-mei. Optimal Scheduling of Cloud Task Based on Three-way Clustering [J]. Computer Science, 2022, 49(11A): 211100139-7.
[3] QI Ying, CHAI Yan-mei. High-resolution Remote Sensing Sea Ice Image Segmentation Based on Combination of ImprovedSLIC Algorithm and Clustering Algorithm [J]. Computer Science, 2022, 49(11A): 211200100-6.
[4] ZHANG Ren-jie, CHEN Wei, HANG Meng-xin, WU Li-fa. Detection of Abnormal Flow of Imbalanced Samples Based on Variational Autoencoder [J]. Computer Science, 2021, 48(7): 62-69.
[5] LUO Jin-nan and ZHANG Ji-min. Rail Area Extraction Using Extended Haar-like Features and DBSCAN Clustering [J]. Computer Science, 2020, 47(6A): 153-156.
[6] DENG Ding-sheng. Application of Improved DBSCAN Algorithm on Spark Platform [J]. Computer Science, 2020, 47(11A): 425-429.
[7] ZHANG Jian-xin, LIU Hong, LI Yan. Efficient Grouping Method for Crowd Evacuation [J]. Computer Science, 2019, 46(6): 231-238.
[8] HU Ying-shuang, LU Yi-hong. Cell Clustering Algorithm Based on MapReduce and Strongly Connected Fusion [J]. Computer Science, 2019, 46(11A): 204-207.
[9] WU Jian-wei, LI Yan-ling, ZHANG Hui, ZANG Han-lin. HMM Cooperative Spectrum Prediction Algorithm Based on Density Clustering [J]. Computer Science, 2018, 45(9): 129-134.
[10] WANG Ping-xin, LIU Qiang, YANG Xi-bei and MI Ju-sheng. Three-way Clustering Analysis Based on Dynamic Neighborhood [J]. Computer Science, 2018, 45(1): 62-66.
[11] YANG Fang-xun. Application of DBSCAN Algorithm in Electronic Mail Network Community Detection [J]. Computer Science, 2017, 44(Z6): 591-593.
[12] HUANG Ming-ji and ZHANG Qian. Design and Implementation of Parallel DBSCAN Algorithm Based on Spark [J]. Computer Science, 2017, 44(Z11): 524-529.
[13] WEI Fang-yuan and HUANG De-cai. UID-DBSCAN Clustering Algorithm of Multi-dimensional Uncertain Data Based on Interval Number [J]. Computer Science, 2017, 44(Z11): 442-447.
[14] MA Chun-lai, SHAN Hong and MA Tao. Improved Density Peaks Based Clustering Algorithm with Strategy Choosing Cluster Center Automatically [J]. Computer Science, 2016, 43(7): 255-258.
[15] LAI Li-ping, NIE Rui-hua, WANG Jiang-ping and HUANG Jia-hong. Improved DBSCAN Algorithm Based on MapReduce [J]. Computer Science, 2015, 42(Z11): 396-399.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!