Computer Science ›› 2023, Vol. 50 ›› Issue (6A): 220400127-7. doi: 10.11896/jsjkx.220400127

• Big Data & Data Science •

Dynamic Neighborhood Density Clustering Algorithm Based on DBSCAN

ZHANG Peng, LI Xiaolin, WANG Liyan

  1. College of Mines, China University of Mining and Technology, Xuzhou, Jiangsu 221003, China
  • Online: 2023-06-10 Published: 2023-06-12
  • Corresponding author: LI Xiaolin (xlli@cumt.edu.cn)
  • About author: (905929036@qq.com) ZHANG Peng, born in 1991, postgraduate. His main research interests include data analysis and processing. LI Xiaolin, born in 1986, Ph.D., professor. His main research interests include enterprise management informatization and system integration.
  • Supported by:
    National Natural Science Foundation of China (71401164).

Abstract: Traditional density clustering algorithms ignore attribute differences between data points during clustering and treat all points as homogeneous. To address this, a dynamic neighborhood density clustering algorithm, DN-DBSCAN (Dynamic Neighborhood-Density Based Spatial Clustering of Applications with Noise), is proposed on the basis of DBSCAN. During clustering, each point's neighborhood radius is determined by that point's own attributes, so the radius varies from point to point. As a result, the different influence that points with different attributes exert on a cluster is reflected in the clustering result, which makes density clustering more meaningful for practical problems. Following a worked example, DN-DBSCAN is applied to the problem of delineating urban agglomerations in the Yangtze River Delta, and its results are compared with those of the DBSCAN, OPTICS and DPC algorithms. The results show that DN-DBSCAN delineates the Yangtze River Delta urban agglomerations reasonably according to the attributes of each city, with an accuracy of 95%, compared with 85%, 85% and 88% for DBSCAN, OPTICS and DPC respectively, indicating a better ability to solve practical problems.

Key words: Dynamic neighborhood, Density clustering, Dynamic neighborhood density clustering, Attribute differences, Division accuracy
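
The core idea in the abstract, a neighborhood radius that belongs to each point rather than to the whole dataset, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how a point's attributes are mapped to its radius, so the sketch assumes the caller supplies the per-point radii directly. The names `region_query`, `dn_dbscan`, `radii` and `min_pts` are hypothetical and chosen only for illustration.

```python
from collections import deque
from math import dist

NOISE = -1
UNVISITED = None


def region_query(points, radii, i):
    """Indices of all points lying within point i's OWN radius (i included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= radii[i]]


def dn_dbscan(points, radii, min_pts):
    """DBSCAN-style clustering where each point has its own neighborhood radius.

    Assumption: radii[i] has already been derived from point i's attributes;
    the mapping from attributes to radius is not given in the abstract.
    Returns one label per point: 0, 1, ... for clusters, -1 for noise.
    """
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, radii, i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, radii, j)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels


# Five points on a line; the third point is given a larger radius.
points = [(0, 0), (1, 0), (2, 0), (5, 0), (10, 0)]
uniform = dn_dbscan(points, [1.5] * 5, min_pts=3)
dynamic = dn_dbscan(points, [1.5, 1.5, 3.5, 1.5, 1.5], min_pts=3)
print(uniform)  # [0, 0, 0, -1, -1]
print(dynamic)  # [0, 0, 0, 0, -1]
```

With a uniform radius the point at (5, 0) is noise; once the point at (2, 0) is given a larger radius, it becomes a core point that reaches (5, 0) and pulls it into the cluster. This mirrors the abstract's claim that points with different attributes exert different influence on a cluster. Note that per-point radii make the neighborhood relation asymmetric, which is a design consequence of this sketch and may be handled differently in the paper.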

CLC Number:

  • TP301