Computer Science ›› 2016, Vol. 43 ›› Issue (Z6): 236-238. doi: 10.11896/j.issn.1002-137X.2016.6A.057

• Pattern Recognition and Image Processing •

Recognition Algorithm of Outlier and Boundary Points Based on Relative Density

LI Guang-xing

  1. Chengdu Vocational College of Agricultural Science and Technology, Chengdu 611130
  • Online: 2018-11-14 Published: 2018-11-14

Abstract: Outliers are data objects whose attributes are inconsistent with most of the data in a data set, and boundary points are data objects located at the edge of data regions of differing density. Based on these two characterizations, an outlier and boundary point recognition algorithm based on relative density (OBRD) is proposed. To decide whether a data point is a boundary point or an outlier, the algorithm takes the neighborhood centered on that point with radius r and splits it in half along each dimension, giving two semi-neighborhoods per dimension. The point's isolation level and boundary level are then determined from the density of these semi-neighborhoods relative to the original neighborhood, and the final judgment is made by comparing them against threshold values. Experimental results indicate that the algorithm accurately and effectively identifies outliers and cluster boundary points in multi-density data sets.

Key words: Neighborhood, Density, Isolation level, Outlier, Boundary level, Boundary points
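
The abstract describes the procedure only in outline and does not give the formulas for the isolation level or the boundary level, so the following Python sketch is an illustration under stated assumptions rather than the OBRD algorithm itself. It assumes that a semi-neighborhood is the part of the r-neighborhood lying on one side of the point along a given dimension, that the boundary level is the largest imbalance between the two semi-neighborhoods relative to the whole neighborhood, and that the isolation level is `1 / (1 + n)` for a neighborhood holding `n` other points. The function name `classify_points` and both default thresholds are hypothetical.

```python
import math


def classify_points(points, r, iso_threshold=0.5, bnd_threshold=0.5):
    """Label each point as 'outlier', 'boundary', or 'interior'.

    Illustrative stand-in scores (assumptions, not the paper's formulas):
      isolation level = 1 / (1 + n), n = other points within radius r
      boundary level  = max over dimensions of |upper - lower| / n
    """
    labels = []
    for i, p in enumerate(points):
        # r-neighborhood of p, excluding p itself
        neighbors = [q for j, q in enumerate(points)
                     if j != i and math.dist(p, q) <= r]
        n = len(neighbors)

        isolation = 1.0 / (1 + n)

        boundary = 0.0
        for d in range(len(p)):
            # Split the neighborhood in half along dimension d.
            # Neighbors with the same coordinate as p fall in neither half.
            lower = sum(1 for q in neighbors if q[d] < p[d])
            upper = sum(1 for q in neighbors if q[d] > p[d])
            if n > 0:
                boundary = max(boundary, abs(upper - lower) / n)

        if isolation >= iso_threshold:
            labels.append("outlier")
        elif boundary >= bnd_threshold:
            labels.append("boundary")
        else:
            labels.append("interior")
    return labels


# Toy data: a 5 x 5 unit grid plus one point far away from it.
points = [(x, y) for x in range(5) for y in range(5)] + [(100, 100)]
labels = classify_points(points, r=1.5)
```

On this toy data the far-away point has an empty neighborhood and is labeled an outlier, the points on the rim of the grid have one nearly empty semi-neighborhood and are labeled boundary points, and the inner points have balanced semi-neighborhoods and are labeled interior. The radius and both thresholds would need tuning for real data.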

