Computer Science ›› 2017, Vol. 44 ›› Issue (4): 241-245. doi: 10.11896/j.issn.1002-137X.2017.04.051

• Artificial Intelligence •

  1. College of Computer Science, Chongqing University, Chongqing 400044, China
  • Supported by: National Natural Science Foundation of China (61272194)

Prototype Selection Algorithm Based on Natural Neighbor and MST

ZHU Qing-sheng, DUAN Lang-jun and YANG Li-jun   

  • Online:2018-11-13 Published:2018-11-13



Abstract: K-nearest neighbor (KNN) is one of the most popular algorithms for supervised classification. However, the traditional KNN classifier has two main limitations: the choice of the parameter K, and prohibitive computational and storage demands on large datasets. To overcome these limitations, a new prototype selection algorithm was proposed, which retains the key prototypes that make a large contribution to classification while removing noise points and most of the prototypes that contribute little. Differing from other prototype selection algorithms, the proposal uses a novel neighbor concept, the natural neighbor, to preprocess the dataset, and then builds several minimum spanning trees according to specified termination conditions. Based on the MSTs, prototypes close to class boundaries are preserved and some representative internal prototypes are generated. Experimental results on UCI benchmark datasets show that the proposed algorithm effectively reduces the number of prototypes while maintaining the same level of classification accuracy as the traditional KNN classifier. Moreover, it slightly outperforms other prototype selection algorithms in both classification accuracy and retention ratio.
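The natural-neighbor preprocessing mentioned in the abstract can be sketched as follows. This is an illustrative sketch only: the function name `natural_neighbor_search` and the simple stopping rule (grow k until every point appears in at least one other point's k-NN list) are assumptions here; the paper's exact termination condition for noisy data may differ.

```python
import numpy as np

def natural_neighbor_search(X):
    """Parameter-free neighbor search: grow the neighborhood size k round by
    round, stopping once every point appears in at least one other point's
    k-NN list (a simple 'stable state' criterion).  Returns the terminal k
    and, per point, the set of mutual (natural) neighbors found so far."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    order = np.argsort(dist, axis=1)          # column 0 is the point itself
    knn = [set() for _ in range(n)]           # i's current k-NN list
    in_degree = np.zeros(n, dtype=int)        # times i occurs in others' lists
    k = 0
    while k + 1 < n:
        k += 1
        for i in range(n):
            j = int(order[i, k])              # i's k-th nearest neighbor
            knn[i].add(j)
            in_degree[j] += 1
        if np.all(in_degree > 0):             # every point reached by someone
            break
    # natural neighbors = mutual k-NN relations at the terminal k
    natural = [{j for j in knn[i] if i in knn[j]} for i in range(n)]
    return k, natural
```

The terminal k plays the role of an adaptively found neighborhood size, so no K has to be supplied by the user; outliers push k upward because they are reached only late.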

Key words: K-nearest neighbor, Prototype selection, Natural neighbor, Minimum spanning tree, Classification
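The MST-based retention step can likewise be sketched. This is a deliberate simplification, not the paper's algorithm: it builds a single MST over the whole dataset (the paper builds several under termination conditions), keeps both endpoints of every MST edge whose endpoints carry different class labels as boundary prototypes, and keeps one medoid per class as an internal prototype. The name `select_prototypes` and the medoid rule are assumptions, and SciPy is assumed available.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def select_prototypes(X, y):
    """Keep boundary prototypes (endpoints of MST edges that cross class
    labels) plus one internal representative (medoid) per class."""
    d = squareform(pdist(X))                  # dense pairwise distances
    mst = minimum_spanning_tree(d).tocoo()    # n-1 undirected MST edges
    keep = set()
    for i, j in zip(mst.row, mst.col):
        if y[i] != y[j]:                      # edge crosses the decision boundary
            keep.update((int(i), int(j)))
    for c in np.unique(y):                    # one internal prototype per class
        idx = np.where(y == c)[0]
        medoid = idx[d[np.ix_(idx, idx)].sum(axis=1).argmin()]
        keep.add(int(medoid))
    return np.array(sorted(keep))
```

A KNN classifier trained on `X[keep]` then approximates the decision boundary of the full dataset with far fewer stored instances, which is the point of prototype selection.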

[1] FAYED H, ATIYA A F. A novel template reduction approach for the K-nearest neighbor method[J]. IEEE Transactions on Neural Networks, 2009, 20(5): 890-896.
[2] LI J, WANG Y. A new fast reduction technique based on binary nearest neighbor tree[J]. Neurocomputing, 2015, 149: 1647-1657.
[3] OLVERA-LÓPEZ J A, CARRASCO-OCHOA J A, MARTÍNEZ-TRINIDAD J F. A new fast prototype selection method based on clustering[J]. Pattern Analysis and Applications, 2010, 13(2): 131-141.
[4] HART P E. The condensed nearest neighbor rule[J]. IEEE Transactions on Information Theory, 1968, 14(3): 515-516.
[5] CHOU C H, KUO B H, CHANG F. The generalized condensed nearest neighbor rule as a data reduction method[C]∥18th International Conference on Pattern Recognition (ICPR 2006). IEEE, 2006: 556-559.
[6] ANGIULLI F. Fast nearest neighbor condensation for large data sets classification[J]. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(11): 1450-1464.
[7] WILSON D L. Asymptotic properties of nearest neighbor rules using edited data[J]. IEEE Transactions on Systems, Man and Cybernetics, 1972, SMC-2(3): 408-421.
[8] WILSON D R, MARTINEZ T R. Reduction techniques for instance-based learning algorithms[J]. Machine Learning, 2000, 38(3): 257-286.
[9] ZOU X, ZHU Q, JIN Y. An adaptive neighborhood graph for LLE algorithm without free-parameter[J]. International Journal of Computer Applications, 2011, 16(2): 20-23.
[10] ZOU X L, ZHU Q S. Abnormal structure in regular data revealed by Isomap with natural nearest neighbor[M]∥Advances in Computer Science, Environment, Ecoinformatics, and Education. Springer Berlin Heidelberg, 2011: 538-544.
[11] ZHU Q S, ZHANG Y, LIU H J. Classification algorithm based on natural nearest neighbor[J]. Journal of Information and Computational Science, 2015, 12(2): 573-580.
