Computer Science ›› 2014, Vol. 41 ›› Issue (2): 166-169.

• CCML 2013 •

SMwKnn: Mutual k Nearest Neighbours Algorithm Based on Class Subspace and Distance-weighted

LU Wei-sheng, GUO Gong-de, YAN Xuan-hui and CHEN Li-fei

  1. School of Mathematics and Computer Science, Fujian Normal University, Fuzhou 350007
  • Online: 2018-11-14 Published: 2018-11-14
  • Funding:
    Supported by the National Natural Science Foundation of China (61070062, 61175123) and the Major Science and Technology Project of University-Industry Cooperation in Fujian Province (2010H6007)

Abstract: The mutual k nearest neighbours classifier (mKnnc) improves on the k nearest neighbours classifier (KNN) by applying the mutual k nearest neighbours principle to remove noise from both the training samples and the k nearest neighbours, which improves classification performance. However, this noise elimination does not take the class label into account, so genuine, valid data may be discarded as noise, which degrades classification. The proposed mutual k nearest neighbours algorithm based on class subspace and distance weighting (SMwKnn) takes the distance weights of the neighbours into account. It removes the influence of redundant or useless attributes on the similarity measure that nearest neighbour classification depends on, and it also does a better job of eliminating noisy points among the neighbours. Experimental results on UCI public datasets verify the effectiveness of the proposed algorithm.

Key words: Class subspace, Mutual k nearest neighbour, Distance weighted, Subspace

CLC number: TP391  Document code: A
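The two ingredients the abstract names, mutual k nearest neighbour noise elimination and distance-weighted voting, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`knn_indices`, `mutual_knn_filter`, `predict`) are hypothetical, plain Euclidean distance is assumed, the vote weights follow the style of Dudani's distance-weighted rule, and a training point is treated as noise when it has no mutual neighbour, which is one common reading of the principle. The class-subspace attribute weighting is omitted because the abstract does not give its formula, and only the training set is filtered here, not the query's neighbour list.

```python
import math
from collections import defaultdict


def knn_indices(query, points, k, skip=None):
    """Indices of the k points closest to `query` (Euclidean), ignoring index `skip`."""
    order = sorted(
        (i for i in range(len(points)) if i != skip),
        key=lambda i: math.dist(query, points[i]),
    )
    return order[:k]


def mutual_knn_filter(points, k):
    """Indices of training points that have at least one mutual k nearest neighbour."""
    neighbours = [set(knn_indices(p, points, k, skip=i)) for i, p in enumerate(points)]
    return [
        i for i in range(len(points))
        # i and j are mutual neighbours if each is in the other's k-NN set.
        if any(i in neighbours[j] for j in neighbours[i])
    ]


def predict(query, points, labels, k):
    """Distance-weighted vote among the k nearest training points kept by the filter."""
    # Assumption: fall back to all points if the filter removes everything.
    kept = mutual_knn_filter(points, k) or list(range(len(points)))
    kept_points = [points[i] for i in kept]
    near = knn_indices(query, kept_points, k)
    dists = [math.dist(query, kept_points[j]) for j in near]
    d_min, d_max = dists[0], dists[-1]
    votes = defaultdict(float)
    for j, d in zip(near, dists):
        # Dudani-style weight: the nearest neighbour gets 1, the farthest gets 0.
        w = 1.0 if d_max == d_min else (d_max - d) / (d_max - d_min)
        votes[labels[kept[j]]] += w
    return max(votes, key=votes.get)


# Two tight clusters plus one isolated point carrying a misleading label.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (20, 20)]
labels = ["a", "a", "a", "b", "b", "b", "a"]
print(mutual_knn_filter(points, 2))
print(predict((19, 19), points, labels, 2))
```

In this toy data the isolated point at (20, 20) is nobody's nearest neighbour, so the filter drops it and a query near it is classified from the remaining points. This also illustrates the weakness the abstract points out: the filter never looks at class labels, so a valid but isolated sample is discarded in exactly the same way as a noisy one.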

