Computer Science (计算机科学), 2015, Vol. 42, Issue (10): 281-286.

• Artificial Intelligence •

Weighted KNN Data Classification Algorithm Based on Rough Set

LIU Ji-yu, WANG Qiang, LUO Zhao-hui, SONG Hao and ZHANG Lv-yun

  1. College of Computer Science and Information Engineering, Guangxi Normal University, Guilin 541004 (listed for the first four authors); College of Computer and Information Engineering, Hechi University, Yizhou 546300 (listed for the fifth author)
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    This work was supported by the Regional Program of the National Natural Science Foundation of China (61165009) and the National Natural Science Foundation of China (61365009).

Abstract: Rough set theory is one of the basic methods for handling imprecise or uncertain problems. Because it requires no prior knowledge of the data set under analysis and no manually set parameters, it is widely used in pattern recognition and data mining. A core problem for rough set theory is how to classify a sample that was never encountered during training, and this paper discusses that problem in detail. Weighting coefficients are determined from the importance of the condition attributes, and a weighted KNN algorithm is proposed to classify samples that cannot be matched exactly to any decision rule. The algorithm is compared experimentally with the weighted minimum distance (WMD) method. Existing attribute value reduction algorithms for rough sets are also analyzed, and a different point of view is put forward. Experiments on several UCI data sets, together with performance comparisons against various recently published algorithms, show that the proposed algorithm is superior to those algorithms in overall effect.

Key words: Rough set, Weighted KNN, Weighted minimum distance, Attribute value reduction
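
The abstract gives no formulas, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that attribute importance is the standard rough set significance (the drop in the dependency degree when one condition attribute is removed), that the weights are these significances normalized to sum to 1, and that distance is a weighted mismatch count over discrete attribute values. The function names `dependency`, `attribute_weights`, and `weighted_knn`, and the toy decision table, are hypothetical.

```python
from collections import Counter, defaultdict


def dependency(rows, labels, attrs):
    """Dependency degree: fraction of samples whose values on `attrs`
    determine the decision label unambiguously (the positive region)."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[tuple(row[a] for a in attrs)].append(label)
    consistent = sum(len(g) for g in groups.values() if len(set(g)) == 1)
    return consistent / len(rows)


def attribute_weights(rows, labels):
    """Assumed weighting: significance of each condition attribute,
    normalized to sum to 1 (uniform weights if every significance is 0)."""
    n_attrs = len(rows[0])
    all_attrs = list(range(n_attrs))
    full = dependency(rows, labels, all_attrs)
    sig = [
        full - dependency(rows, labels, [a for a in all_attrs if a != i])
        for i in all_attrs
    ]
    total = sum(sig)
    if total == 0:
        return [1.0 / n_attrs] * n_attrs
    return [s / total for s in sig]


def weighted_knn(rows, labels, weights, sample, k=3):
    """Majority vote among the k training rows closest to `sample`,
    where distance is the summed weight of mismatching attributes."""
    dists = sorted(
        (sum(w for w, x, y in zip(weights, row, sample) if x != y), idx)
        for idx, row in enumerate(rows)
    )
    votes = Counter(labels[idx] for _, idx in dists[:k])
    return votes.most_common(1)[0][0]


# Toy decision table (hypothetical data): three discrete condition attributes.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 0, 1)]
labels = ["A", "A", "B", "B", "A", "A"]

weights = attribute_weights(rows, labels)
# (1, 0, 1) does not appear in the table, so no exact rule match exists.
print(weights, weighted_knn(rows, labels, weights, (1, 0, 1), k=3))
```

In this toy table the second attribute is redundant and receives weight 0, so it does not influence which neighbors are selected. That is the intended effect of weighting by attribute importance in this sketch.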

