Computer Science ›› 2013, Vol. 40 ›› Issue (6): 225-228.

• Artificial Intelligence •

Ensemble Feature Selection Based on Normalized Mutual Information and Diversity

YAO Xu, WANG Xiao-dan, ZHANG Yu-xi, XUE Ai-jun

  1. Air and Missile Defense College, Air Force Engineering University, Xi'an 710051, China
  • Online: 2018-11-16 Published: 2018-11-16
  • Supported by:
    The National Natural Science Foundation of China (60975026,5)



Abstract: Generating base classifiers with high diversity is a central problem in ensemble learning. To address it, an iterative selection algorithm is proposed: an optimal feature subset is extracted by maximizing normalized mutual information, and a base classifier is trained on that subset; the resulting classifier is then evaluated with a diversity measure based on the number of misclassified samples. If the stopping criterion is satisfied, the algorithm terminates; otherwise it iterates until the end condition is reached. Finally, a weighted voting method fuses the recognition results of the selected base classifiers. To validate the approach, experiments were conducted on UCI data sets with support vector machines as base classifiers, comparing the method with a single SVM, the classical Bagging ensemble (Bagging-SVM), and the attribute-bagging ensemble (AB-SVM). Experimental results show that the proposed method achieves higher classification accuracy.
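The loop described in the abstract — select features by normalized mutual information, train a base classifier, accept it based on its misclassification count, then fuse by weighted voting — can be sketched as follows. This is a minimal illustrative sketch, not the authors' exact algorithm: the candidate-pool size, subset size, acceptance threshold, and iteration count are all assumptions made for the example.

```python
# Illustrative sketch of NMI-based ensemble feature selection with a
# misclassification-based acceptance test and weighted-vote fusion.
# Pool size, subset size, threshold, and iteration count are assumed values.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import normalized_mutual_info_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)  # a UCI-style binary data set
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def nmi_rank(X, y, bins=10):
    """Rank features by normalized mutual information with the class label."""
    scores = []
    for j in range(X.shape[1]):
        binned = np.digitize(X[:, j], np.histogram_bin_edges(X[:, j], bins))
        scores.append(normalized_mutual_info_score(y, binned))
    return np.argsort(scores)[::-1]  # best features first

rank = nmi_rank(X_tr, y_tr)
ensemble, weights = [], []
rng = np.random.default_rng(0)
for _ in range(5):  # iterate: propose and screen candidate base classifiers
    subset = rng.choice(rank[:15], size=8, replace=False)  # subset of top-NMI pool
    clf = SVC(kernel="rbf", gamma="scale").fit(X_tr[:, subset], y_tr)
    miss = np.sum(clf.predict(X_tr[:, subset]) != y_tr)  # misclassified-sample count
    if miss < 0.3 * len(y_tr):  # acceptance criterion (assumed threshold)
        ensemble.append((clf, subset))
        weights.append(1.0 - miss / len(y_tr))

def vote(X):
    """Fuse accepted base classifiers by weighted majority voting."""
    votes = np.zeros((len(X), 2))
    for (clf, subset), w in zip(ensemble, weights):
        for i, p in enumerate(clf.predict(X[:, subset])):
            votes[i, p] += w
    return votes.argmax(axis=1)

acc = np.mean(vote(X_te) == y_te)
```

Ranking features by NMI rather than raw mutual information compensates for the bias toward many-valued features, while the misclassification count serves here as a simple stand-in for the paper's diversity measure.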

Key words: Ensemble learning, Ensemble feature selection, Mutual information, Diversity

