Computer Science (计算机科学), 2015, Vol. 42, Issue (3): 241-244. doi: 10.11896/j.issn.1002-137X.2015.03.050
MENG Jun, YU Shuang-yun
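Abstract: In high-dimensional data, the class label is often closely associated with only a small fraction of the features. To address this, an ensemble learning method with random feature selection based on rank aggregation and cluster grouping is proposed. Rank aggregation is first used to filter the features and retain those relevant to sample classification. With the bicor coefficient (biweight midcorrelation) as the measure of association, the affinity propagation clustering algorithm then partitions the retained features into groups so that features in different groups are not associated with each other. One feature is drawn at random from each group to form a feature subset, and repeating the draw yields multiple feature subsets that are both diverse and discriminative. Finally, a base classifier is trained in the feature subspace of each subset, and the base classifiers are combined by majority voting. Experimental results on 7 gene expression datasets show that the proposed method achieves low classification error, stable classification performance, and good scalability.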
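The later stages of the pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature groups are assumed to be given already (the paper obtains them with affinity propagation on bicor), the rank aggregation filter is omitted, and a nearest-centroid rule stands in for the base classifier. All function names and the toy data are hypothetical. The `bicor` function follows the standard definition of the biweight midcorrelation.

```python
import random
from collections import Counter
from statistics import median


def bicor(x, y):
    """Biweight midcorrelation of two equal-length numeric sequences."""
    def weighted_dev(v):
        med = median(v)
        mad = median([abs(a - med) for a in v])
        if mad == 0:
            raise ValueError("MAD is zero; bicor is undefined for this vector")
        dev = []
        for a in v:
            u = (a - med) / (9.0 * mad)
            # Points with |u| >= 1 are treated as outliers and get weight 0.
            w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
            dev.append((a - med) * w)
        norm = sum(d * d for d in dev) ** 0.5
        return [d / norm for d in dev]
    return sum(a * b for a, b in zip(weighted_dev(x), weighted_dev(y)))


def draw_subsets(groups, n_subsets, seed=0):
    """Build feature subsets by picking one random feature index per group."""
    rng = random.Random(seed)
    return [[rng.choice(g) for g in groups] for _ in range(n_subsets)]


def fit_centroids(X, y, features):
    """Stand-in base learner: per-class mean over the chosen features."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict_one(centroids, features, row):
    """Assign the class whose centroid is nearest in the feature subspace."""
    def dist(c):
        return sum((row[f] - m) ** 2 for f, m in zip(features, centroids[c]))
    return min(centroids, key=dist)


def ensemble_predict(X_train, y_train, groups, row, n_subsets=5, seed=0):
    """Train one base classifier per feature subset and take a majority vote."""
    votes = Counter()
    for feats in draw_subsets(groups, n_subsets, seed):
        model = fit_centroids(X_train, y_train, feats)
        votes[predict_one(model, feats, row)] += 1
    return votes.most_common(1)[0][0]


# Toy data: 4 samples, 4 features; features {0, 1} and {2, 3} form two groups.
X = [[0.1, 0.2, 5.0, 5.1],
     [0.0, 0.3, 5.2, 4.9],
     [1.0, 1.1, 1.0, 0.9],
     [0.9, 1.2, 1.1, 1.0]]
y = ["A", "A", "B", "B"]
groups = [[0, 1], [2, 3]]

label = ensemble_predict(X, y, groups, [0.05, 0.25, 5.1, 5.0])
```

Because each subset takes exactly one feature per group, the base classifiers see different but mutually weakly associated features, which is the source of diversity the abstract refers to.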