Computer Science (计算机科学) ›› 2014, Vol. 41 ›› Issue (10): 291-294. doi: 10.11896/j.issn.1002-137X.2014.10.061
陈涛,洪增林,邓方安
CHEN Tao, HONG Zeng-lin and DENG Fang-an
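Abstract: DNA microarray technology can measure the activity of thousands of genes in a cell simultaneously and is widely used in the clinical diagnosis of major genetic diseases. However, microarray data are typically high-dimensional with small sample sizes, and they contain many noisy and redundant genes. To further improve classification performance on microarray data, a hybrid feature gene selection algorithm is proposed. First, the ReliefF algorithm removes a large number of irrelevant genes to obtain a candidate subset of feature genes. Next, a neighborhood rough set model optimized by a differential evolution algorithm performs feature gene selection. Finally, a support vector machine is used for classification to verify the effectiveness of the algorithm. Simulation results show that the algorithm achieves higher classification accuracy with as few feature genes as possible, improving both generalization performance and time efficiency, and that it offers an important reference for the clinical diagnosis of disease-causing genes.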
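The neighborhood rough set step can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so the code below assumes the common definition of the neighborhood dependency degree: the fraction of samples whose delta-neighborhood, measured on the selected genes, contains only samples of the same class. The function names, the Euclidean distance, the value of `delta`, and the toy data are all illustrative assumptions. The greedy forward search is a plain stand-in for the differential evolution optimization the paper describes, and the ReliefF filtering and SVM classification stages are omitted.

```python
import math


def neighborhood_dependency(X, y, features, delta):
    """Fraction of samples whose delta-neighborhood on `features` is class-pure.

    Assumed definition (not taken from the paper): a sample is in the positive
    region when every sample within Euclidean distance `delta` has its label.
    """
    if not features:
        return 0.0
    n = len(X)
    consistent = 0
    for i in range(n):
        pure = True
        for j in range(n):
            dist = math.sqrt(sum((X[i][f] - X[j][f]) ** 2 for f in features))
            if dist <= delta and y[j] != y[i]:
                pure = False
                break
        if pure:
            consistent += 1
    return consistent / n


def forward_reduct(X, y, candidates, delta):
    """Greedy forward selection: add the feature that most raises dependency.

    Stand-in for the paper's differential evolution search (an assumption).
    Stops as soon as no remaining feature improves the dependency degree.
    """
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        scored = [(neighborhood_dependency(X, y, selected + [f], delta), f)
                  for f in remaining]
        score, feature = max(scored, key=lambda t: t[0])
        if score <= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best


# Toy expression matrix (hypothetical): gene 0 separates the two classes,
# gene 1 is noise, gene 2 is constant.
X = [
    [0.10, 0.50, 0.3],
    [0.15, 0.90, 0.3],
    [0.20, 0.10, 0.3],
    [0.80, 0.55, 0.3],
    [0.85, 0.15, 0.3],
    [0.90, 0.95, 0.3],
]
y = [0, 0, 0, 1, 1, 1]

selected, dependency = forward_reduct(X, y, candidates=[0, 1, 2], delta=0.2)
print(selected, dependency)
```

On this toy data the search keeps only gene 0, because it alone makes every neighborhood class-pure. In the pipeline described above, `candidates` would be the genes that survive the ReliefF filter, and the selected genes would then be passed to a support vector machine.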