Computer Science ›› 2013, Vol. 40 ›› Issue (7): 239-243.

• Artificial Intelligence •

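Feature Gene Selection Based on Improved Binary Particle Swarm Optimization Algorithm and its Application in Detection of Colon Cancer

CHAI Xin, SUN Jing-yao, GUO Lei, WU You-xi

  1. School of Computer Science and Software, Hebei University of Technology, Tianjin 300401; School of Computer Science and Software, Hebei University of Technology, Tianjin 300401; Hebei Province Key Laboratory of Electromagnetic Field and Electrical Apparatus Reliability, Hebei University of Technology, Tianjin 300130; School of Computer Science and Software, Hebei University of Technology, Tianjin 300401
  • Online: 2018-11-16 Published: 2018-11-16
  • Supported by:
    This work was supported by the Natural Science Foundation of Hebei Province (H2012202035), the Key Project of the Hebei Provincial Department of Education (ZH2012038), and the Youth Fund Project for Colleges and Universities of Hebei Province (SQ121006).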

Abstract: Binary particle swarm optimization (BPSO) is prone to becoming trapped in local optima. To address this defect, an improved binary particle swarm optimization (IBPSO) algorithm is proposed. The algorithm introduces the crossover and mutation strategies of genetic algorithms to increase population diversity and avoid premature convergence of the particles. It also adopts the vaccine mechanism of immune algorithms: vaccine extraction, vaccination, and immune selection are used to suppress population degradation. For feature gene selection, the Wilcoxon rank-sum test is first used to obtain a preselected feature subset of the genes that contribute most to classification. IBPSO is then used to optimize both the gene feature subset and the parameters of a support vector machine (SVM). Finally, the method is applied to the colon cancer detection problem. The experimental results show that the method achieves higher classification accuracy with fewer feature genes than some other approaches, and that the selected feature genes are closely related to colon cancer, which further supports the feasibility and effectiveness of the method.

Key words: Feature selection, Particle swarm optimization algorithm, Support vector machine, Wilcoxon rank-sum test

CLC Number: TP181, TP391.4  Document code: A
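The abstract gives no update formulas or parameter values, so the following is only a minimal sketch of the general idea: a standard binary PSO (sigmoid transfer function on the velocity) with a genetic-algorithm style bit-flip mutation added to keep the swarm diverse. The function names, all parameter values (`w`, `c1`, `c2`, `v_max`, `p_mut`), and the toy fitness function are assumptions made for illustration. Crossover, the vaccine mechanism, the Wilcoxon rank-sum pre-filter, and the SVM-based fitness described in the abstract are not reproduced here.

```python
import math
import random


def sigmoid(v):
    """Map a velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def bpso_with_mutation(fitness, n_bits, n_particles=20, n_iters=50,
                       w=0.8, c1=2.0, c2=2.0, v_max=4.0,
                       p_mut=0.02, seed=0):
    """Maximise `fitness` over bit vectors of length `n_bits`.

    Binary PSO plus a bit-flip mutation step. Every default value is an
    illustrative assumption, not a setting taken from the paper.
    """
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                vel[i][d] = max(-v_max, min(v_max, v))
                # Binary position update: the bit is 1 with prob. sigmoid(v).
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
                # Bit-flip mutation to preserve diversity.
                if rng.random() < p_mut:
                    pos[i][d] ^= 1
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical stand-in for a classifier-based fitness: reward selecting
# three "informative" positions and penalise the size of the subset.
INFORMATIVE = {0, 3, 7}


def toy_fitness(bits):
    hits = sum(1 for i in INFORMATIVE if bits[i])
    return hits - 0.1 * sum(bits)


best, best_fit = bpso_with_mutation(toy_fitness, n_bits=12)
print(best, best_fit)
```

In a gene selection setting, each bit would mark whether a preselected gene is kept, and `toy_fitness` would be replaced by a measure such as cross-validated classifier accuracy, possibly combined with a penalty on the number of selected genes.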

