Computer Science ›› 2015, Vol. 42 ›› Issue (6): 37-40. doi: 10.11896/j.issn.1002-137X.2015.06.008

• The 10th Joint Conference on Harmonious Human Machine Environment •

Gene Microarray Data Classification Based on Intersecting Neighborhood Rough Set

MENG Jun, LI Rui and HAO Han

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    This work was supported by the Natural Science Foundation of Liaoning Province (20130200029)

Abstract: In research on feature selection and classification of gene microarray data, rough set theory is an effective tool because it can eliminate redundant genes. However, the traditional rough set model cannot handle continuous numeric data well, and discretization may cause information loss. To address this, we propose an attribute reduction algorithm based on an intersecting neighborhood rough set model: the distance neighborhood used in traditional rough sets is extended to an intersecting neighborhood, and approximations are defined in a set-based way to build the rough set model. Experimental results on three cancer data sets show that the rough set model based on set approximation and intersecting neighborhood achieves good classification performance. In addition, GO term analysis of the selected genes further supports the validity of the model.

Key words: Rough set, Intersecting neighborhood, Gene microarray data
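
The abstract does not give the formal definition of the intersecting neighborhood, so the following is only a minimal sketch of the distance-neighborhood baseline that the proposed model extends, not the authors' algorithm. It assumes a Chebyshev-distance neighborhood with radius `delta`, measures how many samples have a label-pure neighborhood (the dependency degree), and greedily adds genes while that degree improves. All function names, the toy expression matrix, and the value of `delta` are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def neighborhood(data: Sequence[Sequence[float]], i: int,
                 attrs: Sequence[int], delta: float) -> Set[int]:
    """Samples within Chebyshev distance delta of sample i on the given attributes."""
    return {
        j for j in range(len(data))
        if all(abs(data[i][a] - data[j][a]) <= delta for a in attrs)
    }


def dependency(data: Sequence[Sequence[float]], labels: Sequence[int],
               attrs: Sequence[int], delta: float) -> float:
    """Fraction of samples whose neighborhood contains a single class only."""
    if not attrs:
        return 0.0
    positive = sum(
        1 for i in range(len(data))
        if all(labels[j] == labels[i]
               for j in neighborhood(data, i, attrs, delta))
    )
    return positive / len(data)


def greedy_reduct(data: Sequence[Sequence[float]], labels: Sequence[int],
                  delta: float) -> Tuple[List[int], float]:
    """Forward selection: add the attribute with the best dependency until no gain."""
    remaining = list(range(len(data[0])))
    selected: List[int] = []
    best = 0.0
    while remaining:
        # Ties on the score are broken in favour of the lowest attribute index.
        score, _, attr = max(
            (dependency(data, labels, selected + [a], delta), -a, a)
            for a in remaining
        )
        if score <= best:
            break
        selected.append(attr)
        remaining.remove(attr)
        best = score
    return selected, best


# Hypothetical toy data: 6 samples x 3 genes, two classes.
data = [
    [0.10, 0.50, 0.30],
    [0.15, 0.90, 0.35],
    [0.20, 0.10, 0.80],
    [0.80, 0.55, 0.25],
    [0.85, 0.15, 0.75],
    [0.90, 0.95, 0.85],
]
labels = [0, 0, 0, 1, 1, 1]

reduct, gamma = greedy_reduct(data, labels, delta=0.2)
```

On this toy matrix only gene 0 separates the two classes, so the greedy search stops after selecting it. Working directly on continuous values in this way is what lets neighborhood-based models avoid the discretization step mentioned in the abstract.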

