Computer Science ›› 2015, Vol. 42 ›› Issue (11): 251-255. doi: 10.11896/j.issn.1002-137X.2015.11.051

• Artificial Intelligence •

Iterative Imputation Algorithm Based on Reduced Relational Grade for Gene Expression Data

HE Yun, PI De-chang

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    the National Natural Science Foundation of China (U1433116), the Jiangsu Province "333" High-Level Talent Project, and the Aeronautical Science Foundation (20145752033)

Abstract: Gene expression data frequently contain missing values, which hinders research on gene expression. This paper proposes a new similarity measure, the reduced relational grade, and on that basis an iterative imputation algorithm for missing data (RKNNimpute). The reduced relational grade is an improvement on the grey relational grade: it achieves the same effect while significantly reducing the time complexity of the algorithm. RKNNimpute uses the reduced relational grade as its similarity measure, adds each imputed gene to the candidate set of nearest-neighbor genes, and imputes the remaining missing data iteratively, which improves both the imputation quality and the performance of the algorithm. Extensive experiments on time-series, non-time-series, and mixed gene expression data sets were carried out to evaluate RKNNimpute. The results show that the reduced relational grade is an efficient distance measure and that RKNNimpute outperforms conventional imputation algorithms.

Key words: Gene expression data, Reduced relational grade, Imputation, Iteration, Missing value
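
The iterative scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula for the reduced relational grade, so this sketch substitutes the standard grey relational grade (with distinguishing coefficient `rho = 0.5`) as a stand-in similarity. The function names, the grade-weighted average, and the rule of imputing genes with the fewest missing values first are all assumptions made for illustration, not the authors' RKNNimpute implementation.

```python
import math


def grey_relational_grades(target, candidates, rho=0.5):
    """Grey relational grade of each complete candidate row to `target`,
    computed over the columns where `target` is observed.
    Stand-in similarity: NOT the paper's reduced relational grade."""
    cols = [k for k, v in enumerate(target) if not math.isnan(v)]
    if not cols or not candidates:
        raise ValueError("need an observed value and at least one candidate")
    diffs = [[abs(target[k] - c[k]) for k in cols] for c in candidates]
    flat = [d for row in diffs for d in row]
    dmin, dmax = min(flat), max(flat)
    if dmax == 0:  # every candidate matches the target exactly
        return [1.0] * len(candidates)
    return [
        sum((dmin + rho * dmax) / (d + rho * dmax) for d in row) / len(row)
        for row in diffs
    ]


def iterative_knn_impute(data, k=2, rho=0.5):
    """Impute NaN entries gene by gene (one gene per row). Each imputed
    gene joins the candidate pool used for the genes imputed after it."""
    rows = [list(r) for r in data]  # work on a copy
    pool = [i for i, r in enumerate(rows)
            if not any(math.isnan(v) for v in r)]
    incomplete = [i for i in range(len(rows)) if i not in pool]
    # Assumed ordering: genes with the fewest missing values go first.
    incomplete.sort(key=lambda i: sum(math.isnan(v) for v in rows[i]))
    for i in incomplete:
        cands = [rows[j] for j in pool]
        grades = grey_relational_grades(rows[i], cands, rho)
        # Indices of the k most similar candidates (highest grade first).
        top = sorted(range(len(cands)),
                     key=lambda j: grades[j], reverse=True)[:k]
        wsum = sum(grades[j] for j in top)
        for col, v in enumerate(rows[i]):
            if math.isnan(v):
                # Grade-weighted average of the neighbours' values.
                rows[i][col] = sum(grades[j] * cands[j][col]
                                   for j in top) / wsum
        pool.append(i)  # the imputed gene becomes a candidate neighbour
    return rows


nan = math.nan
expr = [
    [1.0, 2.0, 3.0],
    [10.0, 20.0, 30.0],
    [1.0, 2.0, nan],
    [1.0, nan, 3.0],
]
filled = iterative_knn_impute(expr, k=1)
```

In this toy matrix the third gene is filled from its nearest complete neighbour and then becomes a candidate when the fourth gene is imputed, which is the pool-expansion idea the abstract describes. Swapping in a different similarity only requires replacing `grey_relational_grades`.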

