Computer Science ›› 2016, Vol. 43 ›› Issue (Z6): 219-221. doi: 10.11896/j.issn.1002-137X.2016.6A.053

• Pattern Recognition and Image Processing •

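Research on Single Nucleotide Polymorphism Encoding in Disease Association Studies

ZHAO Jing, WEI Bin and ZHANG Jin

  1. School of Control Engineering, Xijing University, Xi'an 710123; Department of Electronic Technology, Engineering University of the Chinese People's Armed Police Force, Xi'an 710086; School of Control Engineering, Xijing University, Xi'an 710123
  • Published: 2018-12-01 Online: 2018-12-01
  • Funding:
    This work was supported by the Scientific Research Program of the Shaanxi Provincial Education Department (15JK2187), the Scientific Research Fund of Xijing University (XJ140115), and the Basic Research Fund of the Engineering University of the Chinese People's Armed Police Force (WJY201518).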

Abstract: Single nucleotide polymorphisms (SNPs), the third generation of genetic markers, are abundant, widely distributed, and genetically stable, which makes them a foundation for disease-gene association studies and drug design. Most of these studies rely on computational methods, so encoding SNPs appropriately to improve algorithm performance is a critical step; however, few studies have addressed the SNP encoding problem specifically. Building on commonly used SNP representations, and taking into account the characteristics of disease susceptibility research and the associations between SNPs, we propose several new encoding methods. Extensive experiments show that the encoding method has a considerable impact on the performance of disease susceptibility analysis algorithms, and that encodings based on distribution information achieve better results: they describe SNP sequences more faithfully, retain as much as possible of the rich information carried by the original biological sequence, and are better suited to disease susceptibility research.

Key words: Single nucleotide polymorphism, Encoding, Disease susceptibility
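
The abstract does not spell out the encodings it compares, so the following is only a minimal sketch of what SNP encoding can look like in practice. The first two functions show two widely used genotype representations (additive minor-allele count and a three-way indicator). The third, `frequency_encode`, is a hypothetical illustration of an encoding that uses distribution information from the sample; it is an assumption made for this example and is not the authors' method. All function names are invented here.

```python
from collections import Counter


def additive_encode(genotype, minor_allele):
    """Count copies of the minor allele in a two-allele genotype (0, 1 or 2)."""
    return sum(1 for allele in genotype if allele == minor_allele)


def one_hot_encode(genotype, minor_allele):
    """Indicator vector: [homozygous major, heterozygous, homozygous minor]."""
    vec = [0, 0, 0]
    vec[additive_encode(genotype, minor_allele)] = 1
    return vec


def frequency_encode(genotypes):
    """Hypothetical distribution-aware encoding (an assumption, not the paper's).

    Each genotype at one SNP site is replaced by its relative frequency
    in the sample. Allele order is ignored, so "AG" and "GA" are the same.
    """
    normalized = ["".join(sorted(g)) for g in genotypes]
    counts = Counter(normalized)
    total = len(normalized)
    return [counts[g] / total for g in normalized]


if __name__ == "__main__":
    site = ["AA", "AG", "GA", "GG"]  # genotypes of four samples at one SNP
    print([additive_encode(g, "G") for g in site])
    print([one_hot_encode(g, "G") for g in site])
    print(frequency_encode(site))
```

The additive form is compact but imposes a linear ordering on genotypes, whereas the indicator form makes no such assumption at the cost of three features per SNP. Which trade-off suits a given disease susceptibility analysis is the kind of question the paper investigates.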

