Computer Science (计算机科学), 2015, Vol. 42, Issue 4: 177-180. doi: 10.11896/j.issn.1002-137X.2015.04.035

• Artificial Intelligence •

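Predicting Potential Drug Targets from Ion Channel Proteins Based on Ensemble Learning

XIE Qian-qian, LI Ding-fang and ZHANG Wen

  1. School of Mathematics and Statistics, Wuhan University, Wuhan 430072; School of Mathematics and Statistics, Wuhan University, Wuhan 430072; Shenzhen Research Institute, Wuhan University, Shenzhen 518057
  • Online: 2018-11-14  Published: 2018-11-14
  • Supported by:
    the National Natural Science Foundation of China (61271337,6), the Doctoral Program Foundation of the Ministry of Education (20100141120049), the Natural Science Foundation of Hubei Province (2011CDB454), and the Shenzhen Special Fund for the Development of Strategic Emerging Industries (JCYJ20130401160028781)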

Abstract: Discovering and accurately locating molecular targets is a critical step in the development of new drugs. Among known drug targets, ion channel proteins are a widely favored class; they are closely linked to diseases of the cardiovascular, immune, and central nervous systems. Traditional biological methods for discovering targets are costly and time-consuming. This work therefore explores mining potential ion channel drug targets with machine learning based on random forests, with the aim of speeding up target discovery and reducing cost. Because the sequences associated with drug targets vary in length, thirteen types of protein sequence encoding features are considered, each of which transforms variable-length protein sequences into fixed-length vectors. Numerical experiments are used to select a feature subset that discriminates well between targets and non-targets, and ensemble learning is used to integrate these features into a prediction model. Comparison with existing methods shows that the proposed ensemble model attains relatively high accuracy and is promising for mining new drug targets.

Key words: Ion channel, Random forests, Drug targets, Classifiers, Ensemble learning
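
The abstract states that sequence encoding features turn protein sequences of different lengths into representations of equal length, but it does not name the thirteen feature types. As a minimal sketch of the idea only, the Python below uses two common encodings chosen here as assumptions: amino acid composition (20 values) and dipeptide composition (400 values). The function names and the example sequences are hypothetical and are not taken from the paper.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return a 20-dimensional vector of residue frequencies for one sequence."""
    # Non-standard symbols such as X are skipped (an assumption of this sketch).
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-dimensional vector of adjacent-residue-pair frequencies."""
    seq = seq.upper()
    valid = set(DIPEPTIDES)
    # Overlapping pairs of neighbouring residues; pairs with non-standard symbols are skipped.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if p in valid]
    total = len(pairs)
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    counts = {}
    for p in pairs:
        counts[p] = counts.get(p, 0) + 1
    return [counts.get(d, 0) / total for d in DIPEPTIDES]


# Two made-up sequences of different lengths (hypothetical, not from the paper's data set).
short_seq = "MKTAYIAK"
long_seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

for s in (short_seq, long_seq):
    aac = amino_acid_composition(s)
    dpc = dipeptide_composition(s)
    # Sequence length differs, but each encoding always has the same dimension.
    print(len(s), len(aac), len(dpc))
```

Whatever the input length, each encoding yields a vector of fixed dimension, which is what allows a standard classifier such as a random forest to be trained on the encoded sequences.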

