Computer Science (计算机科学), 2021, Vol. 48, Issue (10): 127-134. doi: 10.11896/jsjkx.200700068

• Artificial Intelligence

Drug Target Interaction Prediction Method Based on Graph Convolutional Neural Network

GAO Chuang (高创)1, LI Jian-hua (李建华)1,2, JI Xiu-yi (季秀怡)1, ZHU Cheng-long (朱程龙)1, LI Shi-liang (李诗良)2, LI Hong-lin (李洪林)2

  1 College of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  2 Shanghai Key Laboratory of New Drug Design, Shanghai 200237, China
  • Received: 2020-07-10 Revised: 2020-08-16 Online: 2021-10-15 Published: 2021-10-18
  • Corresponding author: LI Jian-hua (jhli@ecust.edu.cn)
  • About author: gaochuang0814@163.com
    GAO Chuang, born in 1995, master. His main research interests include graph convolutional neural networks and recommendation systems.
    LI Jian-hua, born in 1977, Ph.D., associate professor, is a member of China Computer Federation. His main research interests include computer drug design and data mining.
  • Supported by:
    National Key R&D Program of China (2016YFA0502304) and National Major Scientific and Technological Special Project for "Significant New Drugs Development" (2018ZX09735002).

Abstract: Drug-target interaction prediction plays an important role in drug discovery and drug repositioning. However, existing machine learning methods still show insufficient predictive performance on data with highly imbalanced positive and negative samples. To address this, a computational method based on graph convolutional neural networks (GCN) for predicting drug-target interactions is proposed. The method first constructs a heterogeneous information network that integrates diverse drug-related and target-related information. A GCN is then applied to this network to learn low-dimensional vector representations that capture the topological features of each node and the feature information of its neighbors. Finally, interaction probability scores between nodes are predicted from these representations via a vector space projection scheme. In comparative experiments on the DrugBank_FDA and Yammanishi_08 datasets, the AUPR (Area Under the Precision-Recall Curve) values of the proposed method are higher than those of four other existing methods, and the method also performs well on larger datasets. The experimental results indicate that the proposed method improves drug-target interaction prediction on datasets with highly imbalanced samples. Furthermore, previously unknown drug-target interactions predicted by the proposed method are validated against biomedical databases.

Key words: Drug-target interactions, Graph convolutional neural networks, Heterogeneous information network, Machine learning, Vector representation
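
As a rough illustration of the pipeline summarized in the abstract (network construction, graph convolution, vector space projection, AUPR evaluation), the following is a minimal NumPy sketch. It is not the authors' implementation. The toy graph, the collapse of the heterogeneous network into one merged adjacency matrix, the symmetric normalization with self-loops, the random untrained weights, the bilinear projection matrix `p`, and all function names are assumptions made purely for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with added self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)  # >= 1 for non-negative adj because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, h, w):
    """One graph convolution: mix neighbor features, project, apply ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)


def projection_scores(z_drug, z_target, p):
    """Score every drug-target pair as sigmoid(z_d^T P z_t)."""
    logits = z_drug @ p @ z_target.T
    return 1.0 / (1.0 + np.exp(-logits))


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve.

    Assumes y_true is a 0/1 array containing at least one positive.
    """
    order = np.argsort(-y_score, kind="stable")
    y_sorted = y_true[order]
    hits = np.cumsum(y_sorted)
    precision = hits / np.arange(1, len(y_sorted) + 1)
    return float((precision * y_sorted).sum() / y_sorted.sum())


# Hypothetical toy network: nodes 0-2 are drugs, nodes 3-4 are targets.
# Edges mix drug-drug, target-target, and known drug-target links.
n_drugs, n_targets = 3, 2
adj = np.array([
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0],
], dtype=float)

rng = np.random.default_rng(0)
a_norm = normalize_adjacency(adj)
x = np.eye(n_drugs + n_targets)   # one-hot node features (assumption)
w1 = rng.normal(size=(5, 4))      # untrained layer weights
w2 = rng.normal(size=(4, 3))
z = gcn_layer(a_norm, gcn_layer(a_norm, x, w1), w2)  # 3-dim node embeddings

p = rng.normal(size=(3, 3))       # untrained projection matrix
scores = projection_scores(z[:n_drugs], z[n_drugs:], p)

known = adj[:n_drugs, n_drugs:]   # known drug-target links as labels
print(scores.round(3))
print(average_precision(known.ravel(), scores.ravel()))
```

Because the weights here are random, the printed scores carry no meaning. In a trained model the layer weights and the projection matrix would be fitted to the known interactions. Average precision is used as the AUPR estimate because it depends only on how the few positives are ranked, which suits data where negatives vastly outnumber positives.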

CLC Number: TP391