Computer Science, 2021, Vol. 48, Issue (9): 251-256. doi: 10.11896/jsjkx.200700066

• Artificial Intelligence •

Predicting Drug Molecular Properties Based on Ensembling Neural Networks Models

XIE Liang-xu1,2, LI Feng3, XIE Jian-ping4, XU Xiao-jun1   

  1. Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, Jiangsu 213001, China
  2. Jiangsu Sino-Israel Industrial Technology Research Institute, Changzhou, Jiangsu 213100, China
  3. School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, Jiangsu 213001, China
  4. School of Science, Huzhou University, Huzhou, Zhejiang 313000, China
  • Received:2020-07-10 Revised:2020-10-20 Online:2021-09-15 Published:2021-09-10
  • About author: XIE Liang-xu, born in 1987, postgraduate, associate professor, is a member of the China Computer Federation. His main research interests include AI-aided drug design and data mining.
    XU Xiao-jun, born in 1979, professor, Jiangsu distinguished professor. His main research interests include computational biophysics and AI-aided biomolecular structure prediction.
  • Supported by:
    National Natural Science Foundation of China(12074151,22003020),Natural Science Foundation of Jiangsu Province,China(BK20191032),Changzhou Sci & Tech Program(CJ20200045) and Funding from Jiangsu Sino-Israel Industrial Technology Research Institute(JSIITRI202009)

Abstract: Artificial intelligence (AI) methods have achieved great success in predicting the chemical properties and bioactivity of drug molecules in the field of bioinformatics, and neural networks are widely applied in the drug discovery process. However, shallow neural networks (SNN) give lower accuracy, while deep neural networks (DNN) are prone to overfitting. Model ensembling is expected to further improve the predictive performance of weak learners in traditional machine learning methods. Therefore, a model ensembling strategy is applied, for the first time, to predict the properties of drug molecules. After the molecular structures are encoded, the combination strategies of averaging and stacking are adopted to increase the accuracy of pKa prediction for drug molecules. Compared with DNN, the stacking strategy presents the best predictive accuracy, with a Pearson coefficient of 0.86. Ensembling weak neural network learners can reproduce the accuracy of DNN while keeping satisfactory generalization ability. The results show that the ensembling method can increase predictive accuracy and reliability.
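
The two combination strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The weak learners are simulated as noisy, biased copies of a synthetic target rather than neural networks trained on encoded molecules. The linear least-squares meta-model, all function names, and the toy data are assumptions made for illustration only.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def average_ensemble(base_preds):
    """Averaging: the ensemble output is the mean of the base-model outputs."""
    return [statistics.fmean(col) for col in zip(*base_preds)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_stacker(base_preds, targets):
    """Stacking (assumed linear meta-model): least-squares weights plus a bias."""
    rows = [list(col) + [1.0] for col in zip(*base_preds)]
    d = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    aty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(d)]
    return solve(ata, aty)

def stack_predict(weights, base_preds):
    """Apply the fitted meta-model to new base-model outputs."""
    return [sum(w * v for w, v in zip(weights, list(col) + [1.0]))
            for col in zip(*base_preds)]

# Synthetic stand-in for pKa values (hypothetical data, not from the paper).
rng = random.Random(0)
y = [rng.uniform(2.0, 12.0) for _ in range(200)]
# Simulated weak learners: the target distorted by scale, offset, and noise.
distortions = [(0.8, 1.0, 1.0), (1.1, -0.5, 1.5), (0.9, 0.3, 2.0)]
base = [[s * t + o + rng.gauss(0.0, n) for t in y] for s, o, n in distortions]

# Fit the meta-model on one half, then evaluate both ensembles on the other.
fit, test = slice(0, 100), slice(100, 200)
weights = fit_stacker([p[fit] for p in base], y[fit])
avg_pred = average_ensemble([p[test] for p in base])
stk_pred = stack_predict(weights, [p[test] for p in base])
print("averaging Pearson r:", pearson(avg_pred, y[test]))
print("stacking  Pearson r:", pearson(stk_pred, y[test]))
```

Averaging weights every base model equally, whereas stacking learns how much to trust each one. The meta-model here is fitted on predictions for samples that are then excluded from the evaluation, so the reported correlation is not inflated by reusing the same data.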

Key words: Bioinformatics, Computer aided drug discovery, Deep learning, Machine learning, Model ensembling

CLC Number: TP183