Computer Science ›› 2020, Vol. 47 ›› Issue (1): 59-65.doi: 10.11896/jsjkx.181202395

• Computer Science Theory • Previous Articles     Next Articles

Contact Map-based Residue-pair Distances Restrained Protein Structure Prediction Algorithm

XIE Teng-yu1,ZHOU Xiao-gen2,HU Jun1,ZHANG Gui-jun1   

  1. (College of Information Engineering,Zhejiang University of Technology,Hangzhou 310023,China)1;
    (Department of Computational Medicine and Bioinformatics,University of Michigan,Ann Arbor,MI 45108,USA)2
  • Received:2018-12-24 Published:2020-01-19
  • About author:XIE Teng-yu,born in 1993,postgra-duate.Her main research interests include intelligent information processing,optimization theory and algorithm design and bioinformatics;ZHANG Gui-jun,born in 1974,Ph.D,professor,Ph.D supervisor,is a member of China Computer Federation (CCF).His main research interests include intelligent information processing,optimization theory and algorithm design and bioinformatics.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61773346) and Zhejiang Provincial Science Foundation of China (LZ20F030002).

Abstract: De novo prediction is an important method for protein structure modeling.Research of the method contributes to humanity’s understanding of protein functions to conduct drug design and disease treatments.In order to improve the accuracy of prediction,contact map-based residue-pair distances restrained protein structure prediction algorithm (CDPSP) was proposed.Based on the framework of evolutionary algorithm,CDPSP was used to sample conformational space,which was divided into exploration and exploitation stages.In the exploration stage,the strategies of mutation and selection were designed on the basis of the distances of residue-pair,which can increase the diversity of the population.In detail,a residue-pair was chosen according to the contact probability of contact map and the mutation was conducted through fragment assembly technique on the adjacent region of the residue-pair.The selection of mutated conformation was determined by the expected probability distributed through the discretization of residue-pair distances.In the exploitation stage,the contact-based score and energy function were used to evaluate the quality of conformations in search of good conformations,which can enhance the sampling ability of CDPSP in near-native region.In order to verify the performance of the proposed algorithm,CDPSP is tested on 10 targets in the FM group of CASP12 and compared with advanced algorithms.The test results show that CDPSP can predict more accurate protein tertiary structure models.

Key words: Contact map, De novo prediction, Distances of residue-pair, Evolutionary algorithm, Fragment assembly, Protein structure prediction

CLC Number: 

  • TP301.6
[1]KOLATA G.Trying to crack the second half of the genetic code [J].Science,1986,233:1037-1040.
[2]WANG C,ZHU J W,ZHANG H C,et al.A Survey on Algorithms for Protein Tertiary Structure Prediction[J].Chinese Journal of Computers,2018,41(4):760-779.
[3]DENG H Y,JIA Y,ZHANG Y.Protein structure prediction
[J].Acta Physica Sinica,2016,65(17):169-179.
[4]MA B G.Protein Folding Prediction[J].Chinese Science Bulletin,2016,61(24):2670-2680.
[5]DILL K A,MACCALLUM J L.The protein-folding problem,50 years on [J].Science,2012,338(6110):1042-1046.
[6]ZHANG Y.Protein structure prediction:when is it useful? [J].Current Opinion in Structural Biology,2009,19(2):145-155.
[7]MOULT J,FIDELIS K,KRYSHTAFOVYCH A,et al.Critical assessment of methods of protein structure prediction (CASP)-Round XII [J].Proteins:Structure,Function,and Bioinforma-tics,2018,86 (Suppl 1):7-15.
[8]MOULT J,FIDELIS K,KRYSHTAFOVYCH A,et al.Critical assessment of methods of protein structure prediction:Progress and new directions in round XI [J].Proteins:Structure,Function,and Bioinformatics,2016,84(Suppl 1):4-14.
[9]MOULT J,FIDELIS K,KRYSHTAFOVYCH A,et al.Critical assessment of methods of protein structure prediction (CASP)--round x [J].Proteins:Structure,Function,and Bioinformatics,2014,82( Suppl 2):1-6.
[10]KEASAR C,MCGUFFIN L J,WALLNER B,et al.An analysis and evaluation of the WeFold collaborative for protein structure prediction and its pipelines in CASP11 and CASP12 [J].Scientific Reports,2018,8(1):9939.
[11]ANFINSEN C B,HABER E,SELA M,et al.The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain [J].Proceedings of the National Academy of Sciences,1961,47(9):1309-1314.
[12]LEE J,FREDDOLINO P L,ZHANG Y.Ab Initio Protein Structure Prediction[M]∥From Protein Structure to Function with Bioinformatics.Netherlands,Dordrecht:Springer,2017:3-35.
[13]BRADLEY P,MISURA K M,BAKER D.Toward high-resolution de novo structure prediction for small proteins [J].Science,2005,309(5742):1868-1871.
[14]KIM D E,BLUM B,BRADLEY P,et al.Sampling bottlenecks in de novo protein structure prediction [J].Journal of Molecular Biology,2009,393(1):249-260.
[15]LI Z,SCHERAGA H A.Monte Carlo-minimization approach to the multiple-minima problem in protein folding [J].Proceedings of the National Academy of Sciences,1987,84(19):6611-6615.
[16]KIHARA D,LU H,KOLINSKI A,et al.TOUCHSTONE:an ab initio protein structure prediction method that uses threading-based tertiary restraints [J].Proceedings of the National Academy of Sciences,2001,98(18):10125-10130.
[17]LEE J.New Monte Carlo algorithm:Entropic sampling [J]. Physical Review Letters,1993,71(2):211-214.
[18]PIANA S,LINDORFF-LARSEN K,SHAW D E.Atomic-level description of ubiquitin folding [J].Proceedings of the National Academy of Sciences,2013,110(15):5915-5920.
[19]LINDORFF-LARSEN K,MARAGAKIS P,PIANA S,et al.Picosecond to Millisecond Structural Dynamics in Human Ubiqui-tin[J].Journal of Physical Chemistry B,2016,120(33):8313-8320.
[20]PEARLMAN D A,CASE D A,CALDWELL J W,et al.Amber,a Package of Computer-Programs for Applying Molecular Mechanics,Normal-Mode Analysis,Molecular-Dynamics and Free-Energy Calculations to Simulate the Structural and Energetic Properties of Molecules [J].Computer Physics Communications,1995,91(1/2/3):1-41.
[21]CLAUSEN R,SHEHU A.A multiscale hybrid evolutionary algorithm to obtain sample-based representations of multi-basin protein energy landscapes[C]∥Proceedings of the 5th ACM Conference on Bioinformatics,Computational Biology,and Health Informatics.ACM,2014:269-278.
[22]GARZA-FABRE M,KANDATHIL S M,HANDL J,et al.Generating,Maintaining,and Exploiting Diversity in a Memetic Algorithm for Protein Structure Prediction [J].Evolutionary Computation,2016,24(4):577-607.
[23]HAO X H,ZHANG G J,ZHOU X G.Conformational Space Sampling Method Using Multi-Subpopulation Differential Evolution for De novo Protein Structure Prediction [J].IEEE Transactions on Nanobioscience,2017,16(7):618-633.
[24]HAO X H,ZHANG G J,ZHOU X G,et al.A Novel Method Using Abstract Convex Underestimation in Ab-Initio Protein Structure Prediction for Guiding Search in Conformational Feature Space [J].IEEE/ACM Transactions on Computational Biology and Bioinformatics,2016,13(5):887-900.
[25]HAO X H,ZHANG G J,ZHOU X G.Guiding exploration in conformational feature space with Lipschitz underestimation for ab-initio protein structure prediction [J].Computational Biology and Chemistry,2018,73:105-119.
[26]ZHOU X G,ZHANG G J.Differential Evolution With Underestimation-Based Multimutation Strategy [J].IEEE Transactions on Cybernetics,2018,PP(99):1-12.
[27]ZHANG G J,ZHOU X G,YU X F,et al.Enhancing Protein Conformational Space Sampling Using Distance Profile-Guided Differential Evolution [J].IEEE/ACM Transactions on Computational Biology and Bioinformatics,2017,14(6):1288-1301.
[28]ZHOU X G,ZHANG G J,HAO X H,et al.Enhanced differen- tial evolution using local Lipschitz underestimate strategy for computationally expensive optimization problems [J].Applied Soft Computing,2016,48:169-181.
[29]ZHOU X G,ZHANG G J.Abstract Convex Underestimation Assisted Multistage Differential Evolution [J].IEEE Transactions on Cybernetics,2017,47(9):2730-2741.
[30]ZHOU X G,ZHANG G J,HAO X H,et al.A novel differential evolution algorithm using local abstract convex underestimate strategy for global optimization [J].Computers & Operations Research,2016,75(11):132-149.
[31]RAKHSHANI H,IDOUMGHAR L,LEPAGNOT J,et al. Speed up differential evolution for computationally expensive protein structure prediction problems[J/OL].
[32]LEE J,SCHERAGA H A,RACKOVSKY S.New optimization method for conformational energy calculations on polypeptides:Conformational space annealing [J].Journal of Computational Chemistry,1997,18(9):1222-1232.
[33]ZHANG Y.Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10 [J].Proteins:Structure,Function,and Bioinformatics,2014,82:175-187.
[34]LI Z W,HAO X H,ZHANG G J.Replica Exchange Based Local Enhanced Differential Evolution Searching Method in Ab-initio Protein Structure Prediction[J].Computer Science,2017,44(5):211-217.
[35]SHEHU A,OLSON B.Guiding the Search for Native-like Protein Conformations with an Ab-initio Tree-based Exploration [J].International Journal of Robotics Research,2010,29(8):1106-1127.
[36]ROY A,KUCUKURAL A,ZHANG Y.I-TASSER:a unified platform for automated protein structure and function prediction [J].Nature Protocols,2010,5(4):725-738.
[37]ROHL C A,STRAUSS C E M,MISURA K M S,et al.Protein structure prediction using rosetta [J].Methods in Enzymology,2004,383:66-93.
[38]HAO X H,ZHANG G J,ZHOU X G,et al.Protein Conformational Space Optimization Algorithm Based on Fragment-assembly[J].Computer Science,2015,42(3):237-240.
[39]XU D,ZHANG Y.Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field [J].Proteins:Structure,Function,and Bioinforma-tics,2012,80(7):1715-1735.
[40]KC D B.Recent advances in sequence-based protein structure prediction [J].Briefings in bioinformatics,2016,18(6):1021-1032.
[41]ZHANG C,MORTUZA S M,HE B,et al.Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12 [J].Proteins:Structure,Function,and Bioinformatics,2018,86( Suppl 1):136-151.
[42]ZHANG W X,YANG J Y,HE B J,et al.Integration of QUARK and I-TASSER for Ab Initio Protein Structure Prediction in CASP11 [J].Proteins:Structure,Function,and Bioinformatics,2016,84:76-86.
[43]OVCHINNIKOV S,PARK H,KIM D E,et al.Protein structure prediction using Rosetta in CASP12 [J].Proteins:Structure,Function,and Bioinformatics,2018,86( Suppl 1):113-121.
[44]MOULT J,FIDELIS K,KRYSHTAFOVYCH A,et al.Critical assessment of methods of protein structure prediction (CASP)Round XII [J].Proteins:Structure,Function,and Bioinformatics 2018,86:7-15.
[45]ADHIKARI B,BHATTACHARYA D,CAO R,et al.CON- FOLD:Residue-residue contact-guided ab initio protein folding [J].Proteins:Structure,Function,and Bioinformatics,2015,83(8):1436-1449.
[46]JONES D T.Predicting novel protein folds by using FRAGFOLD [J].Proteins:Structure,Function,and Bioinformatics,2001( Suppl 5):127-132.
[47]DE OLIVEIRA S H P,DEANE C M.Combining co-evolution and secondary structure prediction to improve fragment library generation [J].Bioinformatics,2018,34(13):2219-2227.
[48]ZHANG G J,MA L F,WANG X Q,et al.Secondary Structure and Contact Guided Differential Evolution for Protein Structure Prediction [J].IEEE/ACM Transactions on Computational Biology and Bioinformatics,2018,PP(99):1-1.
[49]DeepMind.AlphaFold:Using AI for scientific discovery[EB/ OL].(2018-12-02).
[50]JI S,ORUC T,MEAD L,et al.DeepCDpred:Inter-residue Distance and Contact Prediction for Improved Prediction of Protein Structure [J].PLOS ONE,2019,14(1):e0205214.
[51]WANG S,LI W,ZHANG R,et al.CoinFold:a web server for protein contact prediction and contact-assisted protein folding [J].Nucleic Acids Research,2016,44(W1):W361-W366.
[52]WANG S,SUN S Q,LI Z,et al.Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model [J].Plos Computational Biology,2017,13(1):e1005324.
[53]MA J Z,WANG S,WANG Z Y,et al.Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning [J].Bioinformatics,2015,31(21):3506-3513.
[54]ABRIATA L A,TAMO G E,MONASTYRSKYY B,et al.Assessment of hard target modeling in CASP12 reveals an emerging role of alignment-based contact prediction methods [J].Proteins:Structure,Function,and Bioinformatics,2018,86:97-112.
[55]ZHANG Y,SKOLNICK J.SPICKER:A clustering approach to identify near-native protein folds [J].Journal of Computational Chemistry,2004,25(6):865-871.
[56]CHIVIAN D,KIM D E,MALMSTROM L,et al.Automated prediction of CASP-5 structures using the Robetta server [J].Proteins:Structure,Function,and Bioinformatics,2003,53:524-533.
[57]ZHANG Y,SKOLNICK J.Scoring function for automated assessment of protein structure template quality [J].Proteins:Structure,Function,and Bioinformatics,2004,57(4):702-710.
[58]XU J,ZHANG Y.How significant is a protein structure similari- ty with TM-score= 0.5? [J].Bioinformatics,2010,26(7):889-895.
[1] SUN Gang, WU Jiang-jiang, CHEN Hao, LI Jun, XU Shi-yuan. Hidden Preference-based Multi-objective Evolutionary Algorithm Based on Chebyshev Distance [J]. Computer Science, 2022, 49(6): 297-304.
[2] LI Li, LI Guang-peng, CHANG Liang, GU Tian-long. Survey of Constrained Evolutionary Algorithms and Their Applications [J]. Computer Science, 2021, 48(4): 1-13.
[3] ZHOU Sheng-yi, ZENG Hong-wei. Program Complexity Analysis Method Combining Evolutionary Algorithm with Symbolic Execution [J]. Computer Science, 2021, 48(12): 107-116.
[4] ZHAO Yang, NI Zhi-wei, ZHU Xu-hui, LIU Hao, RAN Jia-min. Multi-worker and Multi-task Path Planning Based on Improved Lion Evolutionary Algorithm forSpatial Crowdsourcing Platform [J]. Computer Science, 2021, 48(11A): 30-38.
[5] ZHU Han-qing, MA Wu-bin, ZHOU Hao-hao, WU Ya-hui, HUANG Hong-bin. Microservices User Requests Allocation Strategy Based on Improved Multi-objective Evolutionary Algorithms [J]. Computer Science, 2021, 48(10): 343-350.
[6] ZHANG Qing-qi, LIU Man-dan. Multi-objective Five-elements Cycle Optimization Algorithm for Complex Network Community Discovery [J]. Computer Science, 2020, 47(8): 284-290.
[7] LI Zhang-wei, XIAO Lu-qian, HAO Xiao-hu, ZHOU Xiao-gen, ZHANG Gui-jun. Multimodal Optimization Algorithm for Protein Conformation Space [J]. Computer Science, 2020, 47(7): 161-165.
[8] CHEN Meng-hui, CAO Qian-feng and LAN Yan-qi. Heuristic Algorithm Based on Block Mining and Recombination for Permutation Flow-shop Scheduling Problem [J]. Computer Science, 2020, 47(6A): 108-113.
[9] YANG Hao, CHEN HONG-mei. Mixed-sampling Method for Imbalanced Data Based on Quantum Evolutionary Algorithm [J]. Computer Science, 2020, 47(11): 88-94.
[10] LI Zhang-wei, HAO Xiao-hu, ZHANG Gui-jun. Multi-layer Screening Based Evolution Algorithm for De Novo Protein Structure Prediction [J]. Computer Science, 2019, 46(6A): 80-84.
[11] GENG Huan-tong, HAN Wei-min, ZHOU Shan-sheng, DING Yang-yang. MOEA/D Algorithm Based on New Neighborhood Updating Strategy [J]. Computer Science, 2019, 46(5): 191-197.
[12] JIN Ting, TAN Wen-an, SUN Yong, ZHAO Yao. Social Team Formation Method Based on Fuzzy Multi-objective Evolution [J]. Computer Science, 2019, 46(2): 315-320.
[13] LIU Xin-ping, GU Chun-hua, LUO Fei, DING Wei-chao. Improved NSGA-II Algorithm Based on Loser Group and Hybrid Coding Strategy [J]. Computer Science, 2019, 46(10): 222-228.
[14] LAI Wen-xing, DENG Zhong-min. Improved NSGA2 Algorithm Based on Dominant Strength [J]. Computer Science, 2018, 45(6): 187-192.
[15] CHEN Jin-yin, XIONG Hui, ZHENG Hai-bin. Parameters Optimization for SVM Based on Particle Swarm Algorithm [J]. Computer Science, 2018, 45(6): 197-203.
Full text



No Suggested Reading articles found!