Computer Science ›› 2016, Vol. 43 ›› Issue (Z11): 16-20. doi: 10.11896/j.issn.1002-137X.2016.11A.004

• Intelligent Computing •

Efficient Prediction Algorithm for Essential Proteins Based on PPI Networks

HONG Hai-yan, LIU Wei

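  1. College of Information Engineering, Yangzhou University, Yangzhou 225127, College of Information Engineering, Yangzhou University, Yangzhou 225127; Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou 225127
  • Publication date: 2018-12-01 Release date: 2018-12-01
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61379066,7,61379064,61472344), the Natural Science Foundation of Jiangsu Province (BK20130452, BK2012672, BK2012128), and the Natural Science Foundation of the Jiangsu Higher Education Institutions (12KJB520019, 3KJB520026).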

Efficient Prediction Method of Essential Proteins Based on PPI Network

HONG Hai-yan and LIU Wei   

  • Online:2018-12-01 Published:2018-12-01

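Abstract: Essential proteins are indispensable to cellular life. Identifying them helps reveal the minimal requirements of cellular life and is also of great value for drug design. With the development of high-throughput technologies, more and more protein-protein interaction (PPI) data have become available, which makes it possible to study essential proteins at the network level. A series of computational methods for identifying essential proteins has been proposed, but these methods do not fully solve the problem of false positives in PPI data. In addition, existing methods generally consider only the topology of the network and still pay too little attention to biological information. A protein's role in the life activities of human cells depends not only on the network topology but also on the biological information of the protein in the network. To address these problems, a new and efficient method for predicting essential proteins, EPP (Essential Proteins Predict), is proposed. The method predicts essential proteins by computing the importance of each protein in the PPI network: the higher a protein's importance, the more likely it is to be essential. The proteins ranked in the top P% by importance are taken as essential proteins. When protein importance is computed, semantic similarity and credibility factors are both taken into account, so that the network topology and the biological information of the proteins themselves are considered together. Experimental results show that, compared with other traditional methods, the new method has lower complexity, identifies more essential proteins, and achieves higher values on the statistical indicators.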

Keywords: essential proteins, GO, semantic similarity

Abstract: Essential proteins are indispensable for cell life, so identifying them helps us understand the minimum requirements of cellular life and supports drug design. With the development of high-throughput technologies, more and more protein-protein interaction (PPI) data have been obtained, which makes it possible to study essential proteins at the network level. A number of computational methods have already been proposed for identifying essential proteins, but these methods do not fully solve the problem of false positives in PPI data. In addition, existing methods generally consider only the topology of the network, and their use of the biological information of the proteins in the network is still relatively lacking. A protein's role in the life activities of human cells is related not only to the topology of the network but also to the biological information of the protein in the network. To solve these problems, this paper presents an efficient new method for predicting essential proteins, called EPP (Essential Proteins Predict). The algorithm predicts essential proteins by computing the importance score of each protein in the PPI network: the higher a protein's importance score, the more likely the protein is to be essential. We take the proteins ranked in the top P% by importance as essential proteins. When computing the importance score of a protein, we jointly consider semantic similarity and credibility factors. Our method has low complexity and considers not only the topology of the network but also the biological meaning of the proteins themselves. Experimental results show that, compared with other conventional methods, our method identifies more essential proteins, and its statistical indicators are also higher than those of the other methods.

Key words: Essential protein,GO,Semantic similarity
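
The ranking step described in the abstract (score every protein, then keep the top P%) can be illustrated with a minimal Python sketch. The abstract does not give the EPP scoring formula, so the rule used here is an assumption made only for illustration: each interaction is weighted by the product of its semantic similarity and its credibility, and a protein's importance is the sum of the weights of its interactions. The function names `importance_scores` and `top_percent` and the toy data are hypothetical.

```python
from math import ceil


def importance_scores(edges):
    """Sum an assumed edge weight over every interaction of each protein.

    edges: iterable of (protein_a, protein_b, semantic_similarity, credibility).
    The weight semantic_similarity * credibility is an illustrative assumption,
    not the formula from the paper.
    """
    scores = {}
    for a, b, sim, cred in edges:
        weight = sim * cred
        scores[a] = scores.get(a, 0.0) + weight
        scores[b] = scores.get(b, 0.0) + weight
    return scores


def top_percent(scores, p):
    """Return the proteins ranked in the top p percent by importance score."""
    k = ceil(len(scores) * p / 100)
    # Sort by descending score; break ties by name so the order is deterministic.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:k]


# Toy PPI network: (protein, protein, semantic similarity, credibility).
toy_edges = [
    ("P1", "P2", 0.9, 0.8),
    ("P1", "P3", 0.7, 0.9),
    ("P2", "P3", 0.2, 0.5),
    ("P3", "P4", 0.4, 0.5),
]
scores = importance_scores(toy_edges)
predicted = top_percent(scores, 25)  # candidate essential proteins
```

In this sketch a low-credibility interaction contributes little to the score, which is one plausible way to dampen the effect of false-positive interactions; the actual EPP weighting may differ.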

