Computer Science ›› 2020, Vol. 47 ›› Issue (11A): 40-45. doi: 10.11896/jsjkx.200200004
杨壮, 刘培强, 费兆杰, 刘畅
YANG Zhuang, LIU Pei-qiang, FEI Zhao-jie, LIU Chang
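Abstract: Identifying essential proteins is a hot and difficult research topic in computational biology. Computational methods for identifying essential proteins mainly include DC, BC, LAC, PeC, ION and LIDC. The identification accuracy of existing methods still needs to be improved, mainly because they use the protein-protein interaction (PPI) network as a single data source and because PPI networks contain many false-positive and false-negative interactions. To improve identification accuracy, an efficient identification method, PSHC, is proposed. First, PSHC introduces structural hole theory into essential protein identification for the first time. Second, it fuses two data sources, the PPI network and protein complexes, to identify essential proteins. Experimental results on real data show that, compared with other traditional methods, PSHC identifies more essential proteins, and that its statistical measures (sensitivity, specificity, accuracy, positive predictive value, negative predictive value and F-measure) are also clearly higher than those of the other methods.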
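The six statistical measures named in the abstract can all be derived from a confusion matrix of predicted versus known essential proteins. The sketch below uses the standard textbook definitions; the abstract does not spell out its formulas, so treating the F-measure as the harmonic mean of sensitivity and positive predictive value is an assumption, and the names `Confusion` and `evaluation_metrics` are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of predicted vs. known essential proteins (hypothetical helper)."""
    tp: int  # essential proteins predicted as essential
    fp: int  # non-essential proteins predicted as essential
    tn: int  # non-essential proteins predicted as non-essential
    fn: int  # essential proteins predicted as non-essential


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluation_metrics(c: Confusion) -> dict:
    """Compute the six measures using their standard definitions (assumed)."""
    sn = _ratio(c.tp, c.tp + c.fn)    # sensitivity
    sp = _ratio(c.tn, c.tn + c.fp)    # specificity
    acc = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)  # accuracy
    ppv = _ratio(c.tp, c.tp + c.fp)   # positive predictive value
    npv = _ratio(c.tn, c.tn + c.fn)   # negative predictive value
    f = _ratio(2 * sn * ppv, sn + ppv)  # F-measure (harmonic mean of SN and PPV)
    return {"SN": sn, "SP": sp, "ACC": acc, "PPV": ppv, "NPV": npv, "F": f}


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    for name, value in evaluation_metrics(Confusion(tp=30, fp=20, tn=40, fn=10)).items():
        print(f"{name}: {value:.4f}")
```

In a typical evaluation of this kind, the top-ranked proteins are treated as predicted essential and the rest as predicted non-essential, and the counts are then taken against a reference list of known essential proteins; that thresholding step is likewise an assumption here, not something the abstract specifies.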