Computer Science, 2020, Vol. 47, Issue (11A): 40-45. doi: 10.11896/jsjkx.200200004

• Artificial Intelligence •

Essential Protein Identification Method Based on Structural Holes and Fusion of Multiple Data Sources

YANG Zhuang, LIU Pei-qiang, FEI Zhao-jie, LIU Chang   

  1. School of Computer Science and Technology, Shandong Technology and Business University, Yantai, Shandong 264005, China
    Co-innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, Yantai, Shandong 264005, China
  • Online: 2020-11-15 Published: 2020-11-17
  • About author: YANG Zhuang, born in 1992, MS. His main research interests include algorithms and complexity theory, and computational biology.
    LIU Pei-qiang, born in 1970, Ph.D, professor, is a member of China Computer Federation. His main research interests include algorithms and complexity theory, and computational biology.
  • Supported by:
    This work was supported by the Shandong Provincial Natural Science Foundation (ZR2017MF049) and the Key Research and Development Program of Yantai City (2017ZH065).

Abstract: Essential protein identification is an active and difficult research topic in computational biology. The main existing computational methods for identifying essential proteins are DC, BC, LAC, PeC, ION, and LIDC, but their identification accuracy still needs improvement, mainly because only one data source, the protein interaction network, is used, and this network contains many false positives and false negatives. To improve identification accuracy, an efficient essential protein identification method, PSHC, is proposed. First, PSHC introduces structural hole theory into essential protein identification for the first time. Second, PSHC combines two data sources, the protein interaction network and protein complexes, to identify essential proteins. Experimental results on real data show that PSHC identifies more essential proteins than the other traditional methods, and that it also scores higher on statistical indicators such as sensitivity, specificity, accuracy, positive predictive value, negative predictive value, and F-measure.
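
The six statistical indicators named in the abstract can be illustrated with a minimal sketch. It assumes the usual evaluation setup for ranking-based methods: the top-n ranked proteins are treated as predicted essential and the remainder as predicted non-essential. The names `Confusion`, `confusion_from_ranking`, and `indicators`, the toy protein identifiers, and the definition of F-measure as the harmonic mean of sensitivity and positive predictive value are assumptions made for illustration; the abstract does not specify them, and this is not the authors' code.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # essential proteins predicted as essential
    fp: int  # non-essential proteins predicted as essential
    tn: int  # non-essential proteins predicted as non-essential
    fn: int  # essential proteins predicted as non-essential


def confusion_from_ranking(ranked, essential, top_n):
    """Count outcomes when the top_n ranked proteins are predicted essential.

    Assumes `ranked` lists every protein in the network exactly once.
    """
    predicted = set(ranked[:top_n])
    rest = set(ranked[top_n:])
    tp = len(predicted & essential)
    fp = len(predicted) - tp
    fn = len(rest & essential)
    tn = len(rest) - fn
    return Confusion(tp, fp, tn, fn)


def indicators(c):
    """Return the six indicators; a zero denominator yields 0.0."""
    def ratio(num, den):
        return num / den if den else 0.0

    sn = ratio(c.tp, c.tp + c.fn)    # sensitivity
    sp = ratio(c.tn, c.tn + c.fp)    # specificity
    ppv = ratio(c.tp, c.tp + c.fp)   # positive predictive value
    npv = ratio(c.tn, c.tn + c.fn)   # negative predictive value
    acc = ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f = ratio(2 * sn * ppv, sn + ppv)  # assumed F-measure definition
    return {
        "sensitivity": sn,
        "specificity": sp,
        "accuracy": acc,
        "ppv": ppv,
        "npv": npv,
        "f_measure": f,
    }


# Toy data (hypothetical): six proteins ranked by some score, three essential.
ranked = ["P1", "P2", "P3", "P4", "P5", "P6"]
essential = {"P1", "P3", "P6"}
confusion = confusion_from_ranking(ranked, essential, top_n=3)
result = indicators(confusion)
```

Comparing two methods at the same `top_n` on the same ranked protein set keeps the number of predicted essential proteins equal, so differences in the indicators reflect ranking quality rather than threshold choice.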

Key words: Essential proteins, Protein complex, Protein interaction network, Structural holes

CLC Number: TP301