Computer Science ›› 2019, Vol. 46 ›› Issue (4): 222-227. doi: 10.11896/j.issn.1002-137X.2019.04.035

Special Issue: Bioinformatics

• Artificial Intelligence •

Prediction of Protein Functions Based on Bi-weighted Vote

TANG Jia-qi (1), WU Jing-li (1,2,3), LIAO Yuan-xiu (1), WANG Jin-yan (1,2,3)

  1. School of Computer Science & Information Engineering, Guangxi Normal University, Guilin, Guangxi 541004, China
  2. Guangxi Key Laboratory of Multi-Source Information Mining & Safety, Guangxi Normal University, Guilin, Guangxi 541004, China
  3. Guangxi Regional Multi-Source Information Integration & Intelligent Processing Cooperation Innovation Center, Guilin, Guangxi 541004, China
  • Received: 2018-03-03 Online: 2019-04-15 Published: 2019-04-23

Abstract: Proteins are essential molecules that carry out important biological activities, and an accurate understanding of their functions would greatly advance life science research and its applications. High-throughput techniques have generated a tremendous number of protein sequences, so large-scale computational prediction of protein functions has become one of the key tasks in bioinformatics. Methods based on protein-protein interaction networks are currently a research hotspot in protein function prediction, but they still have shortcomings in reducing the impact of data noise, making full use of network topology characteristics, and integrating multi-source data. This paper proposes the Bi-Weighted Vote (BiWV) algorithm for predicting protein functions, which combines the global topological similarity produced by Random Walk with Resistance (RWS) with the semantic similarity between terms. In addition, the Bi-Weighted Vote with Pathway (BiWV-P) algorithm is presented, which integrates biological pathway information. Experiments on Saccharomyces cerevisiae and Homo sapiens data sets compared TMC, UBiRW, ProHG, BiWV and BiWV-P. The results indicate that BiWV and BiWV-P predict protein functions effectively and achieve higher micro-accuracy and micro-F1 than the other algorithms on many data sets.
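
The idea of a vote weighted by two similarities can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not give the scoring rule, so the product of a topological weight and the best semantic match is an assumption, as are the function name `bi_weighted_vote`, the toy similarity values, and the placeholder term labels `GO:A` and `GO:B`. In the paper the topological similarity would come from RWS; here it is simply supplied as a lookup table.

```python
from collections import defaultdict


def bi_weighted_vote(protein, topo_sim, annotations, sem_sim, candidate_terms):
    """Score candidate terms for an unannotated protein (illustrative only).

    topo_sim:        dict mapping (protein, other) -> topological similarity
    annotations:     dict mapping annotated protein -> set of its terms
    sem_sim:         function (term_a, term_b) -> semantic similarity
    candidate_terms: iterable of terms to score
    """
    scores = defaultdict(float)
    for other, terms in annotations.items():
        if other == protein:
            continue
        # First weight: how close the voter is to the query protein.
        weight = topo_sim.get((protein, other), 0.0)
        if weight == 0.0:
            continue
        for candidate in candidate_terms:
            # Second weight (assumed rule): the best semantic match between
            # the candidate term and any term the voter already carries.
            best = max((sem_sim(candidate, t) for t in terms), default=0.0)
            scores[candidate] += weight * best
    return dict(scores)


# Toy data; every value below is hypothetical.
topo_sim = {("P1", "P2"): 0.5, ("P1", "P3"): 0.25}
annotations = {"P2": {"GO:A"}, "P3": {"GO:B"}}


def sem_sim(a, b):
    # Identical terms match fully; the two placeholder terms are half similar.
    if a == b:
        return 1.0
    return 0.5 if {a, b} == {"GO:A", "GO:B"} else 0.0


scores = bi_weighted_vote("P1", topo_sim, annotations, sem_sim, ["GO:A", "GO:B"])
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

In this toy case `GO:A` ranks first because its supporter `P2` is topologically closer to `P1`, while `GO:B` still gains partial support from `P2` through semantic similarity.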

Key words: Biological pathway, Function prediction, Protein-protein interaction network, Random walk, Semantic similarity

CLC Number: 

  • TP391