Computer Science (计算机科学), 2019, Vol. 46, Issue 4: 222-227. doi: 10.11896/j.issn.1002-137X.2019.04.035
Special topic: Bioinformatics
TANG Jia-qi (唐家琪)1, WU Jing-li (吴璟莉)1,2,3, LIAO Yuan-xiu (廖元秀)1, WANG Jin-yan (王金艳)1,2,3
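Abstract: Proteins are molecules essential to important biological activities. Accurate knowledge of protein function greatly advances life science research and its applications. High-throughput technologies have produced massive numbers of protein sequences, and predicting protein function at large scale by computational means has become one of the core tasks of bioinformatics. Prediction methods based on protein-protein interaction networks are currently a research hotspot in protein function prediction, yet they still fall short in three respects: reducing the influence of data noise, fully exploiting network topology, and integrating multi-source data. This paper combines the global topological similarity obtained from a random walk with resistance with the semantic similarity of functional terms to design BiWV, a bi-weighted voting algorithm for protein function prediction. On this basis, biological pathway information is integrated to give BiWV-P, a bi-weighted voting algorithm with pathways. On Saccharomyces cerevisiae and human datasets, the proposed algorithms are compared with three existing algorithms: TMC, UBiRW, and ProHG. The experimental results show that BiWV and BiWV-P predict protein function effectively and, on many of the datasets, achieve higher micro-accuracy and micro-F1 than the other algorithms.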
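The abstract does not give the BiWV formulas, so the following is only a minimal sketch of the general idea of double-weighted neighbor voting, followed by the standard micro-F1 measure. Everything in it is an assumption made for illustration: the function names `weighted_vote` and `micro_f1`, the precomputed similarity inputs, and the rule of multiplying a neighbor's topological weight by its best semantic match are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple


def weighted_vote(
    topo_sim: Dict[str, float],
    annotations: Dict[str, Set[str]],
    term_sim: Dict[Tuple[str, str], float],
    candidates: Iterable[str],
) -> Dict[str, float]:
    """Score candidate function terms for one unannotated protein.

    topo_sim:    neighbor protein -> topological similarity to the target
                 (assumed precomputed, e.g. by some random-walk procedure)
    annotations: neighbor protein -> set of known function terms
    term_sim:    (term_a, term_b) -> semantic similarity in [0, 1]
    NOTE: the combination rule is an illustrative assumption, not BiWV itself.
    """
    def sim(a: str, b: str) -> float:
        if a == b:
            return 1.0
        # Look the pair up in either order; unknown pairs count as 0.
        return term_sim.get((a, b), term_sim.get((b, a), 0.0))

    scores: Dict[str, float] = {}
    for term in candidates:
        total = 0.0
        for neighbor, weight in topo_sim.items():
            neighbor_terms = annotations.get(neighbor, set())
            # Second weight: best semantic match among the neighbor's terms.
            best = max((sim(term, t) for t in neighbor_terms), default=0.0)
            total += weight * best
        scores[term] = total
    return scores


def micro_f1(true_sets: List[Set[str]], pred_sets: List[Set[str]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all proteins, then compute F1."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)
        fp += len(pred - truth)
        fn += len(truth - pred)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data with made-up protein and term identifiers.
    scores = weighted_vote(
        topo_sim={"P1": 0.8, "P2": 0.5},
        annotations={"P1": {"GO:A"}, "P2": {"GO:B"}},
        term_sim={("GO:A", "GO:B"): 0.5},
        candidates=["GO:A", "GO:B"],
    )
    print(scores)
    print(micro_f1([{"a", "b"}, {"c"}], [{"a"}, {"c", "d"}]))
```

In this toy setup a term gains support both from neighbors that carry it directly and from neighbors carrying semantically close terms, which is the intuition behind weighting a vote twice. Micro-averaging pools the counts over all proteins before computing F1, so proteins with many annotations contribute more to the score.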