Computer Science (计算机科学), 2019, Vol. 46, Issue 12: 313-321. doi: 10.11896/jsjkx.181102215
CHEN Zheng (陈征), TIAN Bo (田博), HE Zeng-you (何增有)
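Abstract: With the development of proteomics, researchers have begun to focus on building the complete human protein-protein interaction (PPI) network, and mass spectrometry has become a representative technique for predicting protein interactions. As one of the main experimental means of constructing PPI networks, mass spectrometry has produced large amounts of protein purification data, such as AP-MS data and PCP-MS data. These data provide important support for constructing PPI networks, but building such networks manually is both inefficient and impractical. Network inference algorithms for PCP-MS data are therefore an active research topic in bioinformatics. This paper studies PPI network construction algorithms for one mainstream type of mass spectrometry data (PCP-MS), starting from the current bottlenecks, with the aim of building high-quality PPI networks. Research on PPI network inference from PCP-MS data is still at an early stage, and few methods are available. The quality of the results also remains problematic: 1) the outputs of different inference algorithms contain many false interactions while missing some true ones; 2) different inference algorithms perform very differently on the same dataset; 3) the performance of a single algorithm varies widely across datasets. To infer a structurally reliable, high-quality PPI network from PCP-MS data, this paper proposes a PPI scoring method based on correlation analysis and rank aggregation. The method is unsupervised and consists of two steps: 1) compute correlation coefficients between proteins, yielding multiple sets of correlation results; 2) integrate these sets by rank aggregation to obtain an integrated PPI score. Experimental results show that, without using a reference standard, the proposed method achieves results close to those of supervised learning methods.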
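The two steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation coefficients or which rank aggregation procedure the authors use, so everything below is an assumption made for illustration: Pearson and Spearman stand in for the "multiple sets of correlation results", a plain mean-rank aggregation stands in for the integration step, and the names `rank_desc` and `score_pairs` as well as the input layout (one row per protein, one column per fraction) are hypothetical. Rows are assumed to be non-constant so that the correlations are defined.

```python
import numpy as np


def rank_desc(scores):
    """Rank values so the largest gets rank 1; tied values share their average rank."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    for value in np.unique(scores):
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    return ranks


def score_pairs(profiles):
    """Score every protein pair from per-protein profiles (hypothetical layout).

    profiles: array-like of shape (n_proteins, n_fractions).
    Returns (pairs, integrated), where integrated lies in [0, 1] and a
    higher value means the pair ranks higher across the correlation measures.
    """
    profiles = np.asarray(profiles, dtype=float)
    i, j = np.triu_indices(profiles.shape[0], k=1)

    # Step 1: several correlation results per pair (assumed: Pearson and Spearman).
    pearson = np.corrcoef(profiles)[i, j]
    rank_profiles = np.vstack([rank_desc(row) for row in profiles])
    spearman = np.corrcoef(rank_profiles)[i, j]  # Spearman = Pearson on ranks

    # Step 2: rank each result list and average the ranks (assumed stand-in
    # for the paper's rank aggregation), then rescale so the best pair gets 1.
    mean_rank = (rank_desc(pearson) + rank_desc(spearman)) / 2.0
    integrated = 1.0 - (mean_rank - 1.0) / max(len(mean_rank) - 1, 1)

    pairs = list(zip(i.tolist(), j.tolist()))
    return pairs, integrated


if __name__ == "__main__":
    # Toy profiles: proteins 0 and 1 peak in the same fraction, protein 2 does not.
    toy = [[0, 1, 5, 1, 0], [0, 2, 9, 2, 0], [5, 1, 0, 0, 3]]
    for pair, score in zip(*score_pairs(toy)):
        print(pair, round(float(score), 3))
```

Averaging ranks rather than raw coefficients keeps measures with different scales comparable, which is the usual motivation for rank-based integration. A more robust aggregation scheme could be swapped into step 2 without changing step 1.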