Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241000145-9. doi: 10.11896/jsjkx.241000145

• Artificial Intelligence •

PPIS-MFH: Predicting Protein-Protein Interaction Sites Based on a Multi-feature Hybrid Network Integrating ViT

HU Zhaolong, HU Chunling, HU Ruijie, GUO Longju

  1. School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Corresponding author: HU Chunling (huchunling@hfuu.edu.cn)
  • About author: sebant@163.com
  • Supported by:
    National Natural Science Foundation of China (General Program): Research on Local Graph Representation Learning for Dynamic Knowledge Graphs (62306100).

Abstract: In-depth study of protein-protein interaction sites (PPIS) can reveal how life operates at the molecular level. However, existing methods for identifying PPIS are complex and time-consuming, so more accurate PPIS prediction models are needed. Although deep learning methods based on attention mechanisms and convolutional neural networks (CNNs) have made progress in PPIS prediction, they remain limited in how they represent amino acid properties. To capture long-range dependencies in protein sequences and characterize amino acid properties accurately, this paper proposes PPIS-MFH, a multi-feature hybrid network (MFH) that predicts PPIS by combining global and local sequence features. For local sequence features, PPIS-MFH incorporates a Vision Transformer (ViT) module, which captures long-range dependencies in the protein sequence and extracts local features. For global sequence features, PPIS-MFH uses a feature cross network composed of a text convolutional neural network (TextCNN) and an attention-augmented text recurrent neural network (TextRNN-Attention), and uses a bidirectional gated recurrent unit network to identify the intrinsic relationships among amino acids in the protein sequence. PPIS-MFH was evaluated on four datasets and compared with eight similar methods. The experimental results show that the proposed method outperforms the other methods on most metrics.

Key words: Protein-protein interaction site, Attention mechanism, Text convolutional neural network, Bidirectional gated recurrent unit network, Feature cross network
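
The abstract names an attention mechanism applied on top of a recurrent encoder (TextRNN-Attention) but gives no equations. As a rough illustration only, attention pooling over per-residue hidden states might be sketched as follows. The function name `attention_pool`, the dot-product scoring, and the toy vectors are assumptions made for this example and are not taken from the paper; in a trained model the hidden states would come from the bidirectional GRU and the query vector would be learned.

```python
import math


def attention_pool(hidden_states, query):
    """Collapse per-residue vectors into one vector by softmax-weighted averaging.

    hidden_states: list of equal-length float lists, one per sequence position
                   (hypothetical stand-in for recurrent encoder outputs).
    query:         float list of the same length (hypothetical learned vector).
    Returns (pooled_vector, attention_weights).
    """
    # One relevance score per position: dot product with the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]

    # Numerically stable softmax over positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the hidden states, dimension by dimension.
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights


# Toy input: three positions with two-dimensional hidden states.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A zero query scores every position equally, so pooling reduces to the mean.
uniform_pooled, uniform_weights = attention_pool(states, [0.0, 0.0])

# A query aligned with the first dimension favours positions 0 and 2.
focused_pooled, focused_weights = attention_pool(states, [10.0, 0.0])
```

The weights always sum to one, so the pooled vector stays on the same scale as the individual hidden states regardless of sequence length.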

CLC Number:

  • TP391