Computer Science ›› 2025, Vol. 52 ›› Issue (12): 102-114. doi: 10.11896/jsjkx.250900062

• Database & Big Data & Data Science •

Protein Complex Identification Algorithm Based on Hypergraph Network Embedding

WANG Jie1, YANG Xiancan1, ZHAO Xingwang2   

  1 School of Information, Shanxi University of Finance and Economics, Taiyuan 030006, China
  2 School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  • Received: 2025-09-09 Revised: 2025-11-10 Online: 2025-12-15 Published: 2025-12-09
  • About author: WANG Jie, born in 1988, Ph.D., associate professor, is a member of CCF (No. N2805M). His main research interests include data mining and bioinformatics.
    YANG Xiancan, born in 2000, master, is a member of CCF (No. T8922G). His main research interests include data mining and machine learning.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (62006145) and the Fundamental Research Program of Shanxi Province (202303021212169).

Abstract: Protein complexes play critical roles in cell biology and are crucial for understanding cellular functions and identifying biological processes. Identifying protein complexes by network clustering in protein-protein interaction (PPI) networks has become an active research topic in data mining and bioinformatics, and a variety of computational methods have emerged. However, most existing algorithms detect dense subnetworks directly in the original network and cannot overcome the limitations of traditional graph structures in representing multi-node interactions. To address the many-to-many interactions prevalent in biological networks, this paper proposes a novel protein complex identification method based on hypergraph network embedding (PCIHNE). Exploiting the ability of hypergraphs to model multi-node interactions, the method first converts the original PPI network into a hypergraph network. A hierarchical compression strategy then recursively compresses the hypergraph into multiple smaller hypergraphs at different levels, thereby constructing a multi-scale analysis framework. Next, hypergraph convolution is performed at each level to generate node representations at different granularities, and these representations are concatenated to obtain the complete node representation. Based on the representations obtained from hypergraph learning, a weighted PPI network is constructed on the original network according to the similarity between node representations. Finally, a core-attachment strategy is used to obtain the predicted protein complexes from the weighted PPI network. PCIHNE is evaluated against other protein complex identification algorithms on multiple yeast and human datasets. Experimental results demonstrate that PCIHNE outperforms the compared methods in terms of the F-measure and Accuracy metrics.
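The flow from hypergraph construction to edge weighting can be illustrated with a small sketch. It is not the PCIHNE implementation: the abstract does not say how hyperedges are built or which convolution and similarity are used, so the sketch assumes one hyperedge per protein (the protein plus its neighbours), a single parameter-free HGNN-style smoothing step, one-hot input features, and cosine similarity. The hierarchical compression and core-attachment steps are omitted, and all function names are hypothetical.

```python
import numpy as np

def neighborhood_hypergraph(n_nodes, edges):
    """Incidence matrix H (nodes x hyperedges).

    Assumption: hyperedge j contains node j and all of its neighbours.
    """
    H = np.eye(n_nodes)
    for u, v in edges:
        H[u, v] = 1.0  # u belongs to the hyperedge centred on v
        H[v, u] = 1.0  # v belongs to the hyperedge centred on u
    return H

def hypergraph_conv(H, X):
    """One smoothing step: Dv^-1/2 H De^-1 H^T Dv^-1/2 X (no learned weights)."""
    dv = H.sum(axis=1)  # node degrees (at least 1 because of the identity)
    de = H.sum(axis=0)  # hyperedge sizes (at least 1 for the same reason)
    dv_inv_sqrt = 1.0 / np.sqrt(dv)
    left = (dv_inv_sqrt[:, None] * H) / de[None, :]  # Dv^-1/2 H De^-1
    right = H.T * dv_inv_sqrt[None, :]               # H^T Dv^-1/2
    return left @ right @ X

def weight_edges(edges, Z):
    """Weight each original PPI edge by the cosine similarity of its endpoints."""
    norms = np.linalg.norm(Z, axis=1)
    weights = {}
    for u, v in edges:
        denom = norms[u] * norms[v]
        weights[(u, v)] = float(Z[u] @ Z[v] / denom) if denom > 0 else 0.0
    return weights

# Toy PPI network: two triangles (0,1,2) and (3,4,5) joined by the bridge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
H = neighborhood_hypergraph(6, edges)
Z = hypergraph_conv(H, np.eye(6))  # one-hot input features (assumption)
weights = weight_edges(edges, Z)
for edge, w in weights.items():
    print(edge, round(w, 3))
```

On this toy network the edge inside a triangle, such as (0, 1), receives a higher weight than the bridge (2, 3), which is the kind of signal a subsequent clustering step on the weighted network could exploit.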

Key words: Protein-protein interaction network, Protein complexes, Hypergraphs, Network embedding, Network clustering

CLC Number: TP399