Computer Science ›› 2025, Vol. 52 ›› Issue (5): 161-170. doi: 10.11896/jsjkx.240300110

• Database & Big Data & Data Science •

Cancer Pathogenic Gene Prediction Based on Differential Co-expression Adjacent Network

LI Zhijie1, LIAO Xuhong1, LI Qinglan2, LIU Li3   

    1 School of Information Science and Engineering, Hunan Institute of Science and Technology, Yueyang, Hunan 414006, China
    2 Medical College, University of Pennsylvania, Philadelphia 19019, USA
    3 Medical College, Virginia Commonwealth University, Richmond 23284, USA
  • Received: 2024-03-18 Revised: 2024-07-29 Online: 2025-05-15 Published: 2025-05-12
  • About author: LI Zhijie, born in 1964, Ph.D., associate professor. His main research interests include computational biology, online learning of big data, and data mining.
    LIAO Xuhong, born in 1997, master. Her main research interests include computational biology and data mining.
  • Supported by:
    National Natural Science Foundation of China (62072475, 61672391) and Hunan Provincial Natural Science Foundation, China (2019JJ40111).

Abstract: Cancer is a leading threat to human health. With the rapid development of sequencing technology, a massive amount of cancer gene expression data has accumulated, and using computational methods to predict pathogenic genes has become a new focus of cancer research. However, current methods for predicting pathogenic genes are mostly based on gene interaction networks and give little consideration to the potential link between local network connectivity and differential gene expression. To address this, this paper first uses data on gene expression differences before and after disease onset, calculates the correlation between genes through mutual information, and constructs an adjacency network. A feature vector model is then designed for predicting cancer pathogenic genes; its features include the differential expression information of each candidate gene and of its neighbors. For the experiments, cancer-related pathogenic and non-pathogenic genes, together with differential expression data of genes before and after illness, are obtained from public databases such as TCGA, OMIM, and GEO. The resulting method, which predicts cancer pathogenic genes from the differential expression information of genes and their neighbors in the adjacency network, is called DICPG. The experimental results show that the DICPG cancer gene classification model is biologically meaningful and that its classification accuracy and AUC are superior to those of similar methods.
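The pipeline outlined in the abstract (mutual information between genes, an adjacency network, and a per-gene feature vector built from the gene and its neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the histogram-based mutual information estimate, the `bins` and `threshold` parameters, the function names, and the three features chosen here are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def mutual_information(x, y, bins=4):
    """Histogram estimate of mutual information (in bits) between two vectors.

    Assumption: equal-width binning; the abstract does not name an estimator.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def build_adjacency(diff_expr, threshold, bins=4):
    """Connect two genes when their mutual information reaches `threshold`.

    diff_expr: genes x samples matrix of expression differences
    (hypothetical layout; `threshold` is an illustrative cutoff).
    """
    n = diff_expr.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if mutual_information(diff_expr[i], diff_expr[j], bins) >= threshold:
                adj[i, j] = adj[j, i] = True
    return adj


def gene_features(diff_expr, adj):
    """Illustrative feature vector per gene:
    [own mean |difference|, mean of neighbors' mean |difference|, neighbor count].
    """
    own = np.abs(diff_expr).mean(axis=1)
    degree = adj.sum(axis=1)
    neigh_sum = adj.astype(float) @ own
    # Genes without neighbors get a neighbor mean of 0.
    neigh_mean = np.divide(neigh_sum, degree,
                           out=np.zeros_like(own), where=degree > 0)
    return np.column_stack([own, neigh_mean, degree])


# Toy data: gene 1 is a scaled copy of gene 0; gene 2 is independent of gene 0.
x = np.arange(16.0)
z = np.tile(np.arange(4.0), 4)
toy = np.vstack([x, 2 * x, z])
toy_adj = build_adjacency(toy, threshold=1.0)
toy_features = gene_features(toy, toy_adj)
```

The resulting feature matrix could then be passed to any standard classifier together with pathogenic and non-pathogenic labels; which classifier DICPG uses is not stated in the abstract.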

Key words: Gene differential expression data, Adjacent network, Candidate gene, Gene feature vector, Cancer pathogenic gene prediction

CLC Number: 

  • TP181