Computer Science ›› 2025, Vol. 52 ›› Issue (5): 161-170. doi: 10.11896/jsjkx.240300110

• Database & Big Data & Data Science •

Cancer Pathogenic Gene Prediction Based on Differential Co-expression Adjacent Network

LI Zhijie1, LIAO Xuhong1, LI Qinglan2, LIU Li3   

  1 School of Information Science and Engineering, Hunan Institute of Science and Technology, Yueyang, Hunan 414006, China
  2 Medical College, University of Pennsylvania, Philadelphia 19019, USA
  3 Medical College, Virginia Commonwealth University, Richmond 23284, USA
  • Received: 2024-03-18 Revised: 2024-07-29 Online: 2025-05-15 Published: 2025-05-12
  • About author: LI Zhijie, born in 1964, Ph.D., associate professor. His main research interests include computational biology, online learning of big data, and data mining.
    LIAO Xuhong, born in 1997, master. Her main research interests include computational biology and data mining.
  • Supported by:
    National Natural Science Foundation of China (62072475, 61672391) and Hunan Provincial Natural Science Foundation, China (2019JJ40111).

Abstract: Cancer is a leading threat to human health. With the rapid development of sequencing technology, a massive amount of cancer gene expression data has accumulated, and using computational methods to predict pathogenic genes has become a new focus of cancer research. However, most current methods for predicting pathogenic genes rely on gene interaction networks and give little consideration to the potential link between local network connectivity and differential gene expression. To address this, this paper first uses data on the difference in gene expression before and after disease onset, computes the correlation between genes via mutual information, and constructs an adjacency network. A feature vector model is then designed for predicting cancer pathogenic genes; its features include the differential expression information of candidate genes and of their neighbors. For the experiments, cancer-related pathogenic and non-pathogenic genes, together with differential expression data of genes before and after disease onset, are obtained from public databases such as TCGA, OMIM, and GEO. The resulting method, which uses the differential expression information of genes and their neighbors in the adjacency network for cancer pathogenic gene prediction, is called DICPG. Experimental results show that the DICPG cancer gene classification model is biologically meaningful and that its classification accuracy and AUC are superior to those of similar methods.
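The pipeline summarized in the abstract (differential expression, pairwise mutual information, adjacency network, per-gene feature vector) can be illustrated with a minimal sketch. This is not the DICPG implementation: the histogram-based mutual information estimator, the `bins` and `threshold` parameters, and the three-element feature layout are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X;Y) in nats (assumed estimator, equal-width bins)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()               # joint distribution
    px = pxy.sum(axis=1, keepdims=True)     # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)     # marginal of y, shape (1, bins)
    nz = pxy > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def build_adjacency(diff_expr, threshold, bins=4):
    """Connect two genes when their mutual information reaches `threshold`.

    diff_expr: array of shape (genes, samples) holding differential
    expression values (for example, diseased minus normal expression).
    The thresholding rule is a hypothetical choice.
    """
    n = diff_expr.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if mutual_information(diff_expr[i], diff_expr[j], bins) >= threshold:
                adj[i, j] = adj[j, i] = True
    return adj

def feature_vector(gene, diff_expr, adj):
    """Hypothetical features: own mean |diff|, neighbors' mean |diff|, degree."""
    own = np.abs(diff_expr[gene]).mean()
    nbrs = np.flatnonzero(adj[gene])
    nbr = np.abs(diff_expr[nbrs]).mean() if nbrs.size else 0.0
    return np.array([own, nbr, float(nbrs.size)])

# Toy data: genes 0 and 1 vary together, gene 2 is independent of both.
diff_expr = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [0, 1, 0, 1]], dtype=float)
adj = build_adjacency(diff_expr, threshold=0.5, bins=2)
features = np.stack([feature_vector(g, diff_expr, adj) for g in range(3)])
```

In this toy example only genes 0 and 1 are linked, so gene 2 receives an empty neighborhood. Feature vectors of this kind could then be passed to any standard classifier trained on known pathogenic and non-pathogenic genes.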

Key words: Gene differential expression data, Adjacent network, Candidate gene, Gene feature vector, Cancer pathogenic gene prediction

CLC Number: TP181