Computer Science, 2025, Vol. 52, Issue (5): 161-170. doi: 10.11896/jsjkx.240300110

• Database & Big Data & Data Science •

Cancer Pathogenic Gene Prediction Based on Differential Co-expression Adjacent Network

LI Zhijie1, LIAO Xuhong1, LI Qinglan2, LIU Li3   

  1 School of Information Science and Engineering, Hunan Institute of Science and Technology, Yueyang, Hunan 414006, China
  2 Medical College, University of Pennsylvania, Philadelphia 19019, USA
  3 Medical College, Virginia Commonwealth University, Richmond 23284, USA
  • Received: 2024-03-18 Revised: 2024-07-29 Online: 2025-05-15 Published: 2025-05-12
  • About author: LI Zhijie, born in 1964, Ph.D., associate professor. His main research interests include computational biology, online learning of big data, and data mining.
    LIAO Xuhong, born in 1997, master. Her main research interests include computational biology and data mining.
  • Supported by:
    National Natural Science Foundation of China (62072475, 61672391) and Hunan Provincial Natural Science Foundation, China (2019JJ40111).

Abstract: Cancer is a leading threat to human health. With the rapid development of sequencing technology, a massive amount of cancer gene expression data has accumulated, and using computational methods to predict pathogenic genes has become a new hotspot in cancer research. However, current pathogenic gene prediction is mostly based on gene interaction networks and gives little consideration to the potential relationship between local network connectivity and differential gene expression. To address this issue, this paper first uses gene expression difference data from before and after disease onset, calculates the correlation between genes through mutual information, and constructs an adjacency network. A feature vector model is then designed for predicting cancer pathogenic genes; its features include the differential expression information of candidate genes and their neighbors. For the experiments, cancer-related pathogenic and non-pathogenic genes, together with differential expression data of genes before and after disease onset, are obtained from public databases such as TCGA, OMIM, and GEO. The resulting method, which uses the differential expression information of genes and their neighbors in the adjacency network for cancer pathogenic gene prediction, is called DICPG. The experimental results show that the DICPG cancer gene classification model is biologically meaningful, and that its classification accuracy and AUC are superior to those of similar methods.
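
The pipeline outlined in the abstract (mutual information between genes, an adjacency network, then per-gene features drawn from a gene and its neighbors) can be illustrated with a minimal sketch. This is not the DICPG implementation. The abstract does not give the estimator, the edge threshold, or the exact feature definition, so the histogram-based estimator, the `threshold` and `bins` parameters, the three features chosen, and all function names below are assumptions made for illustration only.

```python
import numpy as np

def mutual_information(x, y, bins=4):
    """Histogram-based estimate of mutual information (in nats) of two 1-D arrays.
    The equal-width binning is an assumption, not taken from the paper."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint distribution over bin pairs
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def build_adjacency(diff_expr, threshold, bins=4):
    """diff_expr: genes x samples matrix of expression differences.
    Two genes are linked when their estimated MI reaches the (assumed) threshold."""
    n = diff_expr.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if mutual_information(diff_expr[i], diff_expr[j], bins) >= threshold:
                adj[i, j] = adj[j, i] = True
    return adj

def gene_features(diff_expr, adj):
    """Hypothetical feature vector per gene:
    [own mean |difference|, mean of neighbors' mean |difference|, neighbor count]."""
    own = np.abs(diff_expr).mean(axis=1)
    degree = adj.sum(axis=1)
    neigh_sum = adj.astype(float) @ own
    neigh_mean = np.divide(neigh_sum, degree,
                           out=np.zeros_like(neigh_sum), where=degree > 0)
    return np.column_stack([own, neigh_mean, degree])

# Toy data: gene 1 is a scaled copy of gene 0; gene 2 is unrelated to both.
diff = np.array([
    [0, 1, 2, 3, 4, 5, 6, 7],
    [0, 2, 4, 6, 8, 10, 12, 14],
    [0, 1, 0, 1, 0, 1, 0, 1],
], dtype=float)
adj = build_adjacency(diff, threshold=0.5)
features = gene_features(diff, adj)
```

In a setup like this, the rows of `features` for known pathogenic and non-pathogenic genes would be passed to a standard classifier. The abstract does not say which classifier the paper uses, so that step is left out.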

Key words: Gene differential expression data, Adjacent network, Candidate gene, Gene feature vector, Cancer pathogenic gene prediction

CLC Number: 

  • TP181