Computer Science, 2024, Vol. 51, Issue (6A): 230500006-7. doi: 10.11896/jsjkx.230500006

• Big Data & Data Science •

Cancer Subtype Prediction Based on Similar Network Fusion Algorithm

ZHANG Xiaoxi¹, LI Dongxi²

  1. College of Mathematics, Taiyuan University of Technology, Taiyuan, Shanxi 030060, China
  2. College of Big Data, Taiyuan University of Technology, Taiyuan, Shanxi 030060, China
  • Published:2024-06-06
  • About author: ZHANG Xiaoxi, born in 1997, postgraduate. Her main research interests include data mining and analysis.
    LI Dongxi, born in 1982, Ph.D., associate professor. His main research interests include data mining and biostatistics.
  • Supported by:
    National Natural Science Foundation of China (11571009), Basic Research Project of Shanxi Province (201901D111086), Key Research and Development Project of Shanxi Province (202102020101004) and Research Support Program of Shanxi Province for Returned Overseas Students (2022-074).

Abstract: Mining the interactions between genes from gene expression data and constructing gene regulatory networks is an important research topic in bioinformatics. However, currently popular neural networks consider only the interactions and associations between genes in their architecture and ignore those between patients. This paper therefore proposes WGCSS, a cancer subtype prediction model based on fusing a weighted gene similarity network with a sample similarity network. The method fuses information from the feature space and the sample space, accounts for interactions among both genes and samples, and uses a graph convolutional network for prediction. Because aggregating information in two spaces leads to a serious oversmoothing problem, a residual layer is introduced into the model to alleviate it. Aggregating the information in the two spaces makes the prediction of cancer subtypes more accurate. To verify the generalization performance of the method, datasets of invasive breast carcinoma (BRCA), glioblastoma multiforme (GBM), and lung cancer (LUNG) are analyzed, and the resulting high classification accuracy demonstrates the superiority of the method. Survival analysis is also performed on the three datasets and shows that the survival curves of the cancer subtypes identified by the method differ significantly in all three.
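The ideas named in the abstract (a weighted gene similarity network, a sample similarity network, and a residual graph convolution) can be illustrated with a minimal NumPy sketch. This is not the WGCSS model itself: the similarity measure (absolute Pearson correlation raised to a power `beta`), the symmetric adjacency normalization, the way the two networks are combined, and all function names are assumptions chosen for illustration, and the data are random toy values.

```python
import numpy as np


def soft_threshold_similarity(data, beta=6):
    """Weighted similarity between the rows of `data`: |Pearson r| ** beta.

    The power transform and the value of beta are illustrative assumptions.
    """
    sim = np.abs(np.corrcoef(data)) ** beta
    np.fill_diagonal(sim, 0.0)  # self-loops are added during normalization
    return sim


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def residual_gcn_layer(a_norm, h, w):
    """One graph convolution with a skip connection: ReLU(A H W) + H."""
    return np.maximum(a_norm @ h @ w, 0.0) + h


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))  # toy expression matrix: 8 samples x 5 genes

gene_net = normalize_adjacency(soft_threshold_similarity(x.T))   # 5 x 5
sample_net = normalize_adjacency(soft_threshold_similarity(x))   # 8 x 8

# Assumed fusion: smooth features over the gene network (feature space),
# then propagate over the sample network (sample space).
x_gene = x @ gene_net
w = rng.normal(scale=0.1, size=(5, 5))  # stand-in for learned weights
h = residual_gcn_layer(sample_net, x_gene, w)
```

The skip connection adds the layer input back to its output, so each sample keeps its own features even after repeated neighborhood averaging; this is the usual motivation for residual layers as a remedy for oversmoothing.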

Key words: Weighted gene similarity network, Sample similarity network, Residual graph convolutional network, L1 regularization, Cancer subtype prediction

CLC Number: 

  • TP399