计算机科学 ›› 2018, Vol. 45 ›› Issue (7): 230-236.doi: 10.11896/j.issn.1002-137X.2018.07.040
所属专题: 医学图像
郭炳1,郑文萍2,韩素青1
GUO Bing1,ZHENG Wen-ping2,HAN Su-qing1
摘要: 大型癌症基因组项目(TCGA,ICGC等)产生了大量的癌症组学数据,使人们深入研究癌症变为可能,其中寻找引发癌症的相关突变基因是一个重要挑战。在癌细胞中,基因变异可分为两类:一类是可导致癌症发生的驱动突变(driver mutation),另一类是对癌症发生扩散没有影响的乘客突变(passenger mutation)。识别癌症驱动基因有利于理解癌症发病原理和发展进程以及研发癌症药物或进行靶向治疗,是生物信息学中的重要问题。文中提出一种基于突变基因网络的癌症驱动通路识别算法GNDP,对癌症病人的体细胞突变数据进行分析。该算法定义了非重叠平衡度来度量基因对的位于同一驱动通路的可能性;根据基因对的非重叠平衡度、互斥和覆盖度,构建基因互斥网络,很大程度上减少了网络边数,提高了计算效率;在所构造的基因互斥网络中将查找到的极大团作为潜在驱动通路基因集合;用覆盖度和互斥度对潜在驱动通路基因集合进行筛选,得到其极大权重子团,并将其作为识别出的驱动通路。分别在模拟数据、肺腺癌以及多形性成胶质细胞瘤突变数据上对GNDP算法进行有效性验证,并将其与经典驱动通路识别算法Dendrix和Multi-Dendrix进行实验对比。结果表明,GNDP不需要指定驱动通路的基因个数,能在模拟数据上准确检测出所有人工设置的驱动通路;针对肺腺癌和多形性成胶质细胞瘤突变数据,GNDP在不需要任何先验知识的情况下达到较高的识别准确率,能高效地识别出主要驱动通路,其结果优于对比算法。
中图分类号:
[1]HANAHAN D,WEINBERG R A.The hallmarks of cancer[J].Cell,2000,100(1):57-70. [2]FIDLER I J.The pathogenesis of cancer metastasis:the ‘seed and soil’ hypothesis revisited[J].Nature Reviews Cancer,2003,3(6):453-458. [3]MCLENDON R,FRIEDMAN A,BIGNER D,et al.Comprehensive genomic characterization defines human glioblastoma genes and core pathways. [J].Nature,2008,455(7216):1061-1068. [4]The Cancer Genome Atlas Research Network.Integratedgenomic analyses of ovarian carcinoma[J].Nature,2011,474,609-615. [5]MEYERSON M,GABRIEL S,GETZ G.Advances in under-standing cancer genomes through second-generation sequencing[J].Nature Reviews Genetics,2010,11(10):685-696. [6]HUDSON T J,ANDERSON W,ARTEZ A,et al..International network of cancer genome projects[J].Nature,2010,464(7291):993-998. [7]MARDIS E R,WILSON R K.Cancer genome sequencing:a review[J].Human Molecular Genetics,2009,18(R2):R163. [8]BARRETINA J,CAPONIGRO G,STRANSKY N,et al.TheCancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity[J].Nature,2012,483(7391):603-607. [9]MENG J,LI R,HAO H.Gene microarray data classificationbased on intersecting neighborhood rough set[J].Computer Scien-ce,2015,42(6):37-40.(in Chinese) 孟军,李锐,郝涵.基于相交邻域粗糙集的基因微阵列数据分类[J].计算机科学,2015,42(6):37-40. [10]OVERDEVEST J B,THEODORESCU D,LEE J K.Utilizing the Molecular Gateway:The path to personalized cancermana-gement[J].Clinical Chemistry,2009,55(4):684-697. [11]SWANTON C,CALDAS C.Molecular classification of solidtumours:towards pathway-driven therapeutics[J].British Journal of Cancer,2009,100(10):1517-1522. [12]ZHENG C H,YANG W,CHONG Y W,et al.Identification of mutated driver pathways in cancer using a multi-objective optimization model[J].Computers in Biology & Medicine,2016,72:22-29. [13]GREENMAN C,STEPHENS P,SMITH R,et al.Patternsof somatic mutation in human cancer genomes[J].Nature,2007,446(7132):153-158. [14]BATCHELOR T T,BETENSKY R A,ESPOSITO J M,et al.Age-dependent prognostic effects of genetic alterations in glioblastoma[J].Clinical Cancer Research:An Official Journal of the American Association for Cancer Research,2004,10(1):228-233. [15]VOGELSTEIN B,KINZLER K W.Cancer genes and the pathways they control[J].Nature Medicine,2004,10(8):789-799. [16]XIE Q Q,LI D F,ZHANG W.Two novel tree structure-based methods for gene selection[J].Computer Science,2015,42(7):250-253.(in Chinese) 谢倩倩,李订芳,章文.两种基于树结构的基因选择算法[J].计算机科学,2015,42(7):250-253. [17]VANDIN F,UPFAL E,RAPHAEL B J.De novo discovery of mutated driver pathways in cancer[J].Genome Research,2012,22(2):375-85. [18]DING L,GETZ G,WHEELER D A,et al.Somatic mutations affect key pathways in lung adenocarcinoma[J].Nature,2008,455(7216):1069-1075. [19]LAWRENCE M S,STOJANOV P,POLAK P,et al.Mutational heterogeneity in cancer and the search for new cancer genes[J].Nature,2013,499(7457):214-218. [20]BOCA S M,KINZLER K W,VELCULESCU V E,et al.Pa-tient-oriented gene set analysis for cancer mutation data[J].Genome Biology,2010,11(11):R112. [21]EFRONI S,BENHAMO R,EDMONSON M,et al.Detectingcancer gene networks characterized by recurrent genomic alterations in a population[J].Plos One,2011,6(1):e14437. [22]HAHN W C,WEINBERG R A.Modelling the molecular circuitry of cancer[J].Nature Reviews Cancer,2002,2(5):331-341. [23]VASKE C J,BENZ S C,SANBORN J Z,et al.Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM[J].Bioinformatics,2010,26(12):i237-i245. [24]BASHASHATI A,HAFFARI G,DING J,et al.DriverNet:uncovering the impact of somatic driver mutations on transcriptional networks in cancer[J].Genome Biology,2012,13(12):R124. [25]ZENG X X,LIAO Y L,LIU Y S,et al.Prediction and validation of disease genes using HeteSim Scores[J].IEEE/ACM Transactions on Computational Biology & Bioinformatics,2016,PP(99):1. [26]CHEN X,YAN G Y,REN W,et al.Modularized random walk with restart for candidate disease genes prioritization[C]∥The Third International Symposium on Optimization and SystemsBio-logy(OSB’09).2009:353-360. [27]CHEN X,HUANG L,XIE D,et al.EGBMMDA:Extreme gradient boosting machine for miRNA-disease association prediction[J].Cell Death & Disease,2018,9(1):3. [28]MILLER C A,SETTLE S H,SULMAN E P,et al.Discovering functional modules by identifying recurrent and mutually exclusive mutational patterns in tumors[J].BMC Medical Genomics,2011,4(1):34. [29]ZHAO J F,ZHANG S H,WU L Y,et al.Efficient methods for identifying mutated driver pathways in cancer[J].Bioinforma-tics,2012,28(22):2940-2947. [30]LEISERSON M D M,BLOKH D,SHARAN R,et al.Simulta-neous identification of multiple driver pathways in cancer[J].Plos Computational Biology,2013,9(5):e1003054. [31]WU H,GAO L,LI F,et al.Identifying overlapping mutateddriver pathways by constructing gene networks in cancer[J].Bmc Bioinformatics,2015,16(S5):S3. [32]WU H.Algorithm for detecting driver pathways in cancer based on mutated gene networks[J/OL].Chinese Journal of Compu-ter,(2017-02-24).http://kns.cnki.net/kcms/detail/11.1826.TP.20170224.1408.101.html.(in Chinese) 吴昊.基于突变基因网络的致癌驱动通路检测算法[J/OL].计算机学报,(2017-02-24).http://kns.cnki.net/kcms/detail/11.1826.TP.20170224.1408.101.html. [33]SZCZUREK E,BEERENWINKEL N.Modeling mutual exclusivity of cancer mutations[J].Plos Computational Biology,2014,10(3):e1003503. |
[1] | 蔡齐荣, 吴璟莉. 基于多组学数据识别癌症驱动通路的模型和算法 Model and Algorithm for Identifying Driver Pathways in Cancer by Integrating Multi-omics Data 计算机科学, 2019, 46(9): 310-314. https://doi.org/10.11896/j.issn.1002-137X.2019.09.047 |
[2] | 朱金彬,武继刚,隋秀峰. 基于极大团的边缘云节点聚合算法 Edge Cloud Clustering Algorithm Based on Maximal Clique 计算机科学, 2018, 45(4): 60-65. https://doi.org/10.11896/j.issn.1002-137X.2018.04.008 |
[3] | 黄治国,李娜. 基于分辨函数的极大团搜索算法 Discernibility Function-based Algorithm for Finding All Maximal Cliques 计算机科学, 2014, 41(4): 248-251. |
[4] | 王宁,杨扬,巩华荣,赵耀培,孟坤. 一种基于极大团的关键时间段挖掘方法 Mining Method of Key Time Interval Based on Maximum Clique 计算机科学, 2012, 39(6): 166-169. |
|