Computer Science ›› 2019, Vol. 46 ›› Issue (9): 310-314. doi: 10.11896/j.issn.1002-137X.2019.09.047

• Interdisciplinary and Frontier •

Model and Algorithm for Identifying Driver Pathways in Cancer by Integrating Multi-omics Data

CAI Qi-rong1, WU Jing-li1,2

  1. College of Computer Science and Information Technology, Guangxi Normal University, Guilin, Guangxi 541004, China
  2. Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004, China
  • Received: 2018-07-20 Online: 2019-09-15 Published: 2019-09-02
  • Corresponding author: WU Jing-li (born 1978), female, Ph.D., professor, master's supervisor, CCF member. Her main research interests include bioinformatics and algorithm design and analysis. E-mail: wjlhappy@mailbox.gxnu.edu.cn
  • About the author: CAI Qi-rong (born 1993), male, master's student. His main research interests include bioinformatics and algorithm optimization.
  • Supported by:
    National Natural Science Foundation of China (61762015, 61502111, 61662007, 61763003), Guangxi Natural Science Foundation (2015GXNSFAA139288, 2016GXNSFAA380192), Innovation Project of Guangxi Graduate Education (XYCSZ2018078), the "Bagui Scholar" Project Special Funds, Systematic Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (14-A-03-02, 15-A-03-02), and Guangxi Science and Technology Base and Talent Special Project (AD16380008)

Abstract: This paper proposes an improved maximum weight submatrix model for identifying driver pathways in cancer by integrating three types of omics data: somatic mutations, copy number variations, and gene expression. The model uses the average weight of the genes in a pathway to balance coverage and mutual exclusivity: it strengthens the coverage of gene sets with large weight while relaxing the high mutual exclusivity constraint on them. A parthenogenetic algorithm, PGA-MWS, which introduces a greedy-based recombination operator, is presented to solve the model. PGA-MWS and GA were compared experimentally on glioblastoma and ovarian cancer datasets. The results show that, compared with GA, PGA-MWS based on the improved model can identify gene sets with high coverage but less strict mutual exclusivity. Many of the identified gene sets are involved in known signaling pathways and have been confirmed to be closely related to cancer cells, and several potential candidate driver pathways are also identified. The PGA-MWS method can therefore serve as a useful complement for detecting cancer driver pathways.

Key words: Algorithm, Cancer, Driver pathway, Model, Multi-omics data
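
The abstract does not give the objective function itself. For orientation only, the sketch below computes the baseline maximum weight submatrix score as it is commonly formulated in the driver-pathway literature: coverage (patients altered in at least one gene of the set) minus coverage overlap (a penalty for patients altered in more than one gene). It is not the improved model proposed here; the average-weight adjustment and the integration of copy number and expression data are not reproduced. The gene names, patient IDs, and function names are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy data: gene -> set of patient IDs in which the gene is altered.
ALTERED: Dict[str, Set[str]] = {
    "G1": {"p1", "p2", "p3"},
    "G2": {"p4", "p5"},
    "G3": {"p3", "p6"},
}


def coverage(genes: Iterable[str], altered: Dict[str, Set[str]]) -> Set[str]:
    """Patients with an alteration in at least one gene of the set."""
    covered: Set[str] = set()
    for gene in genes:
        covered |= altered.get(gene, set())
    return covered


def coverage_overlap(genes: Iterable[str], altered: Dict[str, Set[str]]) -> int:
    """Sum of per-gene patient counts minus the number of covered patients.

    A value of 0 means the gene set is perfectly mutually exclusive.
    """
    gene_list = list(genes)
    total = sum(len(altered.get(gene, set())) for gene in gene_list)
    return total - len(coverage(gene_list, altered))


def baseline_weight(genes: Iterable[str], altered: Dict[str, Set[str]]) -> int:
    """Baseline score: coverage size minus coverage overlap (assumed form)."""
    gene_list = list(genes)
    return len(coverage(gene_list, altered)) - coverage_overlap(gene_list, altered)


if __name__ == "__main__":
    print(baseline_weight(["G1", "G2", "G3"], ALTERED))
```

In this toy data, adding G3 to {G1, G2} raises coverage by one patient but also adds one overlapping patient, so the baseline score does not change. A model that relaxes the exclusivity constraint for heavily weighted gene sets, as the abstract describes, would treat such a set more favorably.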

CLC Number: 

  • TP301