Computer Science ›› 2022, Vol. 49 ›› Issue (2): 265-271. doi: 10.11896/jsjkx.201100132
REN Shou-peng (任首朋)1, LI Jin (李劲)1, WANG Jing-ru (王静茹)1, YUE Kun (岳昆)2
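Abstract: Long non-coding RNAs (lncRNAs) play an important role in a variety of complex human diseases. Inferring potential lncRNA-disease associations by computational methods helps not only to understand disease pathogenesis but also to diagnose, prevent, and treat disease. This paper proposes an lncRNA-disease association prediction method based on ensemble regression decision trees. First, known lncRNA-disease association information is used to construct an lncRNA similarity matrix, a disease similarity matrix, and an lncRNA-disease association matrix. Second, based on these three matrices, lncRNA and disease feature vectors are constructed from different perspectives. Third, principal component analysis is applied to the lncRNA and disease features for feature extraction. Finally, a regression decision tree is used as the prediction model, and multiple decision trees are combined with the averaging strategy of ensemble learning to obtain the final prediction model. Leave-one-out cross-validation experiments show that the proposed method outperforms existing methods: on three real lncRNA-disease datasets it achieves AUC values of 0.9055, 0.8969, and 0.9129, improvements of 6.46%, 5.4%, and 6.02%, respectively, over existing methods. In addition, case studies on breast cancer, lung cancer, and gastric cancer further verify the accuracy and effectiveness of the proposed method.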
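The final step of the abstract, averaging several regression decision trees into one predictor, can be illustrated with a minimal sketch. It is not the authors' implementation. The toy feature vectors, the bootstrap resampling used to make the trees differ, the depth limit, and every function name below are assumptions made for illustration; the abstract does not say how the individual trees are diversified, and the similarity-matrix and PCA steps are omitted.

```python
import random

def sse(vals):
    """Sum of squared deviations from the mean."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)

def build_tree(X, y, depth=0, max_depth=3):
    """Grow a regression tree; a leaf stores the mean target of its samples."""
    if depth >= max_depth or len(set(y)) == 1:
        return sum(y) / len(y)
    best = None  # (error, feature index, threshold)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[j] <= t]
            right = [y[i] for i, row in enumerate(X) if row[j] > t]
            if left and right:
                err = sse(left) + sse(right)
                if best is None or err < best[0]:
                    best = (err, j, t)
    if best is None:  # all feature vectors identical: no split possible
        return sum(y) / len(y)
    _, j, t = best
    lo = [i for i, row in enumerate(X) if row[j] <= t]
    hi = [i for i, row in enumerate(X) if row[j] > t]
    return (j, t,
            build_tree([X[i] for i in lo], [y[i] for i in lo], depth + 1, max_depth),
            build_tree([X[i] for i in hi], [y[i] for i in hi], depth + 1, max_depth))

def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if x[j] <= t else right
    return node

def fit_ensemble(X, y, n_trees=10, seed=0):
    """Train n_trees trees, each on a bootstrap resample (an assumption)."""
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx]))
    return trees

def predict_ensemble(trees, x):
    """Averaging strategy: the ensemble score is the mean of the tree outputs."""
    return sum(predict_tree(t, x) for t in trees) / len(trees)

# Hypothetical feature vectors for lncRNA-disease pairs (1 = known association).
X = [[0.9, 0.2], [0.8, 0.7], [0.7, 0.1], [0.6, 0.9],
     [0.4, 0.3], [0.3, 0.8], [0.2, 0.5], [0.1, 0.6]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

trees = fit_ensemble(X, y)
score_pos = predict_ensemble(trees, [0.95, 0.5])  # resembles the associated pairs
score_neg = predict_ensemble(trees, [0.0, 0.5])   # resembles the unassociated pairs
print(score_pos, score_neg)
```

Because each leaf holds the mean of 0/1 labels, the averaged output stays between 0 and 1 and can be read as an association score for ranking candidate pairs, which is the kind of score an AUC evaluation needs.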