Computer Science ›› 2021, Vol. 48 ›› Issue (10): 114-120. doi: 10.11896/jsjkx.200900169

• Artificial Intelligence

miRNA-disease Association Prediction Model Based on Stacked Autoencoder

LIU Dan, ZHAO Sen, YAN Zhi-liang, ZHAO Jing, WANG Hui-qing

  1. College of Information and Computer, Taiyuan University of Technology, Taiyuan 030606, China
  • Received: 2020-09-23 Revised: 2021-01-23 Online: 2021-10-15 Published: 2021-10-18
  • Corresponding author: WANG Hui-qing (1013208257@qq.com)
  • About author: 869095763@qq.com
    LIU Dan, born in 1995, is a member of China Computer Federation. Her main research interests include intelligent information processing and bioinformatics.
    WANG Hui-qing, born in 1978, Ph.D., associate professor, is a member of China Computer Federation. Her main research interests include intelligent information processing and bioinformatics.
  • Supported by:
    Key Research and Development Plan of Shanxi Province (201903D121151) and Graduate Education Reform of Shanxi Province (2019JG020153).

Abstract: miRNAs are a class of small non-coding RNAs whose abnormal regulation is closely related to the occurrence and development of human diseases, so studying miRNA-disease associations is important for understanding the pathogenic mechanisms of human diseases. Machine learning methods are widely used to predict miRNA-disease associations. However, existing methods consider only the information in the miRNA and disease similarity networks and ignore the topological structure of those networks. This paper therefore proposes SAEMDA, a miRNA-disease association prediction model based on a stacked autoencoder. The model uses random walk with restart to obtain topological structure features of the miRNA and disease similarity networks, uses a stacked autoencoder to extract abstract low-dimensional features of miRNAs and diseases, and feeds these low-dimensional features into a deep neural network to predict miRNA-disease associations. SAEMDA achieved good results in 5-fold cross-validation and was further validated in case studies on colon cancer and lung cancer. For colon cancer, 45 of the top 50 miRNAs predicted by the model were verified in the database; for lung cancer, all of the top 50 miRNAs were verified in the database.

Key words: miRNA-disease associations, Random walk with restart, Similarity networks, Stacked autoencoder, Topological structure

CLC Number:

  • TP391
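
The abstract names random walk with restart as the step that captures the topology of the similarity networks. The following is a minimal sketch of that general technique, not the authors' implementation: the function name `random_walk_with_restart`, the restart probability of 0.5, the column normalization, and the convergence tolerance are all assumptions chosen for illustration, since the abstract does not give these details.

```python
import numpy as np


def random_walk_with_restart(similarity, restart_prob=0.5, tol=1e-10, max_iter=1000):
    """Return an n x n matrix whose row i is the steady-state visiting
    distribution of a walker that restarts at node i.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    sim = np.asarray(similarity, dtype=float)
    n = sim.shape[0]

    # Column-normalize the similarity matrix into a transition matrix.
    col_sums = sim.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0  # avoid division by zero for isolated nodes
    transition = sim / col_sums

    # Column i of `restart` is the one-hot seed vector for node i,
    # so all n walks are iterated at once.
    restart = np.eye(n)
    probs = restart.copy()
    for _ in range(max_iter):
        updated = (1.0 - restart_prob) * transition @ probs + restart_prob * restart
        converged = np.abs(updated - probs).sum() < tol
        probs = updated
        if converged:
            break

    # Transpose so that each row is the topological feature vector of one node.
    return probs.T
```

Under these assumptions, calling the function on a miRNA (or disease) similarity matrix yields one feature vector per node, which could then be passed to a feature extractor such as the stacked autoencoder the abstract describes.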