Computer Science (计算机科学), 2021, Vol. 48, Issue (1): 209-216. doi: 10.11896/jsjkx.191200111
李亚男, 胡宇佳, 甘伟, 朱敏
LI Ya-nan, HU Yu-jia, GAN Wei, ZHU Min
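Abstract: MicroRNAs (miRNAs) are a class of single-stranded non-coding RNAs about 22-23 nucleotides (nt) long that play an important role in biological evolution. A mature miRNA pairs, completely or incompletely, through its seed sequence (nucleotides 2-8 from the 5' end) with target sites in the 3'UTR of messenger RNAs (mRNAs), thereby cleaving the mRNA or repressing its translation. Because the mechanism by which miRNAs bind mRNA target sites remains unclear, predicting miRNA target sites has long been a major challenge in miRNA research. Experimental methods are accurate but time-consuming and expensive. In bioinformatics, conventional computational methods based on rule matching can predict target sites but suffer from low accuracy. With the rise of deep learning and the growing body of experimentally validated data and specific target site information, deep learning-based methods have become a research hotspot in miRNA target site prediction. This paper first introduces the commonly used datasets, prediction types, and common features for miRNA prediction; it then describes the deep learning models commonly used in prediction research; next, it presents conventional prediction methods and deep learning-based prediction methods, classifying and summarizing them and comparing their performance; finally, it discusses the current problems and future development of deep learning-based prediction.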
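To make the rule-matching idea concrete, the following is a minimal sketch of a perfect seed-match scan, assuming strict Watson-Crick pairing between miRNA nucleotides 2-8 and the 3'UTR. The function names `seed_site` and `find_seed_matches` are hypothetical and do not correspond to any tool covered in the paper; real predictors also consider incomplete pairing, site accessibility, conservation, and other features.

```python
from typing import List

# Watson-Crick complement table for RNA (A<->U, G<->C).
COMPLEMENT = str.maketrans("AUGC", "UACG")


def seed_site(mirna: str) -> str:
    """Return the mRNA motif (5'->3') that pairs perfectly with the seed.

    The seed is taken as nucleotides 2-8 counted from the miRNA 5' end.
    """
    rna = mirna.upper().replace("T", "U")
    seed = rna[1:8]  # 0-based slice covering positions 2..8
    # The target strand is antiparallel, so complement and then reverse.
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna: str, utr: str) -> List[int]:
    """Return 0-based start positions of perfect seed matches in a 3'UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")
    hits = []
    start = utr.find(site)
    while start != -1:
        hits.append(start)
        start = utr.find(site, start + 1)  # allow overlapping matches
    return hits


# Illustrative input: a let-7a-like miRNA and a made-up UTR fragment.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAACUACCUCGGGCUACCUCA"
print(seed_site(mirna))               # CUACCUC
print(find_seed_matches(mirna, utr))  # [3, 13]
```

A scan like this reports every exact match and nothing else, which is one reason purely rule-based prediction tends to have limited accuracy: it cannot rank sites or recognize incompletely paired ones.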