Computer Science (计算机科学), 2023, Vol. 50, Issue 3: 164-172. doi: 10.11896/jsjkx.211200186
王晓飞, 樊学强, 李章维
WANG Xiaofei, FAN Xueqiang, LI Zhangwei
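Abstract: RNA base interactions play an important role in stabilizing RNA three-dimensional structure, and predicting them accurately can aid RNA 3D structure prediction. However, the data available for predicting RNA base interactions are scarce, so models fail to learn the feature distribution of the data adequately. In addition, two properties of the data (symmetry and class imbalance) degrade model performance. To address both the insufficient learning and these data properties, a high-performance, deep-learning-based method for RNA base interaction prediction, tpRNA, is proposed. tpRNA is the first method to introduce transfer learning into the RNA base interaction prediction task, which mitigates the insufficient learning caused by scarce data. It also proposes an efficient loss function and a feature extraction module that exploit the strengths of transfer learning and convolutional neural networks in feature learning, in order to alleviate the data-property problems. The results show that introducing transfer learning reduces the model bias caused by scarce data, that the proposed loss function improves model training, and that the feature extraction module extracts more effective features. Compared with state-of-the-art methods, tpRNA has a significant advantage when the input features are of low quality.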
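The abstract names two data properties, symmetry and class imbalance, without giving the form of the tpRNA loss function. The following is only a minimal sketch of how these two properties are commonly handled for a pairwise interaction map, not the authors' method. The function names `symmetrize` and `focal_loss`, the focal-style weighting, and the values of `alpha` and `gamma` are all assumptions introduced for illustration.

```python
import math


def symmetrize(probs):
    """Average a square matrix with its transpose so that P[i][j] == P[j][i].

    Illustrative helper: an interaction between bases i and j does not
    depend on their order, so predictions can be forced to be symmetric.
    """
    n = len(probs)
    return [[0.5 * (probs[i][j] + probs[j][i]) for j in range(n)]
            for i in range(n)]


def focal_loss(probs, labels, alpha=0.75, gamma=2.0, eps=1e-7):
    """Mean binary focal-style loss over the upper triangle (i < j).

    alpha up-weights the rare positive class (interacting pairs);
    gamma down-weights pairs that are already classified well.
    Both defaults are assumed values, not taken from the paper.
    """
    n = len(probs)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            # Clamp to avoid log(0).
            p = min(max(probs[i][j], eps), 1.0 - eps)
            if labels[i][j] == 1:
                total += -alpha * (1.0 - p) ** gamma * math.log(p)
            else:
                total += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    raw = [[0.0, 0.2, 0.9],
           [0.6, 0.0, 0.1],
           [0.7, 0.3, 0.0]]
    truth = [[0, 0, 1],
             [0, 0, 0],
             [1, 0, 0]]
    sym = symmetrize(raw)
    print(focal_loss(sym, truth))
```

Only the upper triangle is scored because, once the map is symmetric, the lower triangle carries no additional information. Whether tpRNA enforces symmetry in the network, in the loss, or in post-processing is not stated in the abstract.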