Computer Science (计算机科学), 2026, Vol. 53, Issue 3: 375-382. doi: 10.11896/jsjkx.250100005
喻定, 李章维
YU Ding, LI Zhangwei
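Abstract: RNA secondary structure prediction is a core problem in bioinformatics, and recent advances in deep learning have brought significant progress to the field. However, existing methods still fall short in prediction accuracy and in their dependence on external prior models, and these limitations may affect model robustness and generalization. To address these problems, an RNA secondary structure prediction model based on the Transformer architecture is proposed. The model uses two feature encoding pathways, which generate sequence features through linear embedding and one-hot encoding respectively, and fuses the two feature representations with a cross-attention mechanism. In the feature extraction stage, the model adopts an architecture that combines an improved Swin-Transformer with U-Net (Swin-UNet) to extract deep features, and finally generates the pairing probability matrix of the RNA secondary structure. Experimental results show that the model's F1 score on multiple standard datasets exceeds that of other models by more than 3%, and that the model does not depend on prior information from external models. The results provide a new solution for RNA structure prediction and demonstrate the broad prospects of the Transformer architecture in biological sequence analysis.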
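The two encoding pathways and their cross-attention fusion can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which pathway supplies the queries, so here the embedding pathway is assumed to provide queries and the one-hot pathway keys and values. The names `TOY_EMBEDDING`, `one_hot`, `embed`, and `cross_attention` are hypothetical, the embedding values are made-up constants standing in for learned weights, and the learned query/key/value projections of a real Transformer layer are omitted.

```python
import math

NUCLEOTIDES = "ACGU"

# Hypothetical fixed vectors standing in for a learned linear embedding table.
TOY_EMBEDDING = {
    "A": [0.9, 0.1, -0.3, 0.2],
    "C": [-0.2, 0.8, 0.1, -0.4],
    "G": [0.1, -0.5, 0.7, 0.3],
    "U": [-0.6, 0.2, 0.4, 0.9],
}


def one_hot(seq):
    """One-hot pathway: one 4-dimensional indicator row per nucleotide."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq]


def embed(seq):
    """Embedding pathway: look up a dense vector per nucleotide.

    Assumes the sequence contains only A, C, G, and U.
    """
    return [list(TOY_EMBEDDING[base]) for base in seq]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention with queries from one pathway and
    keys/values from the other (no learned projections in this sketch)."""
    dim = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query position to every key position.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value rows.
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


seq = "GGAUCC"
fused = cross_attention(embed(seq), one_hot(seq), one_hot(seq))
```

Each row of `fused` is a weighted average of the one-hot rows, so it has one entry per nucleotide type and sums to 1. In the model described above, the fused features would then go to the Swin-UNet stage, which this sketch does not cover.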