Computer Science ›› 2023, Vol. 50 ›› Issue (3): 164-172. doi: 10.11896/jsjkx.211200186

• Database & Big Data & Data Science •

Improving RNA Base Interactions Prediction Based on Transfer Learning and Multi-view Feature Fusion

WANG Xiaofei, FAN Xueqiang, LI Zhangwei   

  1. College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
  • Received: 2021-12-16 Revised: 2022-05-11 Online: 2023-03-15 Published: 2023-03-15
  • About author: WANG Xiaofei, born in 1995, postgraduate. His main research interests include computer vision and bioinformatics.
    LI Zhangwei, born in 1967, Ph.D, associate professor, is a member of China Computer Federation. His main research interests include intelligent information processing, among others.
  • Supported by:
    National Natural Science Foundation of China (61573317).

Abstract: RNA base interactions play an important role in maintaining the stability of RNA three-dimensional structure, and predicting them accurately helps predict that structure. However, because little training data is available, a model cannot effectively learn the feature distribution of the training data, and two characteristics of the data (symmetry and class imbalance) further degrade the performance of RNA base interaction prediction models. To address insufficient model learning and these data characteristics, a high-performance deep-learning method for RNA base interaction prediction, called tpRNA, is proposed. tpRNA introduces transfer learning into the RNA base interaction prediction task to weaken the effect of insufficient learning caused by the small amount of data. It also proposes an efficient loss function and a feature extraction module that exploit the strengths of transfer learning and convolutional neural networks in feature learning, alleviating the problems caused by the data characteristics. Results show that transfer learning reduces the model bias caused by limited data, the proposed loss function improves model training, and the feature extraction module extracts more effective features. Compared with the state-of-the-art method, tpRNA also has significant advantages when the input features are of low quality.
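
The two data characteristics named in the abstract, symmetry and class imbalance, can be illustrated with a small sketch. This is not the tpRNA loss function, which the abstract does not specify. It is a generic class-weighted binary cross-entropy, swapped in as an assumption, and the names `symmetrize`, `weighted_bce` and `pos_weight` are hypothetical. The sketch averages a score matrix with its transpose so that the prediction for pair (i, j) equals that for (j, i), then up-weights the rare interacting pairs when computing the loss.

```python
import math


def symmetrize(scores):
    """Average each score with its transposed entry so s[i][j] == s[j][i]."""
    n = len(scores)
    return [[0.5 * (scores[i][j] + scores[j][i]) for j in range(n)]
            for i in range(n)]


def weighted_bce(probs, labels, pos_weight):
    """Class-weighted binary cross-entropy over the upper triangle (i < j).

    pos_weight > 1 up-weights the rare positive (interacting) pairs.
    This is an illustrative stand-in, not the loss proposed for tpRNA.
    """
    eps = 1e-12  # keeps log() away from 0
    n = len(probs)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):  # a symmetric map needs each pair only once
            p = min(max(probs[i][j], eps), 1.0 - eps)
            if labels[i][j] == 1:
                total += -pos_weight * math.log(p)
            else:
                total += -math.log(1.0 - p)
            count += 1
    return total / count if count else 0.0


# Toy example: 3 nucleotides, one interacting pair (0, 1). Values are made up.
raw = [[0.0, 0.9, 0.2],
       [0.7, 0.0, 0.1],
       [0.4, 0.3, 0.0]]
labels = [[0, 1, 0],
          [1, 0, 0],
          [0, 0, 0]]

probs = symmetrize(raw)
loss_plain = weighted_bce(probs, labels, pos_weight=1.0)
loss_weighted = weighted_bce(probs, labels, pos_weight=5.0)
```

With `pos_weight` above 1, an error on the single positive pair contributes more to the loss than the same error on a negative pair, which is one common way to counter class imbalance.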

Key words: RNA base interactions, Transfer learning, Data characteristic, Loss function, Convolutional neural networks

CLC Number: 

  • TP301