Computer Science, 2022, Vol. 49, Issue (11A): 210900032-6. doi: 10.11896/jsjkx.210900032

• Artificial Intelligence •

Prediction of Antigenic Similarity of Influenza A/H5N1 Virus Based on Attention Mechanism and Ensemble Learning

WANG Ying-hui, LI Wei-hua, LI Chuan, CHEN Wei, WEN Jun-ying

  1. School of Information Science and Engineering, Yunnan University, Kunming 650503, China
  • Online: 2022-11-10 Published: 2022-11-21
  • Corresponding author: LI Wei-hua (lywey@163.com)
  • About author (contact: 742854584@qq.com): WANG Ying-hui, born in 1997, postgraduate. His main research interests include deep learning and bioinformatics.
    LI Wei-hua, born in 1977, Ph.D, associate professor. Her main research interests include data mining and bioinformatics.
  • Supported by:
    National Natural Science Foundation of China (32060151).

Abstract: Influenza A viruses can cause seasonal influenza epidemics and even global pandemics. Continued and cumulative changes in the hemagglutinin protein of influenza viruses produce new antigenic variants that reduce vaccine effectiveness or cause vaccine failure. Antigenic similarity prediction is therefore critical for influenza surveillance and vaccine selection. Although the A/H5N1 virus originates in poultry, it can cause pneumonia and multiple organ failure in humans. Considering the characteristics of influenza viruses and their antigens, this paper designs a neural network model to predict the antigenic similarity between viruses. Specifically, the model represents amino acid sequences using both K-mer embeddings and position-specific scoring matrices (PSSM), and then fuses the two representations. On this basis, an ensemble deep learning model incorporating an attention mechanism is designed for antigenic similarity prediction. Experimental results show that, compared with the baseline models, the proposed model significantly improves accuracy, precision, F1 score and MCC. The experiments also indicate that the model has good robustness and extensibility, and good application potential in the field of antigenic similarity prediction.

Key words: Antigenic similarity, Influenza A, H5N1, Ensemble learning, Attention mechanism
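
The K-mer representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the window size `k`, the stride, the helper names (`kmer_tokens`, `build_vocab`, `encode`) and the example fragment are all assumptions chosen for illustration, and the abstract does not state how the embeddings themselves are trained. The sketch only shows how an amino acid sequence might be split into overlapping K-mers and mapped to integer ids that an embedding layer could consume.

```python
from collections import Counter


def kmer_tokens(sequence: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a sequence into overlapping k-mers (window k, step stride)."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


def build_vocab(token_lists: list[list[str]]) -> dict[str, int]:
    """Assign each distinct k-mer an integer id; id 0 is reserved for unknown k-mers."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    return {tok: idx for idx, tok in enumerate(sorted(counts), start=1)}


def encode(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Map k-mers to ids, falling back to 0 for k-mers not in the vocabulary."""
    return [vocab.get(tok, 0) for tok in tokens]


# Hypothetical short amino acid fragment, used only to exercise the functions.
fragment = "MEKIVLLLAI"
tokens = kmer_tokens(fragment, k=3)
vocab = build_vocab([tokens])
ids = encode(tokens, vocab)
```

With `k=3` and a stride of 1, a sequence of length n yields n - 2 tokens, so the ten-residue fragment above produces eight K-mers. A PSSM-based representation would be computed separately (for example from a PSI-BLAST profile) and fused with the embedded K-mers, as the abstract describes.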

CLC Number:

  • TP391