Gated Two-tower Transformer-based Model for Predicting Antigenicity of Influenza H1N1

doi:10.11896/jsjkx.211000209

Abstract

The rapid evolution of the influenza virus hemagglutinin protein continuously produces new virus strains, which may cause seasonal influenza and even global outbreaks. Timely detection of antigenic variants is essential for vaccine screening and design, so a robust model for predicting antigenicity is an effective way to address the challenges of vaccine development. Various end-to-end feature learning tools provide good feature representations for proteomics, but existing influenza A prediction models cannot effectively extract and use the features in amino acid sequences. This paper designs a gated two-tower model based on the transformer. Taking the amino acid sequence of the influenza A virus hemagglutinin protein as input, two parallel encoders capture antigenic characteristics from the temporal and spatial dimensions of the sequence and learn the nonlinear relationship between the features and the prediction results. To reduce noise in the data, when the temporal and spatial features are fused, a gate mechanism adaptively learns weights that measure their relative importance, so that the fusion is selective. The fused features are then used to predict H1N1 influenza antigenic variants. Experimental results on the H1N1 data set show that the model's strong nonlinear feature learning ability improves the prediction of antigenic variation, and that the model is also robust.
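The gated fusion step described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so the code below assumes a common form of gated fusion: an element-wise gate computed from the concatenated outputs of the two encoders, used to take a weighted combination of the two feature vectors. The names `gated_fusion`, `h_time`, `h_space`, `weights`, and `bias` are hypothetical and chosen only for this illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h_time, h_space, weights, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed form (not taken from the paper):
        gate  = sigmoid(W @ [h_time; h_space] + b)
        fused = gate * h_time + (1 - gate) * h_space

    weights has one row per output dimension; each row has
    len(h_time) + len(h_space) entries. bias has one entry per dimension.
    """
    if len(h_time) != len(h_space):
        raise ValueError("both feature vectors must have the same length")
    if len(weights) != len(h_time) or len(bias) != len(h_time):
        raise ValueError("weights and bias need one entry per dimension")

    concat = list(h_time) + list(h_space)
    fused = []
    for row, b, t, s in zip(weights, bias, h_time, h_space):
        # One gate value per dimension decides how much of each tower to keep.
        gate = sigmoid(sum(w * x for w, x in zip(row, concat)) + b)
        fused.append(gate * t + (1.0 - gate) * s)
    return fused


# Usage: with zero weights and zero bias every gate is 0.5,
# so the fused vector is the plain average of the two inputs.
h_time = [1.0, 2.0]
h_space = [3.0, 4.0]
zero_weights = [[0.0] * 4 for _ in range(2)]
print(gated_fusion(h_time, h_space, zero_weights, [0.0, 0.0]))  # [2.0, 3.0]
```

In a trained model the weights and bias would be learned together with the encoders, so the gate can suppress whichever tower is noisier for a given input. Here they are fixed by hand only to make the behavior easy to check.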

Key words: Influenza A, H1N1, Antigenicity prediction, Transformer, Two-tower model, Gate mechanism

CLC Number: TP391

LI Chuan, LI Wei-hua, WANG Ying-hui, CHEN Wei, WEN Jun-ying. Gated Two-tower Transformer-based Model for Predicting Antigenicity of Influenza H1N1 [J]. Computer Science, 2022, 49(11A): 211000209-6.
