Computer Science, 2024, Vol. 51, Issue (6A): 230700112-8. doi: 10.11896/jsjkx.230700112
YANG Binxia, LUO Xudong, SUN Kaili
Abstract: Natural language processing spans many important topics, one of which is machine translation. Pre-trained language models such as BERT and GPT are state-of-the-art methods for a wide range of natural language processing tasks, including machine translation, and many researchers have therefore applied them to machine translation problems. To advance this line of research, this paper first surveys recent progress in the field, covering the main research questions and the solutions built on various pre-trained language models; it then compares the motivations, commonalities, differences, and limitations of these solutions; next, it summarizes the datasets commonly used to train such machine translation models and the metrics used to evaluate them; finally, it discusses directions for further research.
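To make the workflow described in the abstract concrete, the following is a minimal sketch (not drawn from any of the surveyed systems) of translating with an off-the-shelf pre-trained model and scoring the output with BLEU, one of the standard machine translation evaluation metrics. It assumes the Hugging Face transformers and sacrebleu packages and the publicly available Helsinki-NLP/opus-mt-en-de checkpoint; any comparable pre-trained translation model would slot in the same way.

```python
# Minimal sketch: translate with a pre-trained model, then score with BLEU.
# Assumes `pip install transformers sacrebleu` and an internet connection to
# fetch the Helsinki-NLP/opus-mt-en-de checkpoint (an illustrative choice).
from transformers import pipeline
import sacrebleu

# Load a pre-trained English-to-German translation model.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

sources = ["Machine translation is a core task in natural language processing."]
# sacrebleu expects a list of reference streams; here, one reference set
# with one reference sentence per source sentence.
references = [["Maschinelle Übersetzung ist eine Kernaufgabe der Verarbeitung natürlicher Sprache."]]

# Each pipeline output is a dict with a "translation_text" field.
hypotheses = [out["translation_text"] for out in translator(sources)]

# Corpus-level BLEU over the hypothesis/reference pairs.
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(hypotheses[0])
print(f"BLEU: {bleu.score:.2f}")
```

The same pattern generalizes to the model families the survey covers: swapping in an mBART- or mT5-style checkpoint changes only the model name passed to the pipeline, while the BLEU scoring step stays identical.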