Computer Science, 2022, Vol. 49, Issue 1: 31-40. doi: 10.11896/jsjkx.210900006

• Frontier Technologies in Multilingual Computing

  • Corresponding author: HOU Hong-xu (cshhx@imu.edu.cn)

Survey of Mongolian-Chinese Neural Machine Translation

HOU Hong-xu, SUN Shuo, WU Nier   

  1. College of Computer Science,Inner Mongolia University,Hohhot 010021,China
    National & Local Joint Engineering Research Center of Intelligent Information Processing Technology for Mongolian,Hohhot 010021,China
    Inner Mongolia Key Laboratory of Mongolian Information Processing Technology,Hohhot 010021,China
  • Received:2021-09-01 Revised:2021-10-19 Online:2022-01-15 Published:2022-01-18
  • About author:HOU Hong-xu,born in 1972,Ph.D,professor.His main research interests include natural language processing and information retrieval.
  • Supported by:
    Inner Mongolia Autonomous Region Transformation of Scientific and Technological Achievements Project(2019CG028).



Abstract: Machine translation is the process of using a computer to convert text in one language into another language. Owing to its capacity for deep semantic understanding, neural machine translation has become the mainstream machine translation method, and it has achieved remarkable results on many translation tasks with large-scale aligned corpora; however, its performance on low-resource languages remains unsatisfactory. Mongolian-Chinese machine translation is one of the main low-resource machine translation research directions in China. Translation between Mongolian and Chinese is not merely a conversion between two languages but also communication between two peoples, so it has attracted wide attention at home and abroad. This paper first reviews the development and current state of Mongolian-Chinese neural machine translation research, then selects and briefly introduces recent frontier methods, including data augmentation methods based on unsupervised and semi-supervised learning, reinforcement learning methods, adversarial learning methods, transfer learning methods, and neural machine translation methods assisted by pre-trained models.

Key words: Adversarial learning, Mongolian-Chinese machine translation, Pre-training model, Reinforcement learning, Semi-supervised/unsupervised learning, Supervised method, Transfer learning
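Among the frontier methods the abstract enumerates, data augmentation for low-resource pairs is commonly realized through back-translation: a reverse-direction model translates target-side monolingual text back into the source language, producing synthetic parallel pairs that are mixed into the training data. The following minimal Python sketch illustrates only the data-flow idea; the toy dictionary "model" and all sample sentences are hypothetical stand-ins, not material from the paper.

```python
# Minimal sketch of back-translation data augmentation.
# The toy "reverse model" below is a word-level dictionary lookup
# standing in for a real Chinese->Mongolian NMT model; all names
# and data are illustrative assumptions.

def back_translate(monolingual_target, reverse_model):
    """Translate target-side monolingual sentences back into the
    source language to build synthetic (source, target) pairs."""
    synthetic_pairs = []
    for tgt_sentence in monolingual_target:
        src_sentence = reverse_model(tgt_sentence)
        synthetic_pairs.append((src_sentence, tgt_sentence))
    return synthetic_pairs

# Hypothetical word-by-word reverse model (Chinese -> Latin-script
# Mongolian), used purely to make the sketch runnable.
toy_lexicon = {"你好": "sain", "世界": "yirtinchu"}

def toy_reverse_model(sentence):
    return " ".join(toy_lexicon.get(w, "<unk>") for w in sentence.split())

real_parallel = [("sain yirtinchu", "你好 世界")]   # genuine bitext
mono_chinese = ["你好", "世界 你好"]                 # target-side monolingual text

# Synthetic pairs keep the real target side; only the source is machine-made.
augmented = real_parallel + back_translate(mono_chinese, toy_reverse_model)
print(len(augmented))  # 3 training pairs after augmentation
```

In practice the synthetic source side is noisy, which is why the surveyed literature studies filtering pseudo-parallel corpora and iterating the procedure (retraining the reverse model on the enlarged data and back-translating again).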

CLC Number: TP391