Computer Science ›› 2022, Vol. 49 ›› Issue (1): 31-40. doi: 10.11896/jsjkx.210900006
HOU Hong-xu, SUN Shuo, WU Nier
Abstract: Machine translation is the process of using a computer to convert text in one language into another. Owing to its capacity for deep semantic understanding, neural machine translation has become the mainstream approach, achieving remarkable results on translation tasks with large-scale aligned corpora; for low-resource languages, however, performance remains unsatisfactory. Mongolian-Chinese machine translation is one of the main low-resource machine translation research directions in China. Translation between Mongolian and Chinese is not merely a conversion between two languages but also communication between two peoples, and it has therefore attracted wide attention at home and abroad. This paper reviews the development and current state of Mongolian-Chinese neural machine translation, then surveys frontier methods from recent research, including data augmentation based on unsupervised and semi-supervised learning, reinforcement learning, adversarial learning, transfer learning, and neural machine translation assisted by pre-trained models, giving a brief introduction to each.
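Among the data augmentation methods the survey covers, back-translation is the most common: a reverse (target-to-source) model turns monolingual target-language text into synthetic parallel pairs. The following is a minimal sketch of that idea only; `reverse_translate` is a hypothetical stand-in for a trained Chinese-to-Mongolian model, and the sentence strings are placeholders, not real corpus data.

```python
# Sketch of back-translation data augmentation for low-resource mn-zh NMT.
# Assumption: `reverse_translate` stands in for a trained zh->mn model
# (hypothetical placeholder, not a real API).

def reverse_translate(zh_sentence: str) -> str:
    # Placeholder for a reverse translation model; tags its input so the
    # synthetic source side is distinguishable in this toy example.
    return f"<mn-synthetic> {zh_sentence}"

def back_translate(monolingual_zh: list[str],
                   parallel_corpus: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Augment a small (mn, zh) parallel corpus with synthetic pairs.

    Each monolingual Chinese sentence is translated back into (synthetic)
    Mongolian, and the pair (synthetic_mn, real_zh) is appended as extra
    training data for the forward mn->zh model.
    """
    synthetic = [(reverse_translate(zh), zh) for zh in monolingual_zh]
    return parallel_corpus + synthetic

# Usage with a toy corpus: one real pair plus two monolingual zh sentences.
parallel = [("mn-sent-1", "zh-sent-1")]
mono_zh = ["zh-sent-2", "zh-sent-3"]
augmented = back_translate(mono_zh, parallel)
```

The real target side is kept authentic while only the source side is synthetic, which is why back-translation tends to help even when the reverse model is weak.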