Computer Science (计算机科学), 2022, Vol. 49, Issue 1: 24-30. doi: 10.11896/jsjkx.210800254
YU Dong1, XIE Wan-ying1, GU Shu-hao2,3, FENG Yang2,3
Abstract: In recent years, multilingual neural machine translation with a single model has attracted wide attention. However, most existing methods simply mix the corpora of all languages into one training set, failing to exploit the relatedness and similarity among languages. In addition, training such models involves many languages and large amounts of data, making end-to-end training difficult and time-consuming. To address these two problems, this paper proposes a curriculum learning method based on language relatedness to improve both the overall performance and the convergence speed of multilingual neural machine translation. Specifically, two relatedness metrics are proposed: ranking different languages by Singular Vector Canonical Correlation Analysis (SVCCA), and ranking sentences within a given language by cosine similarity. Furthermore, a curriculum learning strategy is proposed that uses validation-set loss as the criterion for switching curricula, turning monolithic training into training over a sequence of curricula and thereby reducing training difficulty. This method fills the gap left by the absence of curriculum learning strategies in multilingual neural machine translation. Experiments on balanced and unbalanced IWSLT multilingual datasets and on the Europarl corpus show that the proposed method outperforms the multilingual baseline translation systems and reduces training time by up to 64%.
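The ingredients named in the abstract can be sketched in NumPy. The following is a simplified illustration only, not the authors' implementation: `svcca_similarity` scores two languages' encoder activations via SVD reduction followed by canonical correlations, `cosine` ranks sentences within a language, and `should_advance` is a hypothetical patience-based version of the validation-loss criterion for switching curricula (the `keep`, `patience`, and `eps` values are assumed for illustration).

```python
import numpy as np

def svcca_similarity(X, Y, keep=0.99):
    """Mean canonical correlation after SVD reduction (simplified SVCCA).
    X, Y: (n_samples, n_features) activation matrices for two languages."""
    def svd_reduce(A):
        A = A - A.mean(axis=0)                      # center each feature
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        frac = np.cumsum(s**2) / np.sum(s**2)       # cumulative variance
        k = int(np.searchsorted(frac, keep)) + 1    # keep `keep` of variance
        return U[:, :k] * s[:k]
    Xr, Yr = svd_reduce(X), svd_reduce(Y)
    # Canonical correlations = singular values of Qx^T Qy,
    # where Qx, Qy are orthonormal bases of the reduced subspaces.
    Qx, _ = np.linalg.qr(Xr)
    Qy, _ = np.linalg.qr(Yr)
    corrs = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(corrs.mean())

def cosine(u, v):
    """Cosine similarity, used to rank sentences within one language."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def should_advance(val_losses, patience=2, eps=1e-3):
    """Advance to the next curriculum once validation loss stops
    improving for `patience` evaluations (illustrative criterion)."""
    if len(val_losses) <= patience:
        return False
    best_recent = min(val_losses[-patience:])
    best_before = min(val_losses[:-patience])
    return best_recent > best_before - eps
```

In this reading, languages with high SVCCA scores against the pivot enter the curriculum earlier, and the validation loss decides when training moves from one curriculum stage to the next.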