Incorporating Language-specific Adapter into Multilingual Neural Machine Translation

doi:10.11896/jsjkx.210900005

Abstract

Multilingual neural machine translation (mNMT) uses a single encoder-decoder model to translate between multiple language pairs. mNMT can encourage knowledge transfer among related languages, improve low-resource translation, and enable zero-shot translation. However, existing mNMT models are weak at modeling language diversity and perform poorly on zero-shot translation. To address these problems, we first propose a variable-dimension bilingual adapter based on the existing adapter architecture. The bilingual adapters are inserted between every two Transformer sub-layers to extract language-pair-specific features, and the language-pair-specific capacity of the encoder or the decoder can be altered by changing the inner dimension of the adapters. We then propose a shared monolingual adapter to model the unique features of each language. Experiments on the IWSLT dataset show that the proposed model remarkably outperforms the multilingual baseline model, and that the monolingual adapter improves zero-shot translation without degrading multilingual translation performance.
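
The adapter idea described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the class `Adapter`, the helper `build_bilingual_adapters`, the unparameterized normalization, and the zero-initialized up-projection are all assumptions chosen for illustration. It shows only how a bottleneck adapter's capacity scales with its inner dimension, and how the encoder and decoder of one language pair could be given different inner dimensions.

```python
import math
import random


class Adapter:
    """Hypothetical bottleneck adapter: normalize -> down-project -> ReLU -> up-project -> residual."""

    def __init__(self, d_model, inner_dim, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.inner_dim = inner_dim
        # Down-projection (d_model x inner_dim) with small random weights.
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(inner_dim)]
                       for _ in range(d_model)]
        self.b_down = [0.0] * inner_dim
        # Assumption: the up-projection starts at zero, so the adapter is
        # initially an identity mapping through the residual connection.
        self.w_up = [[0.0] * d_model for _ in range(inner_dim)]
        self.b_up = [0.0] * d_model

    def num_parameters(self):
        # Two projection matrices plus their biases; the normalization
        # used here has no learned parameters.
        return 2 * self.d_model * self.inner_dim + self.inner_dim + self.d_model

    def forward(self, x):
        # x is one hidden vector of length d_model.
        mean = sum(x) / len(x)
        var = sum((v - mean) ** 2 for v in x) / len(x)
        normed = [(v - mean) / math.sqrt(var + 1e-6) for v in x]
        hidden = [
            max(0.0, sum(normed[i] * self.w_down[i][j]
                         for i in range(self.d_model)) + self.b_down[j])
            for j in range(self.inner_dim)
        ]
        out = [
            sum(hidden[j] * self.w_up[j][i]
                for j in range(self.inner_dim)) + self.b_up[i]
            for i in range(self.d_model)
        ]
        # Residual connection around the bottleneck.
        return [xi + oi for xi, oi in zip(x, out)]


def build_bilingual_adapters(pairs, d_model, enc_dim, dec_dim):
    """One encoder adapter and one decoder adapter per language pair.

    enc_dim and dec_dim may differ, which changes how much pair-specific
    capacity sits on each side of the model.
    """
    return {
        pair: {
            "encoder": Adapter(d_model, enc_dim),
            "decoder": Adapter(d_model, dec_dim),
        }
        for pair in pairs
    }


adapters = build_bilingual_adapters([("en", "de")], d_model=8, enc_dim=4, dec_dim=2)
print(adapters[("en", "de")]["encoder"].num_parameters())  # 76
print(adapters[("en", "de")]["decoder"].num_parameters())  # 42
```

Under the same assumptions, a shared monolingual adapter could be stored in a dictionary keyed by a single language code instead of a language pair, so that every pair involving that language reuses it, including pairs never seen in training.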

Key words: Bilingual adapter, Language-specific modeling, Monolingual adapter, Multilingual neural machine translation

CLC Number: TP391

LIU Jun-peng, SU Jin-song, HUANG De-gen. Incorporating Language-specific Adapter into Multilingual Neural Machine Translation[J]. Computer Science, 2022, 49(1): 17-23.

