Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241100022-10. doi: 10.11896/jsjkx.241100022

• Artificial Intelligence •

Biased Retrieval-augmented Ensembling Translation Model for Aviation Manuals

YANG Chen, YE Na, ZHANG Guiping

  1. Liaoning Provincial Key Laboratory of Artificial Intelligence and Natural Language Processing, Shenyang Aerospace University, Shenyang 110136, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Corresponding author: YE Na (yena_1@126.com)
  • Supported by:
    National Natural Science Foundation of China (U1908216).

Abstract: Aviation manuals are publications related to the design of large civil aircraft, including flight, maintenance, and safety manuals. As technical documents that demand highly clear and precise language, aviation manuals must be translated in accordance with the Simplified Technical English Specification (STE). STE is a controlled natural language that imposes explicit and strict rules on the grammar and vocabulary of a document. This paper proposes an STE-guided biased retrieval-augmented ensembling translation model (BRAETM) for aviation manuals. Inside the model, cross-lingual retrieval supplies biased target-language sequences that have the same sentence type as the input and whose lengths meet the specification; these sequences guide translation generation at the decoder. A biased decoding strategy guided by the STE Dictionary then corrects the word choice of the translation in favor of standard words. Outside the model, a non-passive translation model is selectively ensembled according to the results of an estimation module, so that the generated translations are more standardized in sentence structure, voice, and vocabulary. Experimental results show that the proposed model generates translations that better adhere to the STE rules. Compared with state-of-the-art baseline models, it improves the BLEU score by 3.60 and 2.67 on two aviation manual test corpora, respectively.

Key words: Neural machine translation, Simplified Technical English, Biased translation memory, STE Dictionary
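
As a rough illustration of the "biased retrieval" idea in the abstract, the sketch below filters translation-memory candidates by sentence type and word count before ranking them by similarity. It is not the authors' implementation: the `Candidate` class, the `select_biased_memory` function, the precomputed similarity scores, and the 20/25-word limits (commonly cited STE limits for procedural and descriptive sentences) are all assumptions made for this example.

```python
from dataclasses import dataclass

# Assumed word limits per sentence type; illustrative values, not taken from the paper.
MAX_WORDS = {"procedural": 20, "descriptive": 25}


@dataclass
class Candidate:
    target: str         # target-language sentence stored in the translation memory
    sentence_type: str  # "procedural" or "descriptive" (hypothetical labels)
    similarity: float   # assumed precomputed cross-lingual similarity to the source


def select_biased_memory(candidates, wanted_type, top_k=1):
    """Keep candidates of the wanted type within the word limit, most similar first."""
    limit = MAX_WORDS[wanted_type]
    kept = [
        c for c in candidates
        if c.sentence_type == wanted_type and len(c.target.split()) <= limit
    ]
    kept.sort(key=lambda c: c.similarity, reverse=True)
    return kept[:top_k]


candidates = [
    Candidate("Remove the cap from the fuel tank.", "procedural", 0.82),
    Candidate(
        "The cap of the fuel tank is to be removed by the operator before any of "
        "the subsequent maintenance steps that are described in this section can "
        "be started.",
        "procedural",
        0.91,
    ),
    Candidate("The fuel tank has a cap.", "descriptive", 0.88),
]

best = select_biased_memory(candidates, "procedural")
print(best[0].target)  # prints "Remove the cap from the fuel tank."
```

In this toy setup the most similar candidate is rejected because it is too long, so the shorter, rule-conforming sentence is the one that would be passed to the decoder as guidance.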

CLC Number:

  • TP391