Computer Science ›› 2019, Vol. 46 ›› Issue (5): 198-202. doi: 10.11896/j.issn.1002-137X.2019.05.030

• Artificial Intelligence •

Neural Machine Translation Inclined to Close Neighbor Association

WANG Kun, DUAN Xiang-yu

  1. (School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu 215006, China)
  • Received: 2018-04-18; Revised: 2018-09-01; Published: 2019-05-15
  • About the authors: WANG Kun (born 1994), male, master's student; his main research interests include natural language processing and machine translation; E-mail: liesun1994@gmail.com. DUAN Xiang-yu (born 1976), male, associate professor; his main research interests include natural language processing and machine translation; E-mail: xiangyuduan@suda.edu.cn (corresponding author).
  • Supported by:
    National Natural Science Foundation of China (61673289) and the Key Special Project "Intergovernmental International Science and Technology Innovation Cooperation" of the National Key R&D Program of China (2016YFE0132100).



Abstract: When modeling sequences, existing neural machine translation models consider only the association between the target side and its corresponding source side; they model neither associations within the source side nor associations within the target side. This paper models source-side and target-side associations separately and designs a suitable loss function, so that each source-side hidden state becomes more strongly related to the hidden states of its K neighboring words, and each target-side hidden state becomes more strongly related to the hidden states of its M preceding words. Experimental results on a large-scale Chinese-English dataset show that, compared with neural machine translation that considers only the target-to-source association, the proposed method constructs better close-neighbor association representations and improves the translation quality of the machine translation system.
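The abstract states the training objective only in words. As a rough formalization (the cosine similarity measure, the additive combination, and the weights $\lambda_s$ and $\lambda_t$ below are our assumptions, not taken from the paper), the close-neighbor association can be written as auxiliary terms added to the usual cross-entropy translation loss:

$$\mathcal{L} \;=\; \mathcal{L}_{\mathrm{CE}} \;-\; \lambda_s \sum_{i} \sum_{j \in \mathcal{N}_K(i)} \cos\big(h_i, h_j\big) \;-\; \lambda_t \sum_{t} \sum_{m=1}^{M} \cos\big(s_t, s_{t-m}\big)$$

Here $h_i$ is the encoder (source-side) hidden state at position $i$, $\mathcal{N}_K(i)$ is the set of the $K$ positions nearest to $i$, and $s_t$ is the decoder (target-side) hidden state at step $t$. Minimizing $\mathcal{L}$ maximizes the cosine terms, pulling each source-side hidden state toward those of its $K$ neighboring words and each target-side hidden state toward those of its $M$ preceding words, which is the behavior the abstract describes.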

Key words: Attention mechanism, Close neighbor association, Machine translation

CLC number: TP391
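To make the auxiliary terms concrete, the NumPy sketch below computes them over a matrix of hidden states. It is only an illustration under the same assumptions as the formula above; the function names, the K and M values, and the variables enc_h, dec_h, lam_s, and lam_t in the usage comment are hypothetical, not from the paper.

    import numpy as np

    def neighbor_association_loss(states: np.ndarray, K: int) -> float:
        """Negative total cosine similarity between each hidden state and
        its K neighbors on each side (source-side association term)."""
        # L2-normalize rows so that dot products equal cosine similarities.
        normed = states / np.linalg.norm(states, axis=1, keepdims=True)
        T = len(states)
        loss = 0.0
        for i in range(T):
            for j in range(max(0, i - K), min(T, i + K + 1)):
                if j != i:
                    loss -= float(normed[i] @ normed[j])  # reward similarity
        return loss

    def history_association_loss(states: np.ndarray, M: int) -> float:
        """Target-side variant: relate each decoder state to the hidden
        states of its M preceding words."""
        normed = states / np.linalg.norm(states, axis=1, keepdims=True)
        loss = 0.0
        for t in range(len(states)):
            for m in range(1, M + 1):
                if t - m >= 0:
                    loss -= float(normed[t] @ normed[t - m])
        return loss

    # Hypothetical usage: add both terms to the translation loss.
    # total = cross_entropy \
    #         + lam_s * neighbor_association_loss(enc_h, K=2) \
    #         + lam_t * history_association_loss(dec_h, M=3)

Because both association terms enter the loss with a negative sign on the similarities, gradient descent increases the cosine similarity between each hidden state and its close neighbors, while the cross-entropy term continues to drive translation accuracy.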