Computer Science ›› 2019, Vol. 46 ›› Issue (5): 198-202. doi: 10.11896/j.issn.1002-137X.2019.05.030
WANG Kun, DUAN Xiang-yu
Abstract: Existing neural machine translation models, when modeling sequences, consider only the association between the target side and its corresponding source side; they do not model associations within the source side or within the target side. This paper models source-side and target-side associations separately and designs a suitable loss function so that each source-side hidden state becomes more related to the hidden states of its K neighboring words, and each target-side hidden state becomes more related to the hidden states of its M preceding words. Experimental results on a large-scale Chinese-English dataset show that, compared with modeling only the target-to-source association, the proposed method builds better neighbor-association representations and improves the translation quality of the machine translation system.
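The abstract does not give the exact form of the loss, but the idea — pulling each hidden state toward the hidden states of its K source-side neighbors (or M target-side history words) — can be sketched as follows. This is a minimal illustrative implementation, assuming one plausible choice of loss (mean squared distance to the neighbor average); the function name and the use of NumPy arrays are this sketch's own conventions, not the paper's.

```python
import numpy as np

def neighbor_association_loss(hidden, k, causal=False):
    """Encourage each hidden state to be close to its neighbor states.

    hidden : (T, d) array of per-token hidden states.
    k      : neighborhood size -- the K surrounding words on the source
             side, or the M preceding words when causal=True (target side).
    Returns the mean squared distance between each hidden state and the
    average of its neighbors' hidden states (one illustrative choice).
    """
    T, _ = hidden.shape
    total, count = 0.0, 0
    for t in range(T):
        if causal:
            # Target side: only the M history tokens before position t.
            idx = list(range(max(0, t - k), t))
        else:
            # Source side: K tokens on either side, excluding t itself.
            idx = [j for j in range(max(0, t - k), min(T, t + k + 1)) if j != t]
        if not idx:
            continue
        neighbor_mean = hidden[idx].mean(axis=0)
        total += float(((hidden[t] - neighbor_mean) ** 2).sum())
        count += 1
    return total / max(count, 1)
```

In training, a term like this would be added to the usual translation cross-entropy with a weighting coefficient, so that minimizing the combined objective makes nearby hidden states more correlated without replacing the translation loss.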