Self-correction of Word Alignments

doi:10.11896/j.issn.1002-137X.2017.12.039

Abstract

Abstract: Word alignment is an important part of statistical machine translation systems. Previous work obtains word alignments through sequential models, which do not take into account the structural information and linguistic features of the language and therefore produce alignments that violate linguistic characteristics. This paper proposes a novel self-correction method for word alignments that exploits linguistic prior knowledge to correct such errors. First, a coarse correction is applied to short alignments obtained by punctuation-based binary segmentation. Second, a fine-grained correction based on statistical features is applied to each short alignment. Third, the corrected short alignments are merged back into the original alignments. The process does not rely on any third-party word aligner or additional parallel corpus. Experimental results show that the method significantly improves the accuracy of machine translation results.

Key words: Self-correction, Word alignment, Coarse-to-fine
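The split-and-merge bookkeeping implied by the three steps in the abstract can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function names `split_alignment` and `merge_alignments`, the representation of an alignment as a set of (source index, target index) links, and the assumption that the two halves of a split correspond monotonically (left with left, right with right) are all hypothetical. How cut points are chosen and how the fine-grained correction scores links are not described in the abstract and are left out.

```python
from typing import Set, Tuple

# An alignment is a set of (source index, target index) links.
Alignment = Set[Tuple[int, int]]


def split_alignment(
    links: Alignment, src_cut: int, tgt_cut: int
) -> Tuple[Alignment, Alignment, Alignment]:
    """Split a sentence-level alignment at one cut point per side.

    Hypothetical helper: src_cut and tgt_cut are the word positions that
    follow a punctuation boundary on the source and target side.
    Returns (left, right, crossing). Links in `right` are re-indexed so
    that both sides start at 0. Links in `crossing` connect the left half
    of one side to the right half of the other; under the monotone
    assumption they are candidates for correction.
    """
    left: Alignment = set()
    right: Alignment = set()
    crossing: Alignment = set()
    for i, j in links:
        if i < src_cut and j < tgt_cut:
            left.add((i, j))
        elif i >= src_cut and j >= tgt_cut:
            right.add((i - src_cut, j - tgt_cut))
        else:
            crossing.add((i, j))
    return left, right, crossing


def merge_alignments(
    left: Alignment, right: Alignment, src_cut: int, tgt_cut: int
) -> Alignment:
    """Merge two short alignments back into sentence-level indices."""
    shifted = {(i + src_cut, j + tgt_cut) for i, j in right}
    return set(left) | shifted
```

For example, splitting the links {(0, 0), (1, 3), (2, 2), (3, 1)} at source position 2 and target position 2 puts (0, 0) in the left half, re-indexes (2, 2) to (0, 0) in the right half, and flags (1, 3) and (3, 1) as crossing the boundary. Merging the two halves then restores the original indices of the links that were kept.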

GONG Hui-min, DUAN Xiang-yu and ZHANG Min. Self-correction of Word Alignments[J]. Computer Science, 2017, 44(12): 216-220.


URL: https://www.jsjkx.com/EN/10.11896/j.issn.1002-137X.2017.12.039

https://www.jsjkx.com/EN/Y2017/V44/I12/216

