Computer Science, 2017, Vol. 44, Issue (12): 216-220. doi: 10.11896/j.issn.1002-137X.2017.12.039


Self-correction of Word Alignments

GONG Hui-min, DUAN Xiang-yu and ZHANG Min   

  • Online: 2018-12-01   Published: 2018-12-01

Abstract: Word alignment is an important component of statistical machine translation systems. Previous work obtains word alignments through sequential models, which do not take into account the structural information and linguistic features of the language and therefore produce poor alignments that violate linguistic characteristics. This paper proposes a novel self-correction method for word alignments that exploits linguistic prior knowledge to correct alignment errors violating linguistic characteristics. First, we apply a coarse correction to short alignments obtained by punctuation-based binary segmentation. Second, we apply a fine-grained correction, based on statistical features, to each short alignment. Third, we merge the corrected short alignments back into the original alignments. This process does not rely on any third-party word aligner or on any additional parallel corpus. Experimental results show that our method significantly improves the accuracy of machine translation results.

Key words: Self-correction, Word alignment, Coarse-to-fine
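
The abstract does not spell out how the punctuation-based segmentation interacts with the alignment links, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes an alignment is a set of (source index, target index) links, that cuts are allowed only directly after a punctuation token, and that links crossing the chosen cut are the candidates for coarse correction. The names `PUNCT`, `split_points`, `crossing_links` and `best_binary_split` are hypothetical and introduced purely for illustration.

```python
from typing import List, Optional, Set, Tuple

Link = Tuple[int, int]  # (source token index, target token index)

# Assumed set of split markers; the paper's actual punctuation list is not given.
PUNCT = {",", ";", ":", "，", "；", "："}


def split_points(tokens: List[str]) -> List[int]:
    """Return cut positions located directly after each non-final punctuation token."""
    return [i + 1 for i, tok in enumerate(tokens[:-1]) if tok in PUNCT]


def crossing_links(links: Set[Link], s_cut: int, t_cut: int) -> Set[Link]:
    """Return links that join the left half of one side to the right half of the other."""
    return {(i, j) for i, j in links if (i < s_cut) != (j < t_cut)}


def best_binary_split(
    src: List[str], tgt: List[str], links: Set[Link]
) -> Optional[Tuple[int, int, Set[Link]]]:
    """Choose the pair of punctuation cuts with the fewest crossing links.

    Returns (source cut, target cut, crossing links), or None when either
    sentence has no internal punctuation to cut at.
    """
    best = None
    for s_cut in split_points(src):
        for t_cut in split_points(tgt):
            bad = crossing_links(links, s_cut, t_cut)
            if best is None or len(bad) < len(best[2]):
                best = (s_cut, t_cut, bad)
    return best


# Toy sentence pair: link (1, 4) connects the first source clause to the
# second target clause, which the segmentation flags as suspicious.
src = ["我", "来", "了", "，", "他", "走", "了"]
tgt = ["I", "came", ",", "he", "left"]
links = {(0, 0), (1, 1), (2, 1), (3, 2), (4, 3), (5, 4), (6, 4), (1, 4)}

result = best_binary_split(src, tgt, links)
```

In this toy example the only candidate cut falls after the comma on both sides, and the single crossing link (1, 4) is the one a coarse correction step would examine. The fine-grained, feature-based correction and the merge step described in the abstract are not sketched here, because the abstract gives no detail on them.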

