Computer Science ›› 2010, Vol. 37 ›› Issue (4): 215-.

Chinese-Uyghur Sentence Alignment Based on Hybrid Strategy

TIAN Sheng-wei,TURGUN Ibrahim,YU Long,JAMILA Wushouer,YANG Fei-yu   

  • Online:2018-12-01 Published:2018-12-01

Abstract: This paper proposes a hybrid algorithm for sentence alignment in Chinese-Uyghur parallel corpora. To address error propagation, a known weakness of length-based alignment algorithms, it presents a new strategy for suppressing the spread of errors. Using sentence length and Chinese-Uyghur correspondence information, sentence pairs with a 1:1 pattern are identified as anchor points that stop errors from spreading. Between anchor points, sentences are aligned with an approach based on both length and punctuation. Experimental results confirm that anchor points are identified with high precision and that error propagation is effectively restrained. The hybrid alignment algorithm also avoids the high time complexity of word-based algorithms. Compared with traditional alignment algorithms, it raises alignment accuracy from 95.0% to 97.6% and recall from 96.8% to 98.2%, and the validity evaluation method detects noisy alignments efficiently.

Key words: Bilingual corpora, Error curb, Hybrid strategy, Sentence alignment, Chinese-Uyghur sentence
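
The anchor-based strategy summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cost function, the pattern penalties, the punctuation set, and the names `bead_cost`, `align_segment`, and `align_with_anchors` are all assumptions made for illustration, and the anchors are taken as given rather than detected from length and correspondence information. The sketch only shows how trusted 1:1 anchor pairs split the corpus into segments that are aligned independently, so a mistake inside one segment cannot spread past the next anchor.

```python
from typing import List, Sequence, Tuple

Bead = Tuple[Tuple[int, ...], Tuple[int, ...]]  # (source indices, target indices)

# Assumed punctuation inventory (Latin, Chinese, and Arabic-script marks).
PUNCT = set(",.;:!?，。；：！？、،؛؟")

# Assumed bead patterns (source count, target count, fixed penalty).
PATTERNS = [(1, 1, 0.0), (1, 0, 0.5), (0, 1, 0.5), (2, 1, 0.1), (1, 2, 0.1)]


def bead_cost(src: str, tgt: str, ratio: float, punct_weight: float) -> float:
    """Hypothetical cost: relative length mismatch plus punctuation mismatch."""
    expected = len(src) * ratio
    length_term = abs(expected - len(tgt)) / max(expected + len(tgt), 1.0)
    ps = sum(ch in PUNCT for ch in src)
    pt = sum(ch in PUNCT for ch in tgt)
    punct_term = abs(ps - pt) / max(ps + pt, 1)
    return length_term + punct_weight * punct_term


def align_segment(src: Sequence[str], tgt: Sequence[str], off_s: int, off_t: int,
                  ratio: float = 1.0, punct_weight: float = 0.5) -> List[Bead]:
    """Dynamic programming over the bead patterns inside one segment."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    best = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == inf:
                continue
            for ds, dt, penalty in PATTERNS:
                ni, nj = i + ds, j + dt
                if ni > n or nj > m:
                    continue
                cost = best[i][j] + penalty + bead_cost(
                    "".join(src[i:ni]), "".join(tgt[j:nj]), ratio, punct_weight)
                if cost < best[ni][nj]:
                    best[ni][nj] = cost
                    back[ni][nj] = (i, j)
    beads: List[Bead] = []
    i, j = n, m
    while (i, j) != (0, 0):  # walk back from the end of the segment
        pi, pj = back[i][j]
        beads.append((tuple(range(off_s + pi, off_s + i)),
                      tuple(range(off_t + pj, off_t + j))))
        i, j = pi, pj
    return beads[::-1]


def align_with_anchors(src: Sequence[str], tgt: Sequence[str],
                       anchors: Sequence[Tuple[int, int]],
                       ratio: float = 1.0) -> List[Bead]:
    """Align each span between consecutive anchors on its own.

    `anchors` holds (source index, target index) pairs assumed to be correct
    1:1 matches and monotonically increasing in both coordinates.
    """
    beads: List[Bead] = []
    prev_s = prev_t = 0
    sentinel = (len(src), len(tgt))  # closes the final segment
    for a_s, a_t in sorted(anchors) + [sentinel]:
        beads += align_segment(src[prev_s:a_s], tgt[prev_t:a_t], prev_s, prev_t, ratio)
        if (a_s, a_t) != sentinel:
            beads.append(((a_s,), (a_t,)))  # the anchor itself is a fixed 1:1 bead
        prev_s, prev_t = a_s + 1, a_t + 1
    return beads
```

In this sketch the dynamic program never looks across an anchor, which is the sense in which anchors restrain the spread of mistakes. Each segment is also much shorter than the whole corpus, so the quadratic table stays small.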
