Computer Science ›› 2010, Vol. 37 ›› Issue (4): 215-.
TIAN Sheng-wei,TURGUN Ibrahim,YU Long,JAMILA Wushouer,YANG Fei-yu
Abstract: This paper proposes a hybrid algorithm for sentence alignment in Chinese-Uyhur parallel corpora. To address error propagation, a known shortcoming of length-based alignment algorithms, it presents a new strategy for suppressing the spread of errors. Using sentence length and Chinese-Uyhur correspondence information, 1:1 sentence pairs are identified as anchor points that stop errors from spreading. Between anchor points, an approach based on both length and punctuation is used to align sentences. Experimental results confirm that anchor points are identified with high precision and that error propagation is effectively restrained. The hybrid alignment algorithm also avoids the high time complexity of word-based algorithms. Compared with traditional alignment algorithms, it raises alignment accuracy from 95.0% to 97.6% and recall from 96.8% to 98.2%, and the validity evaluation method finds noisy alignments efficiently.
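The anchor-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the anchor points are taken as given input (the paper identifies them from sentence length and Chinese-Uyhur correspondence information, which is not reproduced here), punctuation is ignored, and the cost function, the pattern penalties, and the names `length_cost`, `align_segment`, and `align_with_anchors` are all assumptions made for illustration. The sketch only shows how trusted 1:1 anchors split the corpus into independent segments, so that a length-based mistake in one segment cannot spread into the next.

```python
from typing import List, Tuple

# A bead maps a half-open source range to a half-open target range.
Bead = Tuple[Tuple[int, int], Tuple[int, int]]

# Allowed patterns (source count, target count) with hypothetical penalties.
PATTERNS = {(1, 1): 0.0, (1, 0): 3.0, (0, 1): 3.0, (2, 1): 1.0, (1, 2): 1.0}


def length_cost(src_len: int, tgt_len: int, ratio: float = 1.0) -> float:
    """Toy cost (an assumption): relative mismatch between the expected
    target length (src_len * ratio) and the observed target length."""
    expected = src_len * ratio
    return abs(expected - tgt_len) / max(expected + tgt_len, 1)


def align_segment(src: List[int], tgt: List[int], s0: int, t0: int) -> List[Bead]:
    """Align one segment by dynamic programming over sentence lengths only.
    s0 and t0 are the global offsets of the segment in the full corpus."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    best = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == inf:
                continue
            for (di, dj), penalty in PATTERNS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                cost = best[i][j] + penalty + length_cost(
                    sum(src[i:ni]), sum(tgt[j:nj])
                )
                if cost < best[ni][nj]:
                    best[ni][nj] = cost
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end of the segment to its start.
    beads: List[Bead] = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append(((s0 + pi, s0 + i), (t0 + pj, t0 + j)))
        i, j = pi, pj
    return beads[::-1]


def align_with_anchors(
    src: List[int], tgt: List[int], anchors: List[Tuple[int, int]]
) -> List[Bead]:
    """Align the stretches between anchors independently.
    anchors holds (source index, target index) pairs assumed to be
    correct 1:1 alignments in monotone order."""
    beads: List[Bead] = []
    prev_s, prev_t = 0, 0
    for a_s, a_t in sorted(anchors):
        beads += align_segment(src[prev_s:a_s], tgt[prev_t:a_t], prev_s, prev_t)
        beads.append(((a_s, a_s + 1), (a_t, a_t + 1)))  # the anchor itself
        prev_s, prev_t = a_s + 1, a_t + 1
    beads += align_segment(src[prev_s:], tgt[prev_t:], prev_s, prev_t)
    return beads


# Made-up sentence lengths: five source sentences, four target sentences.
result = align_with_anchors([10, 20, 5, 30, 12], [11, 19, 36, 13], [(1, 1)])
```

In this made-up example, source sentences 2 and 3 are merged into one 2:1 bead against target sentence 2, and the anchor at (1, 1) guarantees that this decision has no effect on the sentences before it.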
Key words: Bilingual corpora, Error curb, Hybrid strategy, Sentence alignment, Chinese-Uyhur sentence
TIAN Sheng-wei,TURGUN Ibrahim,YU Long,JAMILA Wushouer,YANG Fei-yu. Chinese-Uyhur Sentence Alignment Based on Hybrid Strategy[J].Computer Science, 2010, 37(4): 215-.
URL: https://www.jsjkx.com/EN/
https://www.jsjkx.com/EN/Y2010/V37/I4/215