Computer Science ›› 2010, Vol. 37 ›› Issue (3): 6-10.
ZHANG Hai-jun, SHI Shu-min, ZHU Chao-yong, HUANG He-yan
Abstract: New word identification (NWI) is a key technology in Chinese information processing. NWI comprises two main tasks: extracting and filtering new-word candidates, and guessing the part of speech (POS) of new words. Because Chinese text has no explicit symbol marking word boundaries, any sequence of adjacent characters may form a word, which makes NWI difficult. Moreover, because prior knowledge and statistical data are unavailable for new words, POS guessing for new words has become the technical bottleneck of Chinese POS tagging. This paper analyzes the state of research on Chinese NWI in detail, with emphasis on the techniques and open problems in new-word candidate extraction and new-word POS guessing. It concludes with the prospects for further study of Chinese NWI.
Key words: New words identification, Unknown words, Candidate string, Training corpus, POS guessing
ZHANG Hai-jun, SHI Shu-min, ZHU Chao-yong, HUANG He-yan. Survey of Chinese New Words Identification[J]. Computer Science, 2010, 37(3): 6-10.
URL: https://www.jsjkx.com/EN/
https://www.jsjkx.com/EN/Y2010/V37/I3/6