Computer Science ›› 2010, Vol. 37 ›› Issue (3): 6-10.

Survey of Chinese New Words Identification

ZHANG Hai-jun,SHI Shu-min,ZHU Chao-yong,HUANG He-yan   

  • Online: 2018-12-01 Published: 2018-12-01

Abstract: New Words Identification (NWI) is a key technology in Chinese information processing. NWI comprises two main tasks: extracting and filtering new-word candidates, and guessing the part of speech (POS) of new words. Because Chinese text has no explicit symbol marking word boundaries, any sequence of adjacent characters may form a word, which poses many obstacles for NWI. Moreover, because prior knowledge and statistical data are unavailable for new words, POS guessing for new words has become a technological bottleneck in Chinese tagging. This paper analyzes the state of the field of Chinese NWI in detail, with emphasis on the research techniques and open problems in new-word candidate extraction and new-word POS guessing. Finally, it presents the prospects for research on Chinese NWI.

Key words: New words identification, Unknown words, Candidate string, Training corpus, POS guessing
