Computer Science ›› 2012, Vol. 39 ›› Issue (3): 183-186.

Extracting Abbreviated Names for Chinese Entities from the Web

DING Yuan-jun, CAO Cun-gen, WANG Shi, FU Jian-hui

  • Online:2018-11-16 Published:2018-11-16

Abstract: Abbreviations are an essential part of the vocabulary of natural language, so acquiring abbreviations is a basic and significant task in natural language processing. We propose a method for extracting abbreviations of given Chinese full names from the Web. The method has two phases: candidate abbreviation extraction and verification. In the extraction phase, we construct query terms, issue them to Google, and save the results as a corpus, from which we extract candidate abbreviations. In the verification phase, we define a set of full-name/abbreviation relation constraints, comprising a group of constraint axioms and a group of constraint functions, and we build a relation graph that reflects the connections among all full names and abbreviations. During verification, incorrect candidates are filtered out using the constraint axioms and the relation graph; the candidate abbreviations are classified with the constraint functions, and incorrect ones are identified by a classifier trained on the types of the candidate abbreviations, the values of the functions, and the tags in the corpus. Comprehensive experiments show that the precision and recall of abbreviation extraction are 94.63% and 84.09%, respectively, and that the precision of candidate verification is 94.81%.

Key words: Natural language processing, Abbreviation acquisition, Constraint axioms, Constraint functions, Relation graph
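
The abstract does not list the constraint axioms themselves. As a minimal sketch of what an axiom-style filter in the verification phase might look like, the Python below assumes one common property of Chinese abbreviations: the characters of an abbreviation appear in the full name in the same order, and the abbreviation is shorter than the full name. This rule, and the names `is_subsequence` and `filter_candidates`, are illustrative assumptions and not the authors' definitions.

```python
def is_subsequence(abbr: str, full: str) -> bool:
    """Return True if every character of abbr occurs in full, in the same order."""
    remaining = iter(full)
    # `ch in remaining` consumes the iterator up to the match, which enforces order.
    return all(ch in remaining for ch in abbr)


def filter_candidates(full: str, candidates: list[str]) -> list[str]:
    """Keep candidates that satisfy the assumed constraint (hypothetical, not from the paper):
    non-empty, strictly shorter than the full name, and an ordered subsequence of it."""
    return [
        c for c in candidates
        if 0 < len(c) < len(full) and is_subsequence(c, full)
    ]


if __name__ == "__main__":
    # Hypothetical candidates for one full name.
    print(filter_candidates("北京大学", ["北大", "清华", "大北", "北京大学"]))
```

A filter like this can only discard candidates that break the assumed rule; candidates that pass it would still need the later classification step described in the abstract.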
