Computer Science ›› 2013, Vol. 40 ›› Issue (12): 282-286.


Research of Word Sense Disambiguation Based on Combination of Rules and Statistics

MIAO Hai and ZHANG Yang-sen   

  • Online: 2018-11-16 Published: 2018-11-16

Abstract: This paper analyzes knowledge dictionaries of various structures in terms of computability and computational complexity. The Grammatical Knowledge-base of Contemporary Chinese and the Modern Chinese Semantic Dictionary, both from the Institute of Computational Chinese Linguistics of Peking University, were chosen as the knowledge sources. A method for fusing multiple heterogeneous knowledge sources was considered, a flexible rule knowledge base and a lexical collocation library were constructed, and a word sense disambiguation method that combines rules with statistics was designed. Among the word sense disambiguation methods compared, the combination of the maximum entropy model with rules achieved the highest accuracy. Compared with the best result in SemEval 2007 (task #5), MicroAve (micro-average accuracy) and MacroAve (macro-average accuracy) improved by 5.5% and 0.9%, respectively.

Key words: Word sense disambiguation, Knowledge source, Rule, Statistics
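
The abstract reports two scores, MicroAve and MacroAve. As a minimal sketch, assuming the usual lexical-sample convention (micro-average is the share of correct instances over all test instances, macro-average is the mean of the per-target-word accuracies), they could be computed as follows. The function name, the record layout, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def micro_macro_accuracy(records):
    """Return (micro, macro) accuracy.

    records: iterable of (target_word, gold_sense, predicted_sense).
    The tuple layout is an assumption made for this illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for word, gold, pred in records:
        total[word] += 1
        if gold == pred:
            correct[word] += 1
    if not total:
        raise ValueError("no records to score")
    # Micro-average: pool all instances, so frequent words weigh more.
    micro = sum(correct.values()) / sum(total.values())
    # Macro-average: average the per-word accuracies, one vote per word.
    macro = sum(correct[w] / total[w] for w in total) / len(total)
    return micro, macro


# Toy data (invented): word "w1" has 4 instances, word "w2" has 2.
records = [
    ("w1", "s1", "s1"),
    ("w1", "s1", "s1"),
    ("w1", "s2", "s1"),
    ("w1", "s1", "s1"),
    ("w2", "s1", "s2"),
    ("w2", "s2", "s2"),
]
micro, macro = micro_macro_accuracy(records)
print(f"MicroAve={micro:.4f} MacroAve={macro:.4f}")
```

Because the two averages weight target words differently, a method can gain much more on one than on the other, which is consistent with the unequal improvements reported above.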

