Computer Science ›› 2017, Vol. 44 ›› Issue (8): 265-269. doi: 10.11896/j.issn.1002-137X.2017.08.045


Chinese Word Sense Induction Model by Integrating Distance Metric and Gaussian Mixture Model

ZHANG Yi-hao, LIU Zhi and ZHU Chang-peng   

  • Online: 2018-11-13 Published: 2018-11-13

Abstract: Word sense induction is an important problem in acquiring word sense knowledge, and the most widely used approach to it is cluster analysis. After comparing the K-Means and EM clustering algorithms on a word sense induction model, we propose a new hybrid clustering algorithm that integrates a distance metric with a Gaussian mixture model. The algorithm combines the respective strengths of the two methods: the distance metric of K-Means, which exploits the geometric properties of the training data, and the distribution modeling of EM, which exploits its normal distribution information. Together these improve the performance of the word sense induction model. Experimental results show that the proposed hybrid clustering algorithm effectively improves the performance of the word sense induction model.
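The abstract does not say how the distance metric and the Gaussian mixture model are integrated. As a rough illustration only, the sketch below shows one common way to combine the two ideas: run K-Means (Euclidean distance) first, then use its clusters to initialize EM for a Gaussian mixture. The function names, the spherical-covariance simplification, and the toy 2-D "context vectors" are all assumptions made for this example and are not taken from the paper.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain K-Means: the distance-metric stage (hypothetical helper)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    dim = len(points[0])
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(p[d] for p in members) / len(members)
                                   for d in range(dim))
    return centers, labels

def gmm_em(points, centers, labels, iters=30, min_var=1e-6):
    """EM for a spherical Gaussian mixture, initialized from K-Means.

    Assumption: one variance per component (spherical covariance),
    chosen to keep the sketch short.
    """
    k, dim, n = len(centers), len(points[0]), len(points)
    means, weights, variances = list(centers), [], []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        m = max(len(members), 1)
        weights.append(m / n)
        var = sum(sq_dist(p, means[j]) for p in members) / (dim * m)
        variances.append(max(var, min_var))
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for p in points:
            logs = [math.log(weights[j])
                    - 0.5 * dim * math.log(2 * math.pi * variances[j])
                    - sq_dist(p, means[j]) / (2 * variances[j])
                    for j in range(k)]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave a vanishing component unchanged
            weights[j] = nj / n
            means[j] = tuple(sum(r[j] * p[d] for r, p in zip(resp, points)) / nj
                             for d in range(dim))
            var = sum(r[j] * sq_dist(p, means[j])
                      for r, p in zip(resp, points)) / (dim * nj)
            variances[j] = max(var, min_var)
    final_labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return final_labels, means

def hybrid_cluster(points, k, seed=0):
    """K-Means stage followed by GMM refinement (illustrative only)."""
    centers, labels = kmeans(points, k, seed=seed)
    return gmm_em(points, centers, labels)

# Toy data: two well-separated groups standing in for the context
# vectors of two senses of an ambiguous word.
contexts = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.1, 5.0)]
sense_labels, sense_means = hybrid_cluster(contexts, k=2)
print(sense_labels)
```

In this arrangement the K-Means stage supplies a geometry-based starting point, and the EM stage then fits per-cluster Gaussian parameters, so each context receives a soft, distribution-aware assignment before the final hard label is taken.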

Key words: Word sense induction, Distance metric, Gaussian mixture model, Hybrid clustering

