Computer Science ›› 2017, Vol. 44 ›› Issue (8): 265-269. doi: 10.11896/j.issn.1002-137X.2017.08.045

• Artificial Intelligence •

Chinese Word Sense Induction Model by Integrating Distance Metric and Gaussian Mixture Model

ZHANG Yi-hao, LIU Zhi, ZHU Chang-peng

  1. College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054
  • Online: 2018-11-13 Published: 2018-11-13
  • Supported by:
    This work was supported by the Science and Technology Research Project of the Chongqing Municipal Education Commission (kj1500920, kj1500916) and the National Natural Science Foundation of China (61603065).

Abstract: Word sense induction is an important research topic in the acquisition of word sense knowledge, and clustering is currently the most widely adopted approach to it. By comparing the respective strengths of the K-Means and EM clustering algorithms in word sense induction models, we propose a new hybrid clustering algorithm that integrates a distance metric with a Gaussian mixture model. The algorithm combines the strengths of the two methods in distance measurement and in modeling the data distribution, exploiting both the geometric properties and the normal-distribution information of the data in word sense clustering, and thereby improves the performance of the word sense induction model. Experimental results show that the proposed hybrid clustering algorithm is effective in improving the performance of the word sense induction model.

Key words: Word sense induction, Distance metric, Gaussian mixture model, Hybrid clustering
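
The abstract does not say how the distance metric and the Gaussian mixture model are fused. As one illustrative assumption only, and not the authors' algorithm, the sketch below uses K-Means (Euclidean distance) to obtain an initial partition and then refines it with EM on a diagonal-covariance Gaussian mixture. The function names `kmeans`, `gmm_refine` and `hybrid_cluster` are hypothetical, and `X` stands for any matrix of context feature vectors for one ambiguous word.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Lloyd's algorithm with squared Euclidean distance."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster is empty
                centers[j] = X[labels == j].mean(axis=0)
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dist.argmin(axis=1)

def gmm_refine(X, labels, k, n_iter=100, eps=1e-6):
    """EM for a diagonal-covariance GMM, started from hard cluster labels."""
    resp = np.eye(k)[labels]  # one-hot responsibilities from the K-Means partition
    for _ in range(n_iter):
        # M-step: re-estimate mixture weights, means and per-dimension variances
        nk = resp.sum(axis=0) + eps
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        diff2 = (X[:, None, :] - means[None, :, :]) ** 2
        variances = (resp[:, :, None] * diff2).sum(axis=0) / nk[:, None] + eps
        # E-step: posterior probability of each component, computed in log space
        log_p = (np.log(weights)
                 - 0.5 * np.log(2 * np.pi * variances).sum(axis=1)
                 - 0.5 * (diff2 / variances[None, :, :]).sum(axis=2))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
    return resp.argmax(axis=1), resp

def hybrid_cluster(X, k, seed=0):
    """Assumed fusion: distance-based partition first, then distribution-based refinement."""
    _, labels = kmeans(X, k, seed=seed)
    refined, _ = gmm_refine(X, labels, k)
    return refined
```

Calling `hybrid_cluster(X, k)` returns one induced sense label per row of `X`. Starting EM from the K-Means partition is a common way to combine geometric and distributional information, but other fusion schemes are equally compatible with the abstract.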

