Computer Science ›› 2014, Vol. 41 ›› Issue (5): 111-115. doi: 10.11896/j.issn.1002-137X.2014.05.024

• Network and Communication •

Chinese Analogy Retrieval Using SVM

LIANG Chao, LV Zhao and GU Jun-zhong

  1. School of Information Science and Technology, East China Normal University, Shanghai 200241
  • Online: 2018-11-14  Published: 2018-11-14
  • Funding:
    This work was supported by the Shanghai Science and Technology Fund (11511504002)

Abstract: As the Internet grows, users increasingly fail to obtain information about unfamiliar domains because they cannot formulate precise query keywords. Analogy retrieval, a new retrieval paradigm that uses knowledge of a known domain to acquire knowledge of an unknown one, has therefore become an active research topic. It analyzes the latent relationship between a pair of words and uses that relationship to return the target information accurately. For example, given an analogy query Q={A:B,C:?}, where some latent relationship holds between A and B, the goal is to find the target word(s) D denoted by ? such that the relationship between C and D is similar to that between A and B. The two main difficulties of analogy retrieval are (1) mining the latent relationship between two words and (2) extracting the target words; both are more challenging for Chinese. This paper proposes an SVM-based Chinese analogy retrieval method (SVM based Chinese Analogy Retrieval, SVMbCAR) with two main components: SVM-based relation-word extraction and SVM-based target-word determination. Experiments on a real-world data set (600 person entity pairs from Ren Li Fang) show that SVMbCAR extracts relation-representative words with an accuracy of 82.3% and target words with an accuracy of 90.5%.

Key words: Analogy retrieval, SVM, Semantic similarity
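
The query form Q={A:B,C:?} described in the abstract can be illustrated with a minimal sketch. It is not the SVMbCAR method: the SVM-based components are not reproduced here, and a plain cosine similarity over context-word counts stands in for the relational similarity. The `CONTEXTS` table, its counts, and the function names `cosine` and `answer_analogy` are all hypothetical and invented for illustration only.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy "corpus": for each word pair, counts of words observed
# between the two words in text. These numbers are invented for illustration.
CONTEXTS = {
    ("Tokyo", "Japan"): Counter({"capital": 5, "city": 2}),
    ("Paris", "France"): Counter({"capital": 4, "city": 1}),
    ("Paris", "Seine"): Counter({"river": 3, "city": 1}),
    ("Paris", "Louvre"): Counter({"museum": 4}),
}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def answer_analogy(a, b, c, contexts=CONTEXTS):
    """Rank candidate targets D for the query Q={a:b, c:?}.

    Each candidate pair (c, D) is scored by how similar its context
    vector is to that of the source pair (a, b). This is a stand-in
    for the relational similarity, not the paper's SVM-based scoring.
    """
    source = contexts.get((a, b), Counter())
    scored = [
        (d, cosine(source, ctx))
        for (head, d), ctx in contexts.items()
        if head == c and (head, d) != (a, b)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = answer_analogy("Tokyo", "Japan", "Paris")
print(ranking[0][0])
```

With these toy counts the top-ranked answer for Q={Tokyo:Japan,Paris:?} is France, because the pair (Paris, France) shares the "capital" context with the source pair. A real system would have to derive the contexts from a segmented Chinese corpus and replace the cosine score with trained classifiers.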

