Computer Science, 2017, Vol. 44, Issue (5): 280-284. doi: 10.11896/j.issn.1002-137X.2017.05.051
杨进才,陈忠忠,沈显君,胡金柱
YANG Jin-cai, CHEN Zhong-zhong, SHEN Xian-jun and HU Jin-zhu
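Abstract: Semantic relatedness computation is a key technique in Chinese information processing and plays an important role in information retrieval, semantic disambiguation, and text classification. Drawing on the syntactic theory of Chinese compound sentences and the theory of relation-marker collocation, and using as its corpus compound sentences taken from a Chinese compound sentence corpus and retrieved through search engines, this paper proposes SRCCS, a method for computing semantic relatedness based on Chinese compound sentences. The method not only computes the relatedness of words but also indicates the nature and category of the relation. Compared with methods that compute relatedness over short texts, it works on a narrower unit of text, so its results are more accurate and its computational complexity is lower. A comparison with a search-engine-based method on the same test set demonstrates the effectiveness and superiority of the approach.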
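The abstract does not give the SRCCS formula, so the following is only a rough sketch of the general idea: count how often two words appear in different clauses of the same compound sentence, and report both a score and the relation category in which they most often co-occur. The toy corpus, the category labels (`causal`, `adversative`, `coordinate`), the function name `relatedness`, and the choice of a Dice coefficient are all assumptions made for illustration, not the paper's method.

```python
from collections import Counter

# Hypothetical toy corpus: each compound sentence is pre-segmented into
# (relation category, words of clause 1, words of clause 2).
# The categories and sentences are illustrative assumptions.
CORPUS = [
    ("causal", {"下雨", "天气"}, {"路滑", "小心"}),
    ("causal", {"下雨"}, {"路滑"}),
    ("adversative", {"下雨"}, {"出门"}),
    ("coordinate", {"路滑"}, {"天黑"}),
]


def relatedness(w1, w2, corpus):
    """Return (Dice score, most frequent relation category or None).

    The Dice coefficient is a stand-in measure, not the SRCCS formula.
    """
    freq1 = freq2 = 0
    by_relation = Counter()
    for relation, clause_a, clause_b in corpus:
        words = clause_a | clause_b
        freq1 += w1 in words  # sentences containing w1
        freq2 += w2 in words  # sentences containing w2
        # Count only cross-clause co-occurrence, since the relation
        # marker links the two clauses rather than words inside one.
        if (w1 in clause_a and w2 in clause_b) or (
            w1 in clause_b and w2 in clause_a
        ):
            by_relation[relation] += 1
    co = sum(by_relation.values())
    if co == 0:
        return 0.0, None
    score = 2 * co / (freq1 + freq2)
    return score, by_relation.most_common(1)[0][0]


if __name__ == "__main__":
    print(relatedness("下雨", "路滑", CORPUS))
    print(relatedness("下雨", "出门", CORPUS))
```

Restricting the count to a single compound sentence, rather than a whole short text, mirrors the abstract's point that a narrower unit reduces both noise and the amount of text to scan; the category returned alongside the score mirrors its claim that the method indicates the kind of relation as well as its strength.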