Computer Science, 2013, Vol. 40, Issue (Z11): 267-269.
周由,戴牡红
ZHOU You and DAI Mu-hong
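Abstract: News recommender systems typically combine TF-IDF term weighting with the cosine similarity measure. This technique, however, ignores the actual semantics of the words themselves. A new method is therefore proposed that combines content-based analysis with semantic analysis. The method combines the inverse document frequency of synonym sets (synsets) with semantic similarity, and uses WordNet synsets for the similarity computation. User profiles were constructed for experimental testing, which verified the effectiveness of the method. The experimental results show that the proposed semantic method outperforms the TF-IDF method.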
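The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table `TOY_SYNSETS` is a hypothetical, hand-written stand-in for WordNet synsets, the helper names are illustrative, and only the synset-level inverse document frequency is shown (the semantic similarity between different synsets that the method also uses is left out). The sketch computes TF-IDF cosine similarity once over raw terms and once over synset identifiers.

```python
import math
from collections import Counter

# Hypothetical toy synonym table standing in for WordNet synsets (assumption,
# not taken from the paper).
TOY_SYNSETS = {
    "car": "synset:automobile",
    "automobile": "synset:automobile",
    "buy": "synset:buy",
    "purchase": "synset:buy",
}


def to_concepts(tokens):
    """Map each token to its toy synset id; unknown tokens stay unchanged."""
    return [TOY_SYNSETS.get(t, t) for t in tokens]


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Three tiny "news items"; the first two say the same thing with synonyms.
docs = [
    "buy a new car".split(),
    "purchase a new automobile".split(),
    "rain expected this weekend".split(),
]

plain = tf_idf_vectors(docs)
concept = tf_idf_vectors([to_concepts(d) for d in docs])

term_sim = cosine(plain[0], plain[1])         # term-level TF-IDF
concept_sim = cosine(concept[0], concept[1])  # synset-level weighting
unrelated_sim = cosine(plain[0], plain[2])    # no shared terms

print(term_sim < concept_sim)
```

In this toy corpus the term-level score for the two paraphrased items stays low because "buy"/"purchase" and "car"/"automobile" are treated as unrelated terms, whereas the synset-level vectors of the two items coincide. A real system would obtain synsets from WordNet and would need word sense disambiguation, which this sketch omits.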