Computer Science (计算机科学) ›› 2011, Vol. 38 ›› Issue (11): 148-152.

• Database and Data Mining •

An Improved Search Results Clustering Algorithm Based on the Suffix Tree Model

Liu Deshan (刘德山)

  1. (School of Computer and Information Technology, Liaoning Normal University, Dalian 116081)
  • Publication date: 2018-12-01 Release date: 2018-12-01

Improved Search Results Clustering Algorithm Based on Suffix Tree Model

  • Online:2018-12-01 Published:2018-12-01

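Abstract: Existing search results classification algorithms have shortcomings in cluster label selection, cluster quality evaluation, and the control of overlapping clusters. To address them, this paper proposes an improved search results clustering algorithm based on the vector space model and the suffix tree model. The algorithm refines the clustering function and the cluster label scoring function of the LINGO algorithm, adds a step that merges base clusters, and improves the handling of Chinese text. Finally, the classification quality of the algorithm and the quality of the generated labels are analyzed experimentally, and a Web search results clustering and recommendation platform is built on the carrot2 framework. The experiments verify the classification accuracy of the CQIG algorithm and the discriminability and readability of its cluster labels.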

Keywords: search results clustering, suffix tree model, vector space model, singular value decomposition

Abstract: To address deficiencies in cluster label selection, cluster quality evaluation, and the control of overlapping clusters in existing search results classification algorithms, this paper proposes an improved search results clustering algorithm based on the vector space model and the suffix tree model. We modified the clustering function and the cluster label scoring function of the LINGO algorithm, added a base cluster merging step, and improved the handling of Chinese text. Finally, we analyzed the algorithm's classification results and the quality of the generated labels experimentally. In addition, a platform for recommending clustered Web search results was built on the carrot2 framework, and the classification accuracy of the CQIG algorithm and the discriminability and readability of its cluster labels were confirmed on this platform.

Key words: Search results clustering, Suffix tree model, Vector space model, Singular value decomposition
