Research on a K-means Text Clustering Algorithm Based on Fuzzy Granular Computing

Computer Science ›› 2010, Vol. 37 ›› Issue (2): 209-211.

ZHANG Xia, WANG Su-zhen, YIN Yi-xin, ZHAO Hai-long

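(Computer Center, Hebei University of Economics and Business, Shijiazhuang 050061); (School of Information Engineering, University of Science and Technology Beijing, Beijing 100083)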

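Publication date: 2018-12-01 Release date: 2018-12-01
Funding:
This work was supported by the National Natural Science Foundation of China (60394D32) and the Scientific Research Program of the Hebei Provincial Department of Education (2009116).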

Abstract

Abstract: The traditional K-means algorithm is very sensitive to its initial cluster centers: the clustering result fluctuates with the initial input, which reduces the algorithm's stability. To address this, a new algorithm for optimizing the initial cluster centers is proposed. A normalized distance function is defined on the fuzzy granularity space of the data objects; all data objects whose distance is less than the granularity dλ are grouped into initial clusters, and the centers of these initial clusters give an optimized set of initial values for K-means. Comparative experiments show that the new algorithm effectively removes the traditional K-means algorithm's sensitivity to the initial input, improving stability and accuracy and reducing running time.

Key words: Fuzzy, Granular computing, K-means, Text clustering, Normalized distance function
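
The initialization outlined in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the normalized distance function, so the sketch assumes Euclidean distance divided by the largest pairwise distance; it assumes objects are grouped by chaining together pairs closer than dλ; and it assumes the k largest groups supply the initial centers. All function names (`normalized_distances`, `granular_groups`, `initial_centers`, `kmeans`) are hypothetical.

```python
import numpy as np

def normalized_distances(X):
    """Pairwise Euclidean distances scaled into [0, 1] (assumed normalization)."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))
    m = d.max()
    return d / m if m > 0 else d

def granular_groups(d, d_lambda):
    """Label points so that chains of distances below d_lambda share a group."""
    n = d.shape[0]
    labels = -np.ones(n, dtype=int)
    group = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = group
        stack = [start]
        while stack:
            i = stack.pop()
            # Unlabelled neighbours of i that lie within the granularity.
            for j in np.flatnonzero((d[i] < d_lambda) & (labels == -1)):
                labels[j] = group
                stack.append(j)
        group += 1
    return labels

def initial_centers(X, k, d_lambda):
    """Centroids of the k largest granular groups (assumed selection rule)."""
    labels = granular_groups(normalized_distances(X), d_lambda)
    sizes = np.bincount(labels)
    if len(sizes) < k:
        raise ValueError("d_lambda is too coarse: fewer than k groups found")
    largest = np.argsort(sizes)[::-1][:k]
    return np.array([X[labels == g].mean(axis=0) for g in largest])

def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations started from the supplied centers (n_iter >= 1)."""
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        new_centers = np.array([
            X[assign == c].mean(axis=0) if np.any(assign == c) else centers[c]
            for c in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign

# Toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centers, assign = kmeans(X, initial_centers(X, k=2, d_lambda=0.2))
```

Because the starting centers are computed from the data rather than drawn at random, repeated runs on the same input give the same result, which is the stability property the abstract emphasizes. For text clustering, the rows of `X` would be document vectors (for example TF-IDF), a detail the abstract leaves unspecified.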

ZHANG Xia, WANG Su-zhen, YIN Yi-xin, ZHAO Hai-long. Research of Text Clustering Based on Fuzzy Granular Computing[J]. Computer Science, 2010, 37(2): 209-211. https://doi.org/
