A Web Text Classification Method Without Labeled Training Samples

Computer Science (计算机科学), 2006, Vol. 33, Issue (3): 200-201.


Funding:
National Key Basic Research Program of China (973 Program) (G1998030414); Beijing Outstanding Talents Special Fund (20042D0501604).

Online:2018-11-17 Published:2018-11-17

Abstract

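Abstract: Obtaining class-labeled training samples for text classification is very costly. To address this problem, this paper improves the traditional fuzzy clustering method and proposes FPCM, a fuzzy partition clustering method that combines the unsupervised nature of clustering with prior knowledge about the samples. It clusters related texts using a similarity measure, yielding relatively objective clusters and a small number of labeled texts that give supervised learning a basis for classification, and it trains the classifier with incremental naive Bayes learning. The paper further uses estimates of classification error loss to balance the selection of candidate samples, which improves classification accuracy and yields a learning model for unlabeled text classification with a wider range of application.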


Abstract: Bayesian learning theory estimates unlabeled samples from prior information and sample data. In text classification, it is applied to classify unlabeled texts by learning from labeled class samples. But it is very difficult to obtain

Key words: Web text classification, Fuzzy clustering, Naive Bayes
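
The pipeline outlined in the abstract (cluster unlabeled texts, keep confident cluster members as labeled seeds, then grow a naive Bayes classifier incrementally) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' FPCM: it substitutes textbook fuzzy c-means, omits the classification-error-loss estimate used to balance candidate selection, and stands in for prior knowledge with caller-chosen initial centers. All names (`tf_vector`, `fuzzy_c_means`, `IncrementalNB`, `train_without_labels`, `init_idx`, `threshold`) are hypothetical.

```python
import math
from collections import Counter

def tf_vector(tokens, vocab):
    """L2-normalised term-frequency vector over a fixed vocabulary."""
    counts = Counter(tokens)
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def fuzzy_c_means(points, init_centers, m=2.0, iters=50):
    """Textbook fuzzy c-means (assumption: NOT the paper's FPCM variant).
    Returns memberships u[i][k] computed in the final iteration."""
    centers = [list(c) for c in init_centers]
    for _ in range(iters):
        u = []
        for p in points:
            # Clamp distances so a point sitting on a center does not divide by zero
            d = [max(math.dist(p, c), 1e-9) for c in centers]
            inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
            s = sum(inv)
            u.append([x / s for x in inv])
        for k in range(len(centers)):
            w = [row[k] ** m for row in u]
            total = sum(w)
            centers[k] = [sum(wi * p[j] for wi, p in zip(w, points)) / total
                          for j in range(len(points[0]))]
    return u

class IncrementalNB:
    """Multinomial naive Bayes whose counts are updated one document at a time."""
    def __init__(self, vocab, n_classes):
        self.vocab = set(vocab)
        self.doc_counts = [0] * n_classes
        self.word_counts = [Counter() for _ in range(n_classes)]

    def update(self, tokens, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(t for t in tokens if t in self.vocab)

    def predict(self, tokens):
        n_docs, n_classes = sum(self.doc_counts), len(self.doc_counts)
        scores = []
        for k, wc in enumerate(self.word_counts):
            total = sum(wc.values())
            # Laplace smoothing on both the class prior and the word likelihoods
            s = math.log((self.doc_counts[k] + 1) / (n_docs + n_classes))
            for t in tokens:
                if t in self.vocab:
                    s += math.log((wc[t] + 1) / (total + len(self.vocab)))
            scores.append(s)
        return max(range(n_classes), key=scores.__getitem__)

def train_without_labels(docs, init_idx, threshold=0.8):
    """docs: list of token lists; init_idx: documents used as initial cluster
    centers (a hypothetical stand-in for the prior knowledge the abstract mentions)."""
    vocab = sorted({t for d in docs for t in d})
    vecs = [tf_vector(d, vocab) for d in docs]
    u = fuzzy_c_means(vecs, [vecs[i] for i in init_idx])
    nb = IncrementalNB(vocab, len(init_idx))
    pending = []
    for doc, row in zip(docs, u):
        k = max(range(len(row)), key=row.__getitem__)
        if row[k] >= threshold:
            nb.update(doc, k)          # confident cluster member becomes a pseudo-label
        else:
            pending.append(doc)
    for doc in pending:                # incremental step: classify, then absorb
        nb.update(doc, nb.predict(doc))
    return nb
```

Calling `train_without_labels(docs, init_idx=[0, 3])` on a list of tokenized documents returns a classifier whose class indices follow the order of `init_idx`. A fuller version would replace the simple in-order absorption of pending documents with a selection rule based on estimated error loss, as the abstract suggests.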

A Web Text Classification Method Without Labeled Training Samples[J]. Computer Science, 2006, 33(3): 200-201. https://doi.org/
