Computer Science ›› 2009, Vol. 36 ›› Issue (11): 213-216.

• Artificial Intelligence •

Improved Sort-based KNN Text Categorization Method

LIU Hai-feng, ZHANG Xue-ren, YAO Ze-qing, LIU Shou-sheng

  1. (Institute of Sciences, PLA University of Science and Technology, Nanjing 210007)
  • Online: 2018-11-16 Published: 2018-11-16
  • Funding:
    This work was supported by the National Natural Science Foundation of China (Grant No. 70571087).

Abstract: High feature dimensionality and limited generalization ability degrade the performance of the KNN classifier. This paper proposes an improved, category-based KNN model that works under feature reduction and addresses the tendency of large, densely populated categories to dominate the selection of the k nearest neighbors. First, an improved odds ratio method is used for feature selection. Second, category vectors are used to make a preliminary judgment of the possible categories of a text. Finally, a KNN classifier is applied to the resulting reduced sample set. Experimental results show that the proposed model improves classification efficiency and precision.

Key words: KNN (k-nearest neighbor), Feature reduction, Feature selection, Text categorization
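
The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the improved odds ratio, so the code assumes the standard smoothed log odds ratio; it assumes the category vector is the centroid of each category's training vectors; and it assumes cosine similarity with similarity-weighted voting. All function names, parameters (`per_class`, `n_candidates`, `k`, `eps`), and the toy data are hypothetical.

```python
import math
from collections import Counter

def odds_ratio_scores(docs, labels, target, eps=0.5):
    """Standard smoothed log odds ratio per term (assumption: the paper's
    improved variant is not specified in the abstract)."""
    pos = [set(d) for d, y in zip(docs, labels) if y == target]
    neg = [set(d) for d, y in zip(docs, labels) if y != target]
    vocab = set().union(*pos, *neg)
    scores = {}
    for t in vocab:
        p = (sum(t in d for d in pos) + eps) / (len(pos) + 2 * eps)
        q = (sum(t in d for d in neg) + eps) / (len(neg) + 2 * eps)
        scores[t] = math.log(p * (1 - q) / ((1 - p) * q))
    return scores

def select_features(docs, labels, per_class):
    """Stage 1: keep the top `per_class` terms of every category."""
    features = set()
    for c in set(labels):
        scores = odds_ratio_scores(docs, labels, c)
        features.update(sorted(scores, key=lambda t: (-scores[t], t))[:per_class])
    return sorted(features)

def vectorize(doc, features):
    """Term-frequency vector restricted to the selected features."""
    counts = Counter(doc)
    return [float(counts[t]) for t in features]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def category_vectors(X, labels):
    """Assumed form of the category vector: the centroid of each category."""
    counts = Counter(labels)
    sums = {}
    for x, y in zip(X, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, a in enumerate(x):
            acc[i] += a
    return {c: [a / counts[c] for a in s] for c, s in sums.items()}

def classify(doc, docs, labels, features, n_candidates=2, k=3):
    X = [vectorize(d, features) for d in docs]
    q = vectorize(doc, features)
    # Stage 2: preliminary judgment, keep the categories closest to the text.
    cents = category_vectors(X, labels)
    candidates = sorted(cents, key=lambda c: (-cosine(q, cents[c]), c))[:n_candidates]
    # Stage 3: KNN restricted to samples of the candidate categories.
    reduced = [(cosine(q, x), y) for x, y in zip(X, labels) if y in candidates]
    reduced.sort(key=lambda p: (-p[0], p[1]))
    votes = Counter()
    for sim, y in reduced[:k]:
        votes[y] += sim  # similarity-weighted vote
    return max(sorted(votes), key=votes.get) if votes else None

# Toy data, invented for illustration only.
train_docs = [
    ["ball", "goal", "team"], ["team", "match", "goal"],
    ["stock", "market", "bank"], ["bank", "loan", "market"],
    ["cpu", "code", "software"], ["software", "code", "bug"],
]
train_labels = ["sports", "sports", "finance", "finance", "tech", "tech"]
features = select_features(train_docs, train_labels, per_class=2)
predicted = classify(["goal", "team", "win"], train_docs, train_labels, features)
```

Restricting stage 3 to the candidate categories is what shrinks the sample set that KNN has to search, and it also keeps a large category from supplying neighbors once stage 2 has ruled it out.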
