Computer Science (计算机科学) ›› 2012, Vol. 39 ›› Issue (7): 250-252.

• Artificial Intelligence •

A Weight-Based Text Feature Selection Method

Lei Juncheng (雷军程), Huang Tongcheng (黄同成), Liu Xiaowen (柳小文)

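  1. (School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410076) (Department of Information Engineering, Shaoyang University, Shaoyang 422000)
  • Publication date: 2018-11-16 Release date: 2018-11-16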

Improved Text Feature Selection Method Based on Text Feature Weight

  • Online:2018-11-16 Published:2018-11-16

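Abstract: After analyzing and comparing several commonly used feature selection methods, this paper proposes TFIDF_Ci, a feature selection method that introduces a class-discriminating weighted frequency for text. It brings the document frequency of a specific class into the TFIDF function, which improves the ability of the class containing a feature term's documents to be distinguished from other classes. In the experiments, the KNN classification algorithm was used to compare this method with other feature selection methods. The results show that TFIDF_Ci achieves higher classification accuracy and stability than the other methods across different training set sizes.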

Keywords: feature selection, TFIDF, KNN classification algorithm

Abstract: This paper compares several feature selection methods for text categorization and proposes a new feature selection method (TFIDF_Ci) based on a weighted frequency that distinguishes between text classes. It improves the TFIDF function with this weighted frequency, so that the selected feature items increase the ability to categorize documents. In the experiments, we tested this feature selection method against other feature selection methods using KNN classifiers. The experiments show that the new method achieves good performance and stability under different training set sizes.

Key words: Feature selection, TFIDF, KNN classifiers
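
The abstract describes bringing a class-specific document frequency into the TFIDF function but does not state the formula. The Python sketch below is therefore only an illustration of the general idea, not the authors' TFIDF_Ci definition: it assumes a score of the form tf × idf × (fraction of the class's documents that contain the term). The function names `class_weighted_tfidf` and `top_terms`, the `class_ratio` factor, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def class_weighted_tfidf(docs, labels):
    """Score every (class, term) pair.

    docs   -- list of token lists, one per document
    labels -- class label of each document, same order as docs
    The class_ratio factor is an assumed stand-in for the paper's
    class-specific document frequency; the abstract gives no formula.
    """
    n_docs = len(docs)
    class_sizes = Counter(labels)   # class -> number of documents
    df = Counter()                  # term -> documents containing it (all classes)
    df_in_class = Counter()         # (class, term) -> documents of that class containing it
    tf_in_class = Counter()         # (class, term) -> total occurrences within the class

    for tokens, label in zip(docs, labels):
        for term, count in Counter(tokens).items():
            df[term] += 1
            df_in_class[(label, term)] += 1
            tf_in_class[(label, term)] += count

    scores = {}
    for (label, term), tf in tf_in_class.items():
        idf = math.log(n_docs / df[term])   # 0 for terms found in every document
        class_ratio = df_in_class[(label, term)] / class_sizes[label]
        scores[(label, term)] = tf * idf * class_ratio
    return scores


def top_terms(scores, label, k):
    """Return the k highest-scoring terms for one class (ties broken alphabetically)."""
    ranked = sorted(
        ((score, term) for (cls, term), score in scores.items() if cls == label),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [term for _, term in ranked[:k]]


# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["ball", "goal", "the"],
    ["goal", "team", "the"],
    ["stock", "market", "the"],
    ["market", "bank", "the"],
]
labels = ["sports", "sports", "finance", "finance"]
scores = class_weighted_tfidf(docs, labels)
```

In this toy corpus, a term that appears in every document ("the") scores zero through the idf factor, while a term concentrated in one class ("goal" in sports) is boosted by the class ratio. The selected terms could then serve as the feature space for a KNN classifier, as in the experiments the abstract mentions.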
