Computer Science (计算机科学) ›› 2012, Vol. 39 ›› Issue (4): 181-184.

• Artificial Intelligence •

基于相关性和冗余度的联合特征选择方法

Zhou Cheng (周城), Ge Bin (葛斌), Tang Jiuyang (唐九阳), Xiao Weidong (肖卫东)

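  1. (Key Laboratory of Information System Engineering, National University of Defense Technology, Changsha 410073)
  • Publication date: 2018-11-16 Release date: 2018-11-16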

Joint Feature Selection Method Based on Relevance and Redundancy

  • Online: 2018-11-16 Published: 2018-11-16

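摘要 (Abstract): This paper compares document frequency, a feature selection method that does not use class information, with information gain, mutual information and the χ² statistic, which do use it. On this basis, it analyzes the drawbacks of earlier approaches that directly combine these two kinds of feature selection methods, and proposes a joint feature selection algorithm based on relevance and redundancy. The algorithm pairs document frequency with information gain, mutual information and the χ² statistic in turn, aiming to remove redundant features while keeping those that help classification, and thereby to improve text sentiment classification. Experimental results show that the joint feature selection method performs well and effectively reduces the feature dimension.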

关键词 (Keywords): Text sentiment classification, joint feature selection, relevance, redundant features

Abstract: Based on a comparative study of four feature selection methods, namely document frequency (DF), which is unrelated to class information, and information gain (IG), mutual information (MI) and the chi-square statistic (CHI), which are related to class information, we analyzed the disadvantages of combining these two kinds of methods directly and proposed a joint feature selection method based on relevance and redundancy that combines DF with one of IG, MI and CHI. This approach aims to eliminate redundant features, find useful features for classification and consequently improve the accuracy of text sentiment classification. The experimental results show that the proposed method can not only improve performance but also reduce the feature dimension.

Key words: Text sentiment classification, Joint feature selection, Relevance, Redundant feature
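
For orientation, the sketch below illustrates two of the scores named in the abstract, document frequency (DF) and the chi-square statistic (CHI), on a toy corpus. It is not the algorithm proposed in the paper, whose relevance and redundancy criteria are not given on this page. It shows only a naive direct combination (a DF threshold followed by CHI ranking), which is the kind of baseline the abstract contrasts against. The function names, the toy data, and the `min_df` and `top_k` parameters are all assumptions made for illustration.

```python
from collections import Counter

def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens, _label in docs:
        df.update(set(tokens))
    return df

def chi_square(docs, term, label):
    """Chi-square statistic of a term for one class, from a 2x2 contingency table."""
    a = b = c = d = 0
    for tokens, doc_label in docs:
        has_term = term in tokens
        in_class = doc_label == label
        if has_term and in_class:
            a += 1  # term present, document in class
        elif has_term:
            b += 1  # term present, document not in class
        elif in_class:
            c += 1  # term absent, document in class
        else:
            d += 1  # term absent, document not in class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def direct_combination_select(docs, min_df, top_k):
    """Hypothetical baseline: keep terms with DF >= min_df, then rank by max CHI."""
    df = document_frequency(docs)
    labels = sorted({label for _tokens, label in docs})
    candidates = [t for t, count in df.items() if count >= min_df]
    scores = {t: max(chi_square(docs, t, lab) for lab in labels) for t in candidates}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]

# Toy sentiment corpus (invented for this example).
docs = [
    (["good", "movie", "fun"], "pos"),
    (["good", "plot", "movie"], "pos"),
    (["bad", "movie", "boring"], "neg"),
    (["bad", "plot", "movie"], "neg"),
]

selected = direct_combination_select(docs, min_df=2, top_k=2)
print(selected)  # ['bad', 'good']
```

In this toy corpus "movie" has the highest document frequency but a CHI score of zero, because it appears equally in both classes. That is one way a frequent term can still be uninformative for classification.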
