Computer Science (计算机科学) ›› 2012, Vol. 39 ›› Issue (Z6): 304-308.

Discussion of Classification for Imbalanced Data Sets

ZHI Wei-mei, GUO Hua-ping, FAN Ming, YE Yang-dong

  1. (School of Information Engineering, Zhengzhou University, Zhengzhou 450052)
  • Online: 2018-11-16 Published: 2018-11-16

Abstract: Many classification algorithms perform poorly on imbalanced data sets because the class distribution is extremely skewed, yet the minority class in such data sets is usually the one that matters most in practice. Improving classification performance on the minority class has therefore become an active research topic in recent years. This paper discusses in detail the nature of the imbalanced classification problem, the factors that affect classification on imbalanced data sets, the methods commonly used to address it, the commonly used evaluation metrics, and the open problems and challenges that remain.

Key words: Imbalanced data sets, Classification, Sampling methods, Cost-sensitive learning
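
The abstract's point that classifiers can fail on the minority class, and that the choice of evaluation metric matters, can be illustrated with a minimal sketch. The 990:10 class split, the always-majority predictor, and all function names below are assumptions chosen for illustration; none of them come from the paper.

```python
# Hypothetical toy labels: 990 majority-class (0) and 10 minority-class (1) samples.
y_true = [0] * 990 + [1] * 10

# A trivial "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the minority class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


tp, fp, fn, tn = confusion_counts(y_true, y_pred)

accuracy = safe_div(tp + tn, tp + fp + fn + tn)
recall = safe_div(tp, tp + fn)        # fraction of minority samples found
precision = safe_div(tp, tp + fp)     # fraction of minority predictions that are right
f1 = safe_div(2 * precision * recall, precision + recall)

print(f"accuracy={accuracy:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy setting the predictor scores high on accuracy while finding none of the minority samples, which is why minority-class recall or F1 is usually reported alongside accuracy for imbalanced problems.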
