Computer Science (计算机科学) ›› 2010, Vol. 37 ›› Issue (1): 205-207.

• Artificial Intelligence •

A Selective Ensemble Algorithm for Mining Concept-Drifting Data Streams

关菁华,刘大有   

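  1. (Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Jilin University, Changchun 130012); (College of Computer Science and Technology, Jilin University, Changchun 130012)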
  • Publication date: 2018-12-01 Release date: 2018-12-01
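  • Supported by:
    This work was supported by the Major Program of the National Natural Science Foundation of China (60496321), the National Natural Science Foundation of China (60373098, 60573073), the National High Technology Research and Development Program of China (20060110Z2037), the Major Project of the Science and Technology Development Plan of Jilin Province (20020303), the Science and Technology Development Plan of Jilin Province (20030523), and the European Union project TH/Asia Link/010 (111084).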

Selected Ensemble of Classifiers for Handling Concept-drifting Data Streams

GUAN Jing-hua, LIU Da-you

  • Online: 2018-12-01 Published: 2018-12-01

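Abstract: A selective ensemble learning algorithm for mining concept-drifting data streams is proposed. The algorithm chooses the base classifiers that take part in the ensemble according to how far the direction of each base classifier's output vector on a validation set deviates from the direction of a reference vector. Experiments were carried out on the synthetic data sets SEA and Hyperplane, which exhibit abrupt and gradual concept drift, respectively. The results show that this base-classifier selection method substantially improves the classification accuracy of the ensemble on concept-drifting data streams. The error-ambiguity decomposition was used to analyze the performance of the naive Bayes ensembles built by the algorithm on classification problems. The analysis shows that the main reason for the algorithm's success is that it significantly reduces the average generalization error.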

Key words: concept drift, selective ensemble, naive Bayes, error-ambiguity decomposition
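
The orientation-based selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define the output vector or the reference vector, so the sketch assumes a +1/-1 "signature" vector per base classifier (right or wrong on each validation example), an all-ones reference vector, and an angle threshold of pi/2. The function names `signature_vectors`, `angles_to_reference`, and `select_base_classifiers` are hypothetical.

```python
import numpy as np


def signature_vectors(predictions, labels):
    """Return +1 where a base classifier is right on a validation example, -1 where it is wrong.

    predictions has shape (n_classifiers, n_validation); labels has shape (n_validation,).
    The +1/-1 encoding is an assumption, not taken from the paper.
    """
    predictions = np.asarray(predictions)
    labels = np.asarray(labels)
    return np.where(predictions == labels, 1.0, -1.0)


def angles_to_reference(signatures, reference):
    """Angle in radians between each row of `signatures` and the `reference` vector."""
    reference = np.asarray(reference, dtype=float)
    cosines = signatures @ reference / (
        np.linalg.norm(signatures, axis=1) * np.linalg.norm(reference)
    )
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))


def select_base_classifiers(predictions, labels, max_angle=np.pi / 2, reference=None):
    """Keep the classifiers whose signature vector deviates from the reference by less than max_angle."""
    signatures = signature_vectors(predictions, labels)
    if reference is None:
        # Assumed reference: the direction in which every validation example is classified correctly.
        reference = np.ones(signatures.shape[1])
    angles = angles_to_reference(signatures, reference)
    return [i for i, angle in enumerate(angles) if angle < max_angle]


# Toy validation block: three base classifiers, six validation examples.
labels = np.array([0, 1, 1, 0, 1, 0])
predictions = np.array([
    [0, 1, 1, 0, 1, 1],  # 5 of 6 correct
    [1, 0, 0, 1, 1, 0],  # 2 of 6 correct
    [0, 1, 0, 0, 1, 0],  # 5 of 6 correct
])
selected = select_base_classifiers(predictions, labels)
print(selected)
```

With the all-ones reference assumed here, the cosine of the angle equals 2 × accuracy − 1, so the selection reduces to a validation-accuracy cut-off; passing a different `reference` changes which classifiers count as well oriented.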

Abstract: In data streams, the underlying concept is often not stable but changes with time. We propose a selective ensemble algorithm, OSEN (Orientation based Selected ENsemble), for handling concept-drifting data streams. The algorithm selects a near-optimal subset of base classifiers based on the output of each base classifier on a validation data set. Our experiments with synthetic data sets simulating abrupt (SEA) and gradual (Hyperplane) concept drift demonstrate that selective integration of classifiers built over small time intervals or fixed-size data blocks can be significantly better than majority voting and weighted voting, which are currently the most commonly used integration techniques for handling concept drift with ensembles. We also explain the working mechanism of OSEN through the error-ambiguity decomposition. In our experiments, OSEN improves generalization ability by reducing the average generalization error of the base classifiers that constitute the ensemble.

Key words: Concept drift, Selective ensemble, Naive Bayes, Error-ambiguity decomposition
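
For reference, the error-ambiguity decomposition named above is usually stated as follows for a weighted-average ensemble under squared error. This is the standard textbook form; how the paper adapts it to naive Bayes classification is not given in the abstract, and the symbols $f_i$, $w_i$, $\bar{f}$, $E$, $\bar{E}$, and $\bar{A}$ are notation chosen here.

```latex
% Ensemble output: convex combination of the base classifiers' outputs
\bar{f}(x) = \sum_{i} w_i f_i(x), \qquad w_i \ge 0, \quad \sum_{i} w_i = 1

% Ensemble generalization error
E = \mathbb{E}\big[(\bar{f}(x) - y)^2\big]

% Weighted average generalization error of the base classifiers
\bar{E} = \sum_{i} w_i \, \mathbb{E}\big[(f_i(x) - y)^2\big]

% Weighted average ambiguity (disagreement with the ensemble output)
\bar{A} = \sum_{i} w_i \, \mathbb{E}\big[(f_i(x) - \bar{f}(x))^2\big]

% Decomposition
E = \bar{E} - \bar{A}
```

In these terms, the abstract's conclusion is that the gain comes mainly from a smaller $\bar{E}$, rather than from a larger $\bar{A}$.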
