Survey of Concept Drift Handling Methods in Data Streams

CHEN Zhi-qiang, HAN Meng, LI Mu-hang, WU Hong-xin, ZHANG Xi-long   

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
  • Received: 2021-07-12 Revised: 2021-12-10 Online: 2022-09-15 Published: 2022-09-09
  • About author: CHEN Zhi-qiang, born in 1998, postgraduate. His main research interests include data stream classification.
    HAN Meng, born in 1982, Ph.D., associate professor, graduate supervisor, is a member of China Computer Federation. Her main research interests include data mining.
  • Supported by:
    National Natural Science Foundation of China (62062004) and Ningxia Natural Science Foundation Project (2020AAC03216).

Abstract: Concept drift in nonstationary data streams currently occurs at different speeds and with different spatial distributions, which poses great challenges to fields such as data mining and machine learning. Over the past two decades, many methods dedicated to handling concept drift in nonstationary data streams have emerged. This paper proposes a novel perspective for classifying these methods. Current concept drift handling methods are comprehensively explained in terms of explicit methods based on active detection and implicit methods based on passive adaptation. In particular, active detection methods are analyzed from the perspectives of handling one specific type of concept drift and handling multiple types of concept drift, and passive adaptation methods are analyzed from the perspectives of single learners and ensemble learning. Many concept drift handling methods are analyzed and summarized in terms of comparison algorithms, learning models, applicable drift types, and the advantages and disadvantages of the algorithms. Finally, further research directions are given, including concept drift handling methods in class-imbalanced data streams, in data streams where novel classes exist, and in data streams with noise.

Key words: Data stream, Concept drift, Classification, Active methods, Passive methods

