Computer Science, 2016, Vol. 43, Issue (4): 219-223. doi: 10.11896/j.issn.1002-137X.2016.04.045

• Artificial Intelligence •

Detecting Concept Drift of Data Stream Based on Fuzzy Clustering

CHEN Xiao-dong, SUN Li-juan, HAN Chong and GUO Jian

  1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210003
  2. Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing University of Posts and Telecommunications, Nanjing 210003
  • Online: 2018-12-01  Published: 2018-12-01
  • Supported by:
    National Natural Science Foundation of China (61171053,9), Doctoral Fund of the Ministry of Education (20113223110002), China Postdoctoral Science Foundation (2014M551635), Jiangsu Province postdoctoral research funding

Abstract: Concept drift may occur in data streams, and detecting it is important in many applications. We use an improved version of the FCM algorithm to perform fuzzy clustering within a variable-size sliding window, and measure the difference between adjacent windows to determine whether concept drift has occurred; a corresponding handling method is also given. Experiments show that the algorithm detects concept drift in data streams effectively, with good clustering quality and high time efficiency.

Key words: Concept drift, Data stream, Fuzzy clustering, Variable sliding window
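
The idea summarized in the abstract (cluster each window with fuzzy c-means, then compare adjacent windows) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: it uses plain fuzzy c-means rather than the improved variant, fixed-size windows rather than variable-size ones, and a hypothetical difference measure (symmetrized mean nearest-center distance) with a hypothetical threshold. The names `fcm`, `window_difference`, `detect_drift` and all parameter values are illustrative only.

```python
import numpy as np


def fcm(X, c, m=2.0, iters=100, tol=1e-5, seed=0):
    """Plain fuzzy c-means (not the paper's improved variant).

    Returns (centers, memberships) for data X of shape (n, d).
    """
    rng = np.random.default_rng(seed)
    # Random initial membership matrix; each row sums to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        # Centers are membership-weighted means of the points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Point-to-center distances, floored to avoid division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def window_difference(centers_a, centers_b):
    """Hypothetical measure: symmetrized mean nearest-center distance."""
    d = np.linalg.norm(centers_a[:, None, :] - centers_b[None, :, :], axis=2)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())


def detect_drift(stream, window=200, c=2, threshold=1.0):
    """Return start indices of windows that differ from the previous one
    by more than `threshold` (fixed-size windows, an assumed simplification)."""
    drifts = []
    prev = None
    for start in range(0, len(stream) - window + 1, window):
        centers, _ = fcm(stream[start:start + window], c)
        if prev is not None and window_difference(prev, centers) > threshold:
            drifts.append(start)
        prev = centers
    return drifts


# Synthetic stream: two clusters that jump to new locations halfway through.
rng = np.random.default_rng(1)
before = np.vstack([rng.normal(0, 0.3, (200, 2)), rng.normal(3, 0.3, (200, 2))])
after = np.vstack([rng.normal(6, 0.3, (200, 2)), rng.normal(9, 0.3, (200, 2))])
rng.shuffle(before)  # shuffles rows in place
rng.shuffle(after)
stream = np.vstack([before, after])
print(detect_drift(stream))
```

In this synthetic stream the cluster locations change at index 400, so the window starting there should be flagged. A real detector along the lines of the abstract would also adapt the window size and apply the paper's own difference measure, neither of which is reproduced here.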

