Computer Science, 2022, Vol. 49, Issue (4): 144-151. doi: 10.11896/jsjkx.210600045

• Database & Big Data & Data Science •

Three-way Drift Detection for State Transition Pattern on Multivariate Time Series

SHEN Shao-peng, MA Hong-jiang, ZHANG Zhi-heng, ZHOU Xiang-bing, ZHU Chun-man, WEN Zuo-cheng

  1. School of Software Engineering, Chengdu University of Information Technology, Chengdu 610225, China
  • Received: 2021-06-04 Revised: 2021-09-24 Published: 2022-04-01
  • Corresponding author: ZHANG Zhi-heng (zhihengzhang406@163.com)
  • About author: SHEN Shao-peng (ssp8471@163.com), born in 1993, postgraduate, is a member of China Computer Federation. His main research interests include reinforcement learning and anomaly detection. ZHANG Zhi-heng, born in 1990, Ph.D, is a member of China Computer Federation. His main research interests include time-series analysis, three-way decision and cost-sensitive learning.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (41604114, 62006200), Ministry of Education Industry-University-Research Collaborative Education Project (201902298010), Sichuan Science and Technology Department Project (2020YFG0307), Chengdu Key R&D Support Plan (2021-YF05-00933-SN) and Sichuan Tourism University Scientific Research Project (2020SCTU14, 19SCTUZY03).

Abstract: Unsupervised pattern drift detection on multivariate time series (MTSs) is an active research topic in machine learning. The task is challenging, however, because the definitions of sequential patterns and of their drifts are very flexible. Inspired by the idea of “Think in Threes”, this paper proposes a three-way drift detection method for state transition patterns with periodic wildcard gaps (3WDD-STAP), which is based on FUP-STAP incremental mining and adapted from the incremental mining algorithm for state transition patterns (STAPs). Without additional parameters, 3WDD-STAP obtains both frequent and drifted STAPs simultaneously. According to how support changes before and after the increment, three types of STAP drift are defined. Type I drift: a STAP that was frequent becomes infrequent after the increment; the incremental dataset must be scanned. Type II drift: a STAP that was infrequent becomes frequent after the increment; the original dataset must be scanned. Type III drift: a STAP remains frequent or remains infrequent after the increment; it is regarded as normal, and no dataset is scanned. Experimental results on two real-world datasets, one on air quality and one on petroleum engineering equipment monitoring, show that: 1) the larger α and β are, the fewer drifted STAPs of the two types are obtained, and vice versa; 2) Type I drifted STAPs follow different distributions on different datasets; 3) the obtained STAPs and their drifts are highly readable.
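The three drift types described in the abstract can be illustrated with a minimal Python sketch. It is not the 3WDD-STAP algorithm itself: the names `Drift`, `classify_drift` and `DATASET_TO_SCAN` are hypothetical, and the frequent/infrequent status of a STAP before and after the increment is assumed to be already known and passed in as booleans, since the abstract does not specify how support is computed or how α and β enter the decision.

```python
from enum import Enum
from typing import Optional


class Drift(Enum):
    """Drift types as named in the abstract (hypothetical enum)."""
    TYPE_I = "I"      # frequent -> infrequent
    TYPE_II = "II"    # infrequent -> frequent
    TYPE_III = "III"  # status unchanged, regarded as normal


# Which dataset the abstract says must be scanned for each drift type.
DATASET_TO_SCAN: dict[Drift, Optional[str]] = {
    Drift.TYPE_I: "incremental",
    Drift.TYPE_II: "original",
    Drift.TYPE_III: None,
}


def classify_drift(frequent_before: bool, frequent_after: bool) -> Drift:
    """Map a STAP's status before/after the increment to a drift type.

    Assumption: the two booleans are supplied by the caller; deciding
    them (support counting, thresholds) is outside this sketch.
    """
    if frequent_before and not frequent_after:
        return Drift.TYPE_I
    if not frequent_before and frequent_after:
        return Drift.TYPE_II
    return Drift.TYPE_III


if __name__ == "__main__":
    for before, after in [(True, False), (False, True), (True, True), (False, False)]:
        drift = classify_drift(before, after)
        print(before, after, drift.name, DATASET_TO_SCAN[drift])
```

Keeping the scan target in a separate lookup table makes the correspondence stated in the abstract explicit: only Type I and Type II drift trigger a scan, and each touches a different dataset.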

Key words: Anomaly detection, Incremental learning, Multivariate time series, Sequential pattern discovery, Think in Threes

CLC Number: 

  • TP391