Computer Science (计算机科学) ›› 2019, Vol. 46 ›› Issue (1): 29-35. doi: 10.11896/j.issn.1002-137X.2019.01.005

• Survey •

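Research on Time Series Classification Based on Shapelet

YAN Wen-he (闫汶和), LI Gui-ling (李桂玲)

  1. (School of Computer Science, China University of Geosciences, Wuhan 430074, China)
  • Received: 2017-12-05 Online: 2019-01-15 Published: 2019-02-25
  • About the authors: YAN Wen-he (born 1995), male, master's student, CCF member; his main research interests are data mining, machine learning, and time series data management. E-mail: cugywh@163.com. LI Gui-ling (born 1979), female, associate professor, master's supervisor, CCF member; her main research interests are data mining, machine learning, and time series data management. E-mail: guiling@cug.edu.cn (corresponding author).
  • Funding:
    Supported by the National Natural Science Foundation of China (61702468) and the Teaching Laboratory Open Fund of China University of Geosciences (Wuhan) (SKJ2018286).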

Abstract: Time series are high-dimensional, real-valued data ordered in time, and they arise widely in medicine, finance, monitoring, and other fields. Conventional classification algorithms achieve unsatisfactory accuracy on time series and are not interpretable, whereas a shapelet, the most discriminative contiguous subsequence of a time series, is interpretable. Shapelet-based classification has therefore become one of the active topics in time series classification research. This paper first divides existing shapelet discovery algorithms into two categories, those that discover shapelets by searching the space of candidates and those that learn shapelets by optimizing an objective function, and introduces applications of shapelets. Then, taking the object of classification as the starting point, it focuses on shapelet-based classification algorithms for univariate and multivariate time series. Finally, it points out future research directions for shapelet-based time series classification.

Key words: Shapelet, Classification, Time series, Feature extraction

CLC number: 

  • TP391
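
The abstract characterizes a shapelet as the most discriminative contiguous subsequence of a time series and names search over candidates as one of the two discovery families. The following is a minimal sketch of that search-based idea, not the method of any particular paper surveyed. It assumes equal-length univariate series, plain (unnormalized) Euclidean distance, a single fixed shapelet length, and information gain as the quality measure; all function names are hypothetical.

```python
import numpy as np


def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between `shapelet` and any
    same-length window of `series` (no z-normalization: an assumption)."""
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return float(np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min())


def entropy(labels):
    """Shannon entropy (in bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def best_split_gain(distances, labels):
    """Best information gain over all thresholds on the sorted distances."""
    order = np.argsort(distances)
    d, y = distances[order], labels[order]
    base, n, best = entropy(y), len(y), 0.0
    for i in range(1, n):
        if d[i] == d[i - 1]:
            continue  # no threshold can separate equal distances
        gain = base - (i / n) * entropy(y[:i]) - ((n - i) / n) * entropy(y[i:])
        best = max(best, gain)
    return best


def find_shapelet(X, y, length):
    """Score every window of every training series and keep the first
    candidate with the highest information gain (exhaustive search)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    best_gain, best_candidate = -1.0, None
    for series in X:
        for start in range(X.shape[1] - length + 1):
            candidate = series[start:start + length]
            dists = np.array([subsequence_distance(s, candidate) for s in X])
            gain = best_split_gain(dists, y)
            if gain > best_gain:
                best_gain, best_candidate = gain, candidate.copy()
    return best_candidate, best_gain
```

Calling `find_shapelet(X, y, length)` on a small labeled array returns the chosen subsequence and its gain. The exhaustive loop is quadratic in the number of candidates, so a sketch like this is only practical on toy data.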