Computer Science (计算机科学), 2024, Vol. 51, Issue (2): 36-46. doi: 10.11896/jsjkx.230100135
乔帆1, 王鹏2, 汪卫2
QIAO Fan1, WANG Peng2, WANG Wei2
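Abstract: With the arrival of the big-data era and advances in sensor technology, multivariate time series classification has become an important problem in data mining. Multivariate time series are high-dimensional, have complex relationships between dimensions, and vary widely in shape, which produces an enormous feature space. Existing methods struggle to select discriminative features, so their accuracy is generally low; their classification results are also hard to interpret. To address these problems, this paper proposes a multivariate time series classification algorithm based on heterogeneous feature fusion. The algorithm fuses three types of features (time-domain features, frequency-domain features, and interval statistics) and clusters them to find the most representative ones. It first extracts representative features of each type for every dimension, then fuses the features of all types across all dimensions through a multi-dimensional feature transformation to form a feature vector, on which a classification model is trained. To improve interpretability, the algorithm generates candidate feature sets of each type based on a tree structure, then aggregates them to eliminate redundant and similar features, leaving a small number of representative features. Extensive experiments on the public UEA datasets were conducted to verify the effectiveness of the algorithm. The results show that the proposed algorithm outperforms existing methods in accuracy, in the soundness of its feature fusion, and in the interpretability of its classification results.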
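The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the concrete features, so the piecewise means (time domain), DFT magnitudes (frequency domain), and mean/standard deviation/slope (interval statistics) used here are assumptions, as are all function and parameter names. The tree-based candidate generation and feature clustering are omitted.

```python
import cmath
import statistics


def time_domain_features(x, n_segments=4):
    """Mean of each of n_segments roughly equal-width chunks (assumed stand-in)."""
    n = len(x)
    bounds = [round(i * n / n_segments) for i in range(n_segments + 1)]
    return [statistics.fmean(x[a:b]) for a, b in zip(bounds, bounds[1:])]


def frequency_domain_features(x, n_coeffs=4):
    """Magnitudes of the first n_coeffs DFT coefficients (naive O(n^2) DFT)."""
    n = len(x)
    return [
        abs(sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(x)))
        for k in range(n_coeffs)
    ]


def interval_statistics(x, start, end):
    """Mean, population standard deviation, and least-squares slope of x[start:end]."""
    seg = x[start:end]
    m = len(seg)  # needs at least two points for a slope
    mean = statistics.fmean(seg)
    std = statistics.pstdev(seg)
    t_mean = (m - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(m))
    slope = sum((t - t_mean) * (v - mean) for t, v in enumerate(seg)) / denom
    return [mean, std, slope]


def fuse_features(series, n_segments=4, n_coeffs=4, interval=(0, 8)):
    """Concatenate the three feature types over every dimension of one sample.

    series: list of dimensions, each a list of floats of equal length.
    """
    vector = []
    for x in series:
        vector.extend(time_domain_features(x, n_segments))
        vector.extend(frequency_domain_features(x, n_coeffs))
        vector.extend(interval_statistics(x, *interval))
    return vector


# Hypothetical two-dimensional sample: a ramp and a constant signal.
sample = [[0, 1, 2, 3, 4, 5, 6, 7], [1, 1, 1, 1, 1, 1, 1, 1]]
fused = fuse_features(sample)
```

Each sample becomes one fixed-length vector (here 2 dimensions × 11 features), which could then be passed to any standard classifier. In the paper's setting, a selection step would first reduce the candidates to a small set of representative features.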