计算机科学 ›› 2024, Vol. 51 ›› Issue (10): 247-260.doi: 10.11896/jsjkx.230800146
王少鹏1,2,3,4, 冯淳恺1
WANG Shaopeng1,2,3,4, FENG Chunkai1
摘要: 挖掘时间序列中的全链集是一个新兴领域。据了解,当前并无多尺度最近时间序列的全链集挖掘算法存在。对多尺度最近时间序列下全序链集的挖掘问题进行研究,在现有LRSTOMP和ALLC算法的基础上提出了一种具有增量计算特性的挖掘算法MTSC(Mining Time Series All-Chain Sets over Multi-scale Nearest Time Series,MTSC)。该算法依次使用LRSTOMP与ALLC算法对第一个最近时间序列成员内容进行处理,得到该成员上的全序链集挖掘结果,同时保留该成员相关的PL和PR结构。从第二个最近时间序列成员开始,MTSC算法中的LRSTOMP过程只需要处理当前最近时间序列成员相对于前一个最近时间序列成员的新增部分,进一步结合前一个最近时间序列成员上的PL和PR,可以增量获得当前最近时间序列成员上的PL和PR结构,在此基础上使用ALLC算法得到该成员上的全序链集挖掘结果。相较于对每一个最近时间序列成员内容都使用LRSTOMP和ALLC算法处理的Naive方式,MTSC算法利用增量计算的思想,避免了对全部数据进行重复性计算,从而加快了算法的执行速度,具有更高的时间效率。基于公有数据样本Penguin和TiltABP的仿真实验验证了该算法的有效性,实验结果表明其性能与Naive算法完全一致,且对于以上数据样本,在空间开销增加1.1%~9.7%的情况下,可以实现时间效率80%~88.3%的提升。
中图分类号:
[1]LI Z X,LIU H Y.Combining Global and Sequential Patterns for Multivariate Time Series Forecasting [J].Chinese Journal of Computers,2023,46(1):70-84. [2]PATEL P,KEOGH E,LIN J,et al.Mining Motifs in Massive Time Series Databases[C]//International Conference on Data Mining.IEEE Computer Society,2002:370-377. [3]MORI U,MENDIBURU A,KEOGH E,et al.Reliable EarlyClassification of Time Series Based on Discriminating the Classes over time[J].Data Mining & Knowledge Discovery,2016,31(1):1-31. [4]DAU H A,BEGUM N,KEOGH E.Semi-Supervision Dramatically Improves Time Series Clustering under Dynamic Time Warping[C]//International Conference on Information and Know-ledge Management.Association for Computing Machinery,2016:999-1008. [5]HAO M C,MARWAH M,JANETZKO H,et al.Visual Explo-ration of Frequent Patterns in Multivariate Time Series[J].Information Visualization,2012,11(1):71-83. [6]SHOKOOHI-YEKTA M,CHEN Y P,CAMPANA B,et al.Dis-covery of Meaningful Rules in Time Series[C]//International Conference on Knowledge Discovery & Data Mining.Association for Computing Machinery,2015:1085-1094. [7]ZHU Y,IMAMURA M,NIKOVSKI D,et al.Matrix ProfileVII:Time Series Chains:A New Primitive for Time Series Data Mining[C]//International Conference on Data Mining.IEEE,2017:695-704. [8]HELDT T,OEFINGER M B,HOSHIYAMA M,et al.Circulatory Response to Passive and Active Changes in Posture[C]//Computers in Cardiology.IEEE,2003:263-266. [9]ZHU Y,IMAMURA M,NIKOVSKI D.et al.Introducing Time Series Chains:A New Primitive for Time Series Data Mining[J].Knowledge and Information Systems,2019,60(2):1135-1161. [10]TIAN X Z,HU A GU S Y.Research on Stock Market Prediction Method Based on Time Series Chain[J].Journal of Zhejiang University of Technology,2021,49(5):503-510,563. [11]ZHENG X W,ZOU X N,REN X X,et al.ECG Prediction Based on Bidirectional Time Series Chain Discovery Algorithm[C]//International Conference on Crowd Science and Engineering.Association for Computing Machinery,2021:95-102. [12]ZHANG L,PATEL N,LI X Q,et al.Joint Time Series Chain:Detecting Unusual Evolving Trend Across Time Series[C]//2022 SIAM International Conference on Data Mining.2022:208-216. [13]IMAMURA M,NAKAMURA T,KEOGH E.MatrixPro-fileXXI:A Geometric Approach to Time Series Chains Improves Robustness[C]//The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining(KDD'20).Association for Computing Machinery,2020:1114-1122. [14]ZHU Y,ZIMMERMAN Z,SENOBARI N S,et al.Matrix Profile II:Exploiting a Novel Algorithm and GPUs to Break the One Hundred Million Barrier for Time Series Motifs and Joins[C]//IEEE International Conference on Data Mining.IEEE,2016:739-748. [15]WANG S P,YUAN Y,LI H.Discovering All-Chain Set inStreaming Time Series[C]//The 23rd Pacific-Asia Conference on Knowledge Discovery and Data Mining.Springer Verlag,2019:306-318. |
|