Computer Science, 2019, Vol. 46, Issue (11): 251-259. doi: 10.11896/jsjkx.191100505C

• Graphics, Image & Pattern Recognition •

Time Series Motif Discovery Algorithm of Variable Length Based on Domain Preference

WANG Yi-bo1,2, PENG Guang-ju1,2, HE Yuan-duo1,2, WANG Ya-sha1,3, ZHAO Jun-feng1,2, WANG Jiang-tao1,2   

  1. Key Lab of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871, China
  2. School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
  3. National Engineering Research Center for Software Engineering, Peking University, Beijing 100871, China
  • Received: 2018-10-03 Online: 2019-11-15 Published: 2019-11-14

Abstract: With the development of ubiquitous computing, more and more sensors are embedded in everyday applications, and the demand for time series data processing has grown accordingly. A pattern that appears several times in a time series is called a time series motif. Motifs carry a large share of the information in time series data, and motif discovery is one of the most important tasks in motif analysis. State-of-the-art motif discovery algorithms cannot take domain knowledge into account when selecting motifs, so they often fail to find the most valuable ones. To address this problem, this paper uses a domain distance to evaluate the similarity of subsequences based on domain knowledge. Using this new distance, the paper develops a branching method to discover motifs of variable length. Several real-world datasets are used to test the performance of the algorithm. The results show that the proposed algorithm can accurately find motifs that reflect domain knowledge.
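
To make the notion of a motif concrete, the following is a minimal sketch of the standard brute-force baseline for fixed-length motif pair discovery. It is not the algorithm proposed in this paper. The function names (`znorm`, `euclidean`, `find_motif_pair`) and the `dist` parameter are hypothetical, introduced only for illustration. The abstract does not define the domain distance, so the sketch defaults to z-normalized Euclidean distance and only marks, through `dist`, the place where a domain-specific distance could be substituted.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_motif_pair(series, m, dist=euclidean):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-m subsequences, compared after z-normalization.

    `dist` is a hypothetical hook (an assumption of this sketch): a
    domain-specific distance could be passed in place of Euclidean distance.
    Returns (inf, -1, -1) if the series is too short to hold two windows.
    """
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (math.inf, -1, -1)
    for i in range(len(windows)):
        # Starting j at i + m skips overlapping (trivial) matches.
        for j in range(i + m, len(windows)):
            d = dist(windows[i], windows[j])
            if d < best[0]:
                best = (d, i, j)
    return best


# The pattern [0, 1, 3, 1, 0] is planted at indices 0 and 10.
series = [0, 1, 3, 1, 0, 5, 2, 7, 4, 9, 0, 1, 3, 1, 0, 6]
motif = find_motif_pair(series, 5)
```

On this toy series the search returns the planted pair at indices 0 and 10. The quadratic scan is practical only for short series, and it fixes the motif length `m` in advance, which is the restriction a variable-length method is meant to remove.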

Key words: Domain knowledge, Motif, Motif example, Time series data, Variable time window

CLC Number: TP274