计算机科学 ›› 2014, Vol. 41 ›› Issue (8): 125-129.doi: 10.11896/j.issn.1002-137X.2014.08.028

• 2013年全国理论计算机科学学术年会 • 上一篇    下一篇

不确定时间序列的统计降维方法

肖瑞,刘国华,陈爱东,宋转   

  1. 东华大学计算机科学与技术学院 上海201620;东华大学计算机科学与技术学院 上海201620;东华大学计算机科学与技术学院 上海201620;东华大学计算机科学与技术学院 上海201620
  • 出版日期:2018-11-14 发布日期:2018-11-14

Statistic Reduction for Uncertain Time Series

XIAO Rui,LIU Guo-hua,CHEN Ai-dong and SONG Zhuan   

  • Online:2018-11-14 Published:2018-11-14

摘要: 由于不确定时间序列的长度很长,并且每个采样点的取值具有不确定性,导致了维度灾难和庞大的可能世界集,给不确定时间序列相似性匹配带来了巨大的困难,因此对不确定时间序列降维是实现对其方便存储、快速查询和相似性匹配的首要任务。不确定时间序列普遍采用小波变换的降维方法,但是该方法没有考虑到采样点之间的相关性。为解决该问题,提出一种基于概率统计和数据相关性的降维方法,该方法将不确定时间序列分为概率维度和时间维度,并分别对两维度进行降维。在时间维度,根据采样点之间的相关性,使用某个采样点代表后续相关度高的采样点;在概率维度,使用大概率点表示相邻的小概率点。实验效果表明:使用该方法对不确定时间序列进行降维后,降维序列可以保持原序列的变化趋势,压缩程度显著,并且可近似地恢复原序列。

关键词: 时间序列,不确定性,降维,统计,相关性

Abstract: Due to the length of uncertain time series and the uncertainty of values in each sample point,time complexity is very high when matching two uncertain time series.So the dimension reduction is the primary task to match fast for uncertain series.Now,always taking wavelet transform reduces dimension for uncertain time series,but the method does not consider the correlation between every sample points.We put forward a new method based on statistics and data correlation.It divides uncertain time series to probability dimension and time dimension and performs dimension reduction respectively in the two dimensions.We used a sampling point to represent the subsequent sampling points with high correlation in time dimension,and used large probability point to represent the adjacent small probability points in pro-bability dimension.Experimental results show that the compression ratio is remarkable when using the method to reduce uncertain time series.In addition,it can approximately recover the uncertain time series with reduced outcomes.

Key words: Time series,Uncertainty,Reduction,Statistics,Correlation

[1] Agrawal R,Faloutsos C,Swami A.Efficient Simillarity Search in Sequence Databases[C]∥FODO.1993:69-84
[2] Rafiei D,Mendelzon A O.Querying Time Series Data Based on Sililarity[J].IEEE TKDE,2000,2(5):675-693
[3] Chan K-P,Fu A W-C.Efficient time series matching by wavelets[C]∥Proceedings of the 15th International C on ference on Data Engineering.Sydney,Australia,1999:126-133
[4] Popivanov I,Miller R-J.Similarity Search Over Time-Series Data Using Wavelets[C]∥Proceedings of the 18th International Conference on Data Engineering.San jose,CA,2002:212-221
[5] Chan K-P,Fu A-W,Yu C-T.Haar Wavelets for Efficient Similarity Search of Time-Series:With and Without Time Warping[J].IEEE Transactions on Knowledge and Data Engineering,2003,15(1):686-705
[6] Zhao Yu-chen,Aggarwal C-C,Yu P-S.On wavelet decomposi-tion of uncertain time series data sets[C]∥Proceedings of the 19th ACM Conference on Information and Knowledge Management.Toronto,Canada,2010:129-138
[7] Popivanov I.Similarity search over time-series data using wavelets.Data Engineering[C]∥Proceedings.18th International Conference.2002:212-221
[8] Kawagoe E,Ueda S.On the Need for Time Series Data Mining Benchmarks:A Survey and Empirical Demonstration[J].Data Mining and Knowledge Discovery,2002,7:349-371
[9] Chakrabarti K,Garofalakis M-N,Rastogi R,et al.Approximate query processing using wavelets[J].the VLDB Journal,2001,10(2/3):199-223
[10] Korn F,Jagadish H,Faloutsos C.Efficiently supporting ad hoc queries in large datasets of time sequences[C]∥Joan P,ed.Proceedings of ACM SIGMOD International Conference on Management of Data.Tuescon,AZ USA:Morgan Kaufmann Publishere,1997:286-300
[11] Faloutsos C,Ranganathan M,Manolopoulos Y.Fast Subse-quence Matching in Time Series Databases [C]∥Proceedings of the ACM SIGMOD International Conference on Management of Data.Minneapolis,1994:419-429
[12] Cormode G,Garofalakis M.Histograms and Wavelets on Probabilistic Data[J].IEEE Transactions on Knowledge and Data Engineering,2010,2(8):1142-1157
[13] Liao Kang-li,Chen Hua-hui,Qian Jiang-bo,et al.Wavelet De-composition Algorithm for Uncertain Data Streams[M].2011:965-970
[14] Vidal R,Ma Ya,Sastry S.Generalized principal component analysis (GPCA)[C]∥Proceeding of IEEE Conference on Computer Vision and Pattern Recognition.2003:18-20

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!