Computer Science (计算机科学) ›› 2019, Vol. 46 ›› Issue (5): 157-162. doi: 10.11896/j.issn.1002-137X.2019.05.024

• Artificial Intelligence •

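Periodic Time Series Prediction Method Based on DTW Similarity Determination

LI Wen-hai1,2, CHENG Jia-yu2, XIE Chen-yang2

  (State Key Laboratory of Software Engineering, Wuhan 430072, China)1
  (School of Computer Science, Wuhan University, Wuhan 430072, China)2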
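  • Received: 2018-01-16 Revised: 2018-04-15 Published: 2019-05-15
  • About the authors: LI Wen-hai (born 1979), male, Ph.D., associate professor; main research interests: databases, parallel computing and knowledge discovery; E-mail: lwhaymail@zlcn.com (corresponding author). CHENG Jia-yu (born 1992), female, master's student; main research interest: data mining. XIE Chen-yang (born 1994), female, master's student; main research interests: text processing and topic models.
  • Funding:
    Supported by the National Natural Science Foundation of China (61572373, 61472290, 60903035) and the National Key Research and Development Program of China (2017YFC08038).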

Prediction Method of Cyclic Time Series Based on DTW Similarity

LI Wen-hai1,2, CHENG Jia-yu2, XIE Chen-yang2   

  (State Key Laboratory of Software Engineering of China, Wuhan 430072, China)1
  (School of Computer Science, Wuhan University, Wuhan 430072, China)2
  • Received:2018-01-16 Revised:2018-04-15 Published:2019-05-15

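Abstract: To address the prediction of periodic time series with large samples, this paper proposes a similar-sample measurement method based on the DTW distance. First, the periodic time series prediction problem is defined, and the effect of a large number of noise points on the prediction error is analyzed on the basis of support vector regression. Then, a similarity measure is constructed by segmenting the time series by period, and, for a given prediction sample size, the subset of samples similar to the given prediction condition is determined. In addition, the kernel function of the SVM is adjusted with an error tuning function to further improve prediction accuracy. Finally, on commonly used periodic time series, the proposed method is compared experimentally with existing algorithms in terms of prediction accuracy, and the parameter sensitivity of the model is analyzed. The experimental results verify the effectiveness of the proposed method.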
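The abstract does not give the exact similarity measure, so the following is only a minimal sketch of the general idea: cut a series into period-length segments, score each segment against a query with the textbook DTW recurrence, and keep the closest ones as the training subset. The function names (`dtw_distance`, `split_into_periods`, `most_similar_segments`), the absolute-difference cost and the fixed period length are assumptions made for illustration, not details taken from the paper.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with an absolute-difference cost.

    Runs in O(len(a) * len(b)) time; no warping window is applied.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] matched again with an earlier b
                cost[i][j - 1],      # b[j-1] matched again with an earlier a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def split_into_periods(series, period):
    """Cut a series into consecutive, non-overlapping segments of one period.

    A trailing partial period is dropped (an assumption of this sketch).
    """
    return [series[s:s + period]
            for s in range(0, len(series) - period + 1, period)]


def most_similar_segments(segments, query, k):
    """Return the k (distance, index) pairs closest to the query under DTW."""
    scored = sorted((dtw_distance(seg, query), idx)
                    for idx, seg in enumerate(segments))
    return scored[:k]


if __name__ == "__main__":
    series = [0, 1, 2, 1, 0, 1, 2, 1, 5, 5, 5, 5, 0, 1, 2, 1, 9]
    segments = split_into_periods(series, period=4)
    print(most_similar_segments(segments, query=[0, 1, 1, 2, 1], k=2))
```

Because DTW allows one point to align with several points of the other sequence, the query may differ in length from the stored segments, which is why the example query has five points while each segment has four.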

Keywords: DTW, time series, data mining, similarity, prediction, support vector machine

Abstract: This paper presents a DTW distance-based sampling framework that improves the accuracy of cyclic time series prediction on large-scale datasets. It addresses the problem of identifying noise for each given prediction condition, and formalizes the impact of noise on an SVR-based prediction method. On top of the DTW-based similarity measure, the paper presents an end-to-end identification method that improves the quality of the training set. It also introduces a regularized function into the kernel function of SVR, so that the generalization error can be reduced according to the distance between each training instance and the prediction condition. Experiments on a series of widely adopted cyclic time series evaluate the precision and stability of the proposed method. The results show that, owing to the higher-quality training instances and the weighted regularization strategy, the proposed method clearly outperforms its competitors on most of the datasets.
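The abstract does not state the form of the regularized function, so the sketch below uses a Gaussian decay purely as a stand-in to show how distances to the prediction condition could be turned into per-instance weights. The names `similarity_weights` and `weighted_squared_error` and the `bandwidth` parameter are hypothetical, and the weighted squared error is only a simple illustration of how such weights change an error measure; it is not the SVR objective used in the paper.

```python
from math import exp


def similarity_weights(distances, bandwidth=1.0):
    """Map distances to weights in (0, 1]; closer instances get larger weights.

    The Gaussian form exp(-(d / bandwidth)^2) is an assumption of this sketch.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return [exp(-(d / bandwidth) ** 2) for d in distances]


def weighted_squared_error(y_true, y_pred, weights):
    """Weighted mean of squared errors, normalized by the total weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not all be zero")
    return sum(w * (t - p) ** 2
               for w, t, p in zip(weights, y_true, y_pred)) / total


if __name__ == "__main__":
    # Hypothetical DTW distances of three training instances to the query.
    weights = similarity_weights([0.0, 0.5, 3.0], bandwidth=1.0)
    print(weights)
    print(weighted_squared_error([1.0, 2.0, 3.0], [1.0, 2.5, 6.0], weights))
```

In practice, weights of this kind could be supplied as per-sample weights to an existing SVR implementation instead of being applied to a hand-written loss.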

Key words: Data mining, DTW, Prediction, Similarity, Support vector machine, Time series

CLC Number: TP391.4