Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 221100186-6. doi: 10.11896/jsjkx.221100186

• Big Data & Data Science •

Prediction Method of Long Series Time Series Based on Improved Informer Model with Kernel Technique

PAN Liqun1, WU Zhonghua1, HONG Biao2

  1 School of Management, Shanghai University, Shanghai 200444, China
  2 School of International Business and Economics, Shanghai University of International Business and Economics, Shanghai 201620, China
  • Published: 2023-11-09
  • Corresponding author: PAN Liqun (panliqun61@yeah.net)
  • About author: PAN Liqun, born in 2000, postgraduate. His main research interests include deep learning and demand forecasting.

Abstract: Long sequence time-series forecasting currently relies mainly on RNN-like models, and most of them are trained with the traditional mean squared error (MSE) loss. RNN-like models, however, capture only local information, and their computational cost rises quickly as the prediction sequence lengthens. The MSE loss, in turn, cannot capture the nonlinearity that is common in long time-series data, and it is sensitive to outliers and has low robustness. This paper therefore adopts the Informer model, which is built entirely on the attention mechanism, and replaces its traditional MSE loss with a Kernel MSE loss improved by the kernel trick. The new loss addresses nonlinearity in the data by mapping the error from the original feature space to a higher-dimensional space, and its first and second derivatives ensure robustness to outliers. In a multivariate-to-multivariate forecasting setting, the improved Informer model is compared with the classical Informer model and with the RNN-like LSTM and GRU models on eight datasets drawn from three types of data. The results show that the improved Informer model achieves higher prediction accuracy, that the relative improvement in accuracy grows with the volume of the original data, and that the model is suitable for long sequence time-series forecasting.
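The abstract does not state the formula of the Kernel MSE loss. As a minimal sketch, assume a Gaussian kernel k(y, ŷ) = exp(−(y − ŷ)² / (2σ²)); because k(y, y) = 1, the squared distance between the mapped points is 2 − 2k(y, ŷ), so the loss can be written as the mean of 1 − k(y, ŷ). The kernel choice, the bandwidth `sigma`, and the function names `kernel_mse`, `kernel_mse_grad`, and `mse` are all assumptions made for illustration and are not taken from the paper.

```python
import math


def kernel_mse(y_true, y_pred, sigma=1.0):
    """Mean of 1 - k(y, y_hat) for a Gaussian kernel k (assumed form)."""
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        e = y - y_hat
        total += 1.0 - math.exp(-(e * e) / (2.0 * sigma ** 2))
    return total / len(y_true)


def kernel_mse_grad(e, sigma=1.0):
    """Derivative of 1 - exp(-e^2 / (2 sigma^2)) with respect to the error e."""
    return (e / sigma ** 2) * math.exp(-(e * e) / (2.0 * sigma ** 2))


def mse(y_true, y_pred):
    """Plain mean squared error, for comparison."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: the last prediction in y_outlier is far off.
y_true = [1.0, 2.0, 3.0, 4.0]
y_clean = [1.1, 1.9, 3.2, 3.9]
y_outlier = [1.1, 1.9, 3.2, 54.0]

for name, y_pred in [("clean", y_clean), ("outlier", y_outlier)]:
    print(name, "MSE:", mse(y_true, y_pred), "Kernel MSE:", kernel_mse(y_true, y_pred))

# The per-sample gradient shrinks toward zero for very large errors,
# whereas the MSE gradient (2 * e) keeps growing.
for e in (0.1, 1.0, 50.0):
    print("error", e, "kernel grad:", kernel_mse_grad(e), "MSE grad:", 2 * e)
```

Under these assumptions each sample contributes at most 1 to the loss, and its gradient vanishes for very large errors, which is one way to read the abstract's claim that the derivatives of the new loss provide robustness to outliers. In an actual training loop the same expression would be written with the tensor operations of the chosen deep learning framework.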

Key words: Informer, Loss function, Kernel trick, Long series time series prediction

CLC Number:

  • TP181