Computer Science ›› 2026, Vol. 53 ›› Issue (2): 180-186. doi: 10.11896/jsjkx.250100113

• Database & Big Data & Data Science •

Time Series Forecasting Model Integrating Multi-scale Features and Attention Mechanism

PAN Jian1,2, WANG Xuhao2   

  1 Zhijiang College of Zhejiang University of Technology, Shaoxing, Zhejiang 312030, China
  2 College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
  • Received: 2025-01-17 Revised: 2025-05-06 Online: 2026-02-10
  • Corresponding author: PAN Jian (pj@zjut.edu.cn)
  • About author: PAN Jian, born in 1976, Ph.D., associate professor, master's supervisor, is a member of CCF (No.26947M). His main research interests include natural language processing, intelligent information processing and the Internet of Things.
  • Supported by:
    Exploration Project of the Natural Science Foundation of Zhejiang Province, China (LGF20F020015).

Abstract: Currently, research on time series forecasting with Transformer-based models focuses mainly on extracting global and local features from the data and on improving the attention mechanism to reduce model complexity. However, existing methods often overlook the different granularities of features that a time series exhibits at multiple scales. To address this, this paper proposes MTSformer, a time series forecasting model that integrates multi-scale features with the attention mechanism. First, the original sequence is down-sampled into subsequences at multiple scales, allowing the model to fuse feature information across scales and improving its generalization ability. Second, a multi-prediction-head structure replaces the traditional decoder, which speeds up prediction while reducing model complexity. Finally, experiments on five benchmark datasets show that, compared with existing methods, MTSformer achieves average reductions of 24.51% in MSE and 17.84% in MAE.
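The implementation is not reproduced on this page, so the following is only a minimal PyTorch sketch of the two ideas the abstract describes: down-sampling the input into sub-series at several temporal scales, and replacing the decoder with one linear prediction head per scale. Every name and design choice in it (MultiScaleForecaster, num_scales, average pooling, mean fusion) is an illustrative assumption, not the authors' MTSformer.

import torch
import torch.nn as nn

class MultiScaleForecaster(nn.Module):
    """Toy multi-scale forecaster: pool -> embed -> one head per scale."""

    def __init__(self, seq_len: int, pred_len: int,
                 d_model: int = 64, num_scales: int = 3):
        super().__init__()
        # Scale s halves the series length s times (assumes seq_len is
        # divisible by 2 ** (num_scales - 1)).
        self.pools = nn.ModuleList(
            nn.AvgPool1d(kernel_size=2 ** s, stride=2 ** s) if s else nn.Identity()
            for s in range(num_scales)
        )
        # Per-scale embedding of the (pooled) time axis; a Transformer
        # encoder over the sub-series could replace this linear map.
        self.embeds = nn.ModuleList(
            nn.Linear(seq_len // (2 ** s), d_model) for s in range(num_scales)
        )
        # One linear prediction head per scale instead of an autoregressive
        # decoder: each head maps its representation straight to the horizon.
        self.heads = nn.ModuleList(
            nn.Linear(d_model, pred_len) for _ in range(num_scales)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, n_vars); pooling acts on the last axis,
        # so move time to the end.
        x = x.transpose(1, 2)                       # (batch, n_vars, seq_len)
        per_scale = []
        for pool, embed, head in zip(self.pools, self.embeds, self.heads):
            sub = pool(x)                           # sub-series at this scale
            z = torch.relu(embed(sub))              # (batch, n_vars, d_model)
            per_scale.append(head(z))               # (batch, n_vars, pred_len)
        # Fuse the scales; a plain mean here, learned weights in practice.
        y = torch.stack(per_scale).mean(dim=0)
        return y.transpose(1, 2)                    # (batch, pred_len, n_vars)

For example, with a 96-step lookback, a 24-step horizon and 7 variables:

model = MultiScaleForecaster(seq_len=96, pred_len=24)
x = torch.randn(32, 96, 7)
assert model(x).shape == (32, 24, 7)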

Key words: Time series forecasting, Multi-scale features, Transformer, Multi-prediction head, Downsampling
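For reference, the MSE and MAE quoted above are the standard error metrics; a minimal NumPy sketch, assuming pred and true are equally shaped arrays of forecasts and ground truth:

import numpy as np

def mse(pred: np.ndarray, true: np.ndarray) -> float:
    # Mean squared error over all horizons and variables.
    return float(np.mean((pred - true) ** 2))

def mae(pred: np.ndarray, true: np.ndarray) -> float:
    # Mean absolute error over all horizons and variables.
    return float(np.mean(np.abs(pred - true)))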

CLC number: TP391