Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 221100144-8. doi: 10.11896/jsjkx.221100144

• Big Data & Data Science •

Multivariate Time Series Forecasting Method Based on Feature Re-Abstraction (FRA)

WANG Hao, ZHOU Jiantao, HAO Xinyu, WANG Feiyu

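  1. College of Computer Science, Inner Mongolia University, Hohhot 010021, China
    National & Local Joint Engineering Research Center of Intelligent Information Processing Technology for Mongolian, Hohhot 010021, China
    Engineering Research Center of Ecological Big Data, Ministry of Education, Hohhot 010021, China
    Inner Mongolia Engineering Laboratory for Cloud Computing and Service Software, Hohhot 010021, China
    Inner Mongolia Key Laboratory of Social Computing and Data Processing, Hohhot 010021, China
    Inner Mongolia Engineering Laboratory for Big Data Analysis Technology, Hohhot 010021, China
    Inner Mongolia Key Laboratory of Discipline Inspection and Supervision Big Data, Hohhot 010021, China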
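  • Published: 2023-11-09
  • Corresponding author: ZHOU Jiantao (cszjtao@imu.edu.cn)
  • About the author: (2892733460@qq.com)
  • Supported by:
    National Natural Science Foundation of China (62162046); Inner Mongolia Science and Technology Research Project (2021GG0155); Major Program of Inner Mongolia Natural Science Foundation (2019ZD15); Inner Mongolia Natural Science Foundation (2019GG372); Inner Mongolia University/Inner Mongolia Autonomous Region Graduate Scientific Research Innovation Project (11200-121024)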

Multivariate Time Series Forecasting Method Based on FRA

WANG Hao, ZHOU Jiantao, HAO Xinyu, WANG Feiyu   

  1. College of Computer Science, Inner Mongolia University, Hohhot 010021, China
    National & Local Joint Engineering Research Center of Intelligent Information Processing Technology for Mongolian, Hohhot 010021, China
    Engineering Research Center of Ecological Big Data, Ministry of Education, Hohhot 010021, China
    Inner Mongolia Engineering Laboratory for Cloud Computing and Service Software, Hohhot 010021, China
    Inner Mongolia Key Laboratory of Social Computing and Data Processing, Hohhot 010021, China
    Inner Mongolia Engineering Laboratory for Big Data Analysis Technology, Hohhot 010021, China
    Inner Mongolia Key Laboratory of Discipline Inspection and Supervision Big Data, Hohhot 010021, China
    Inner Mongolia Big Data Analysis Technology Engineering Laboratory, Hohhot 010021, China
  • Published: 2023-11-09
  • About author: WANG Hao, born in 1998, postgraduate. His main research interests include big data mining and intelligent analysis technology.
    ZHOU Jiantao, born in 1974, Ph.D., professor, Ph.D. supervisor, is a member of the China Computer Federation. Her main research interests include cloud computing and software engineering.
  • Supported by:
    National Natural Science Foundation of China (62162046), Inner Mongolia Science and Technology Research Project (2021GG0155), Major Programs of Inner Mongolia Natural Science Foundation (2019ZD15), Inner Mongolia Natural Science Foundation (2019GG372) and Inner Mongolia University/Inner Mongolia Autonomous Region Graduate Scientific Research Innovation Project (11200-121024).

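Abstract: Industries derived from science and technology are generally subject to strong time constraints and have therefore accumulated massive volumes of high-dimensional time series data. This data pressure leaves traditional modeling and forecasting methods constrained by data scale and attribute dimensionality. Supporting high-quality services places higher demands on intelligent big-data forecasting, and how to improve forecasting performance at the data level is the main problem to be solved at this stage. To address this problem, a Feature Re-Abstraction (FRA) algorithm for multivariate time series data is proposed. First, the RobustSTL decomposition algorithm extracts Trend and Seasonality Features (TSFs), giving a second-order abstraction of the features of multivariate data and replacing the traditional “labels as features” extraction strategy with “abstractions as features”. The Pearson correlation coefficient is then used to evaluate the strength of the correlation between the TSFs captured by re-abstraction and the target parameters, confirming the data value of the TSFs. Building on FRA, a data-driven multivariate time series forecasting algorithm is constructed by combining it with deep learning models, and the effectiveness of FRA is verified through the forecasting results. Experimental results show that using TSFs as the training vectors of a data-driven model combines dimensionality reduction, noise reduction and the preservation of strongly correlated characteristics, thereby avoiding overfitting, alleviating underfitting, and improving the accuracy and robustness of time series forecasting algorithms.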

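Keywords: Multivariate time series data, Multivariate time series forecasting algorithm, Feature re-abstraction, Trend and seasonality features, Correlation evaluation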
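
For reference, the two computations named in the abstract can be written out. The first line is the additive model used by seasonal-trend decomposition methods such as RobustSTL (observation = trend + seasonality + remainder); the second is the standard Pearson correlation coefficient. The symbols $y_t$, $\tau_t$, $s_t$, $r_t$, $f_t$ (a feature series) and $z_t$ (the target series) are notation chosen here for illustration and are not taken from the paper.

```latex
y_t = \tau_t + s_t + r_t, \qquad t = 1, \dots, N

\rho(f, z) = \frac{\sum_{t=1}^{N} (f_t - \bar{f})(z_t - \bar{z})}
                  {\sqrt{\sum_{t=1}^{N} (f_t - \bar{f})^2}\,\sqrt{\sum_{t=1}^{N} (z_t - \bar{z})^2}}
```

Under this reading, a feature series $f$ would be an extracted $\tau$ or $s$ component of one input variable, and $|\rho|$ close to 1 would indicate a strong linear relationship with the target.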

Abstract: Derivative industries in science and technology have accumulated large amounts of high-dimensional time series data because strong time constraints are common in these fields. This data pressure leaves traditional data modeling and prediction methods limited by data scale and attribute dimensionality. Supporting high-quality services places higher requirements on intelligent big-data prediction technology, and improving prediction performance at the data level is the main problem that urgently needs to be solved at this stage. To address these problems, a feature re-abstraction (FRA) algorithm for multivariate time series data is proposed. First, the RobustSTL decomposition algorithm is used to extract trend and seasonality features (TSFs), achieving a second-order abstraction of the features of multivariate data and replacing the traditional “labels as features” extraction strategy with “abstractions as features”. Then, the strength of the correlation between the TSFs captured by re-abstraction and the target parameters is evaluated with the Pearson correlation coefficient, which confirms the data value of the TSFs. On the basis of the FRA algorithm, a data-driven multivariate time series prediction algorithm is constructed by combining FRA with a deep learning model, and the effectiveness of FRA is verified through its prediction results. Experimental results show that using TSFs as the training vectors of the data-driven model achieves dimensionality reduction and noise reduction while preserving strongly correlated characteristics, which avoids model overfitting, alleviates model underfitting, and improves the accuracy and robustness of time series prediction algorithms.

Key words: Multivariate time series data, Multivariate time series forecasting algorithms, Feature re-abstraction(FRA), Trend and seasonality feature(TSF), Correlation assessment
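
As a rough illustration of the pipeline the abstract describes (decompose each variable, then score the extracted features against the target with the Pearson coefficient), the sketch below may help. It does not implement RobustSTL: a plain centered moving average and per-phase averaging stand in for it. The synthetic series, the period, the noise levels, and every function name are hypothetical choices made for this example, not the authors' code.

```python
import math
import random


def moving_average_trend(series, window):
    """Centered moving average; the window shrinks near the two ends."""
    half = window // 2
    trend = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    return trend


def seasonal_component(detrended, period):
    """Mean of the detrended values at each phase, centered so that one
    full period sums to zero. Assumes len(detrended) >= period."""
    phase_means = [
        sum(detrended[p::period]) / len(detrended[p::period])
        for p in range(period)
    ]
    offset = sum(phase_means) / period
    return [phase_means[t % period] - offset for t in range(len(detrended))]


def extract_tsfs(series, period):
    """Stand-in decomposition (NOT RobustSTL): returns (trend, seasonal)."""
    trend = moving_average_trend(series, period)
    detrended = [v - t for v, t in zip(series, trend)]
    return trend, seasonal_component(detrended, period)


def pearson(x, y):
    """Pearson coefficient of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical data: one noisy input variable and a cleaner target that
# share the same underlying trend and seasonality.
random.seed(0)
n, period = 240, 12
structure = [0.01 * t + math.sin(2 * math.pi * t / period) for t in range(n)]
noisy_input = [s + random.gauss(0, 1.0) for s in structure]
target = [s + random.gauss(0, 0.1) for s in structure]

trend, seasonal = extract_tsfs(noisy_input, period)
combined = [a + b for a, b in zip(trend, seasonal)]

# Score the raw variable and each extracted feature against the target.
scores = {
    "raw": pearson(noisy_input, target),
    "trend": pearson(trend, target),
    "seasonal": pearson(seasonal, target),
    "trend+seasonal": pearson(combined, target),
}
for name, r in scores.items():
    print(f"{name:>15}: r = {r:.3f}")
```

With this setup the combined trend-plus-seasonal feature should track the target more closely than the raw noisy input does, which is the kind of evidence the Pearson step is meant to provide. In practice a robust decomposition such as RobustSTL would take the place of `extract_tsfs`.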

CLC Number:

  • TP311.1