Computer Science, 2021, Vol. 48, Issue (6): 71-78. doi: 10.11896/jsjkx.200500044

• Database & Big Data & Data Science •

Improved KNN Time Series Analysis Method

HUANG Ming (1,2), SUN Lin-fu (1,2), REN Chun-hua (1,2), WU Qi-shi (1,3)

  1. School of Information Science and Technology, Southwestern Jiaotong University, Chengdu 611756, China
  2. Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, Southwestern Jiaotong University, Chengdu 610031, China
  3. Big Data Center, New Jersey Institute of Technology, Newark, State of New Jersey 07102, USA
  • Received: 2020-05-12 Revised: 2020-08-14 Online: 2021-06-15 Published: 2021-06-03
  • About author: HUANG Ming, born in 1996, master. His main research interests include machine learning, data mining and time series analysis. (1269662102@qq.com)
    SUN Lin-fu, born in 1963, Ph.D., professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include cloud platform technology, manufacturing industry chain collaboration technology and manufacturing industry data mining.
  • Supported by: National Key Research and Development Program of China (2017YFB1401400, 2017YFB1401401).

Abstract: With the rise of data mining and machine learning, research on time series analysis has grown rapidly. KNN (K-Nearest Neighbor), a classic machine learning method, is widely used in time series analysis because of its simplicity and high prediction accuracy. However, the original KNN algorithm has limitations in time series prediction: using Euclidean distance directly as the similarity measure gives unsatisfactory results, and the algorithm cannot adapt to time series with an overall trend. This paper proposes an improved KNN algorithm named TSTF-KNN (Time Series Trend Fitting KNN). It improves the KNN similarity measure by normalizing the feature sequence at each moment, so that similar feature sequences can be found more effectively. In addition, error terms are added to the prediction result to adjust it. To verify the method, 4 public data sets from Kaggle are selected and preprocessed to obtain 5 time series for the experiments. TSTF-KNN, KNN, a single-layer LSTM (Long Short-Term Memory) neural network and an ANN (Artificial Neural Network) are then used to predict the 5 processed time series, and the prediction results are analyzed and compared by mean square error (MSE). The experimental results show that the method effectively improves the accuracy and stability of KNN regression for time series prediction, and that it adapts better to time series with overall trend changes.
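The limitation described in the abstract can be illustrated with a small sketch. This is not the TSTF-KNN algorithm itself: the abstract does not specify the normalization or the error terms, so the code below assumes one simple normalization (each window is shifted by its own last value) and omits the error terms entirely. The function names `make_windows`, `knn_forecast` and `mse`, and the parameters `m` (window length) and `k` (number of neighbours), are hypothetical names chosen for this example.

```python
import math


def make_windows(series, m):
    """Return (window, next_value) pairs for every length-m window that has a successor."""
    return [(series[i:i + m], series[i + m]) for i in range(len(series) - m)]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_forecast(series, m, k, normalize=False):
    """One-step-ahead forecast from the k historical windows most similar to the last m values."""
    query = series[-m:]
    pairs = make_windows(series, m)
    if normalize:
        # ASSUMPTION: shift every window by its own last value so that windows
        # with the same shape match regardless of their absolute level.
        q = [x - query[-1] for x in query]
        candidates = [([x - w[-1] for x in w], nxt - w[-1]) for w, nxt in pairs]
        offset = query[-1]
    else:
        q = query
        candidates = pairs
        offset = 0.0
    nearest = sorted(candidates, key=lambda c: euclidean(c[0], q))[:k]
    return offset + sum(target for _, target in nearest) / len(nearest)


def mse(actual, predicted):
    """Mean square error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# A series with a pure upward trend: 0, 1, ..., 19 (the true next value is 20).
trend = [float(i) for i in range(20)]
plain = knn_forecast(trend, m=3, k=2)                    # 18.5
shifted = knn_forecast(trend, m=3, k=2, normalize=True)  # 20.0
```

On this trending series, plain KNN can only average values it has already seen, so its forecast lags behind the trend, while the shifted variant recovers the next value. Whether TSTF-KNN uses this particular normalization is not stated in the abstract.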

Key words: Error terms, KNN, Prediction, Similarity measure, Time series analysis

CLC Number: TP391