Computer Science, 2020, Vol. 47, Issue (6A): 444-449. doi: 10.11896/jsjkx.190700158

• Database & Big Data & Data Science •

PM2.5 Concentration Prediction Method Based on CEEMD-Pearson and Deep LSTM Hybrid Model

DING Zi-ang, LE Cao-wei, WU Ling-ling and FU Ming-lei

  1. College of Sciences, Zhejiang University of Technology, Hangzhou 310023, China
  • Published: 2020-07-07
  • Corresponding author: FU Ming-lei (fuml@zjut.edu.cn)
  • About author: 785441595@qq.com
    DING Zi-ang, born in 1995, postgraduate. His main research interests include data processing and deep learning.
    FU Ming-lei, born in 1981, Ph.D., associate professor. His main research interests include signal processing, deep learning and intelligent robots.
  • Supported by:
    This work was supported by the Special Project of "One Belt and One Road" of the Zhejiang Science and Technology Department (2015C04005).

Abstract: PM2.5 is a key indicator for measuring the concentration of air pollutants. Accurately predicting future PM2.5 concentrations by mining the time-series characteristics of historical PM2.5 data has both academic significance and practical value. However, the correlation within the raw PM2.5 concentration time series strongly affects the prediction accuracy of a model. To address this problem, this paper proposes a PM2.5 concentration prediction method based on a hybrid model that combines complementary ensemble empirical mode decomposition with Pearson correlation analysis (CEEMD-Pearson) and a deep long short-term memory (LSTM) neural network. The method first uses CEEMD to decompose the historical PM2.5 concentration data into components of different frequencies, which enhances the time-series characteristics present in the data. The Pearson correlation test is then used to screen the resulting intrinsic mode functions (IMFs), and the screened, enhanced data are fed to the input layer of a deep LSTM network with multiple hidden layers for training and prediction. Experimental data show that the CEEMD-LSTM hybrid model reaches a prediction accuracy of 80%, but converges only after about 7000 training iterations. With the secondary screening by the Pearson correlation test, the model converges after about 800 training iterations and the prediction accuracy rises to 87%. The hybrid of CEEMD-Pearson and the deep LSTM neural network trains best: it converges after about 650 training iterations and reaches a prediction accuracy of 90%. These results indicate that CEEMD can reveal hidden time-series characteristics in historical data, and that the secondary screening based on Pearson correlation analysis effectively improves both the convergence speed and the prediction accuracy of model training. The hybrid model based on CEEMD-Pearson and deep LSTM therefore achieves the best training result, the fastest convergence and the most accurate predictions, and can effectively address the PM2.5 concentration prediction problem.

Key words: CEEMD, Deep neural network, Hybrid model, LSTM, Pearson, PM2.5
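
The Pearson screening step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the screening criterion, so the rule used here (keep a component when the absolute value of its correlation with the original series reaches a cut-off) is an assumption, as are the names `pearson` and `screen_imfs`, the default cut-off of 0.3, and the toy components that stand in for real CEEMD output.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence has no linear relationship
    return cov / math.sqrt(var_x * var_y)


def screen_imfs(signal, imfs, threshold=0.3):
    """Keep components whose |r| with the original signal reaches threshold.

    `threshold` is a hypothetical cut-off, not a value taken from the paper.
    Returns (kept_components, all_correlations).
    """
    correlations = [pearson(signal, imf) for imf in imfs]
    kept = [imf for imf, r in zip(imfs, correlations) if abs(r) >= threshold]
    return kept, correlations


# Toy stand-ins for CEEMD output (assumed, not real IMFs): a slow trend,
# a 24-sample cycle, and a weak component barely related to the signal.
t = range(200)
trend = [0.05 * i for i in t]
cycle = [3.0 * math.sin(2 * math.pi * i / 24) for i in t]
weak = [0.01 * (-1) ** i for i in t]
signal = [a + b + c for a, b, c in zip(trend, cycle, weak)]

kept, correlations = screen_imfs(signal, [trend, cycle, weak])
```

In this toy setup the trend and the cycle pass the cut-off while the weak component is dropped. In a real pipeline the components would come from a CEEMD implementation, and the retained ones would form the input to the LSTM network.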

CLC Number:

  • TP183