Computer Science, 2021, Vol. 48, Issue (6A): 184-189. doi: 10.11896/jsjkx.200600090

• Big Data & Data Science •

A Kind of High-precision LSTM-FC Atmospheric Contaminant Concentrations Forecasting Model

LIU Meng-yang1,2, WU Li-juan1,3, LIANG Hui1,3, DUAN Xu-lei1,3, LIU Shang-qing1,3, GAO Yi-bo1,3

    1 Tianjin Intelligent Tech Institute of CASIA, Tianjin 300300, China
    2 School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
    3 Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Online: 2021-06-10 Published: 2021-06-17
  • Corresponding author: WU Li-juan (lijuan.wu@ia.ac.cn)
  • About author: LIU Meng-yang (mengyangliu.official@gmail.com), born in 1998, undergraduate. His main research interests include machine learning, deep learning and high performance computing.
    WU Li-juan, born in 1987, postgraduate, engineer. Her main research interests include intelligent information processing, prediction of air pollutants and big data analysis.
  • Supported by:
    Science and Technology Major Project of Cross-border Integration and Innovation of the Internet of Tianjin; Big Data Analysis Platform for Air Pollutant Monitoring (18ZXRHSF00250).

Abstract: Air pollution seriously affects people's lives and health, which makes air quality control imperative. Understanding how pollutant concentrations change, and forecasting them, is therefore of great value for guiding pollution control. This paper builds a hybrid neural network model based on a long short-term memory (LSTM) network and a fully connected (FC) network, and proposes a data-bucket training strategy to address the loss of accuracy caused by a long time interval between the training data and the data to be predicted. The model combines the strengths of the LSTM and fully connected networks, generalizes well, and produces accurate forecasts on data for multiple pollutants. The model is trained and evaluated on atmospheric pollutant data for Tianjin from 2013 to 2019. The results show that, for all six pollutants (PM2.5, PM10, NO2, SO2, O3 and CO), the hybrid model achieves R² > 0.90 with a mean percentage error below 15%, indicating that the LSTM-FC model has clear advantages and high practical value for atmospheric pollutant forecasting.

Key words: Atmospheric contaminant forecasting, Fully connected neural network, Hybrid neural network model, Long short-term memory neural network, Multi-dimension feature fusion
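
The abstract reports two accuracy figures: R² and a mean percentage error. This page does not give the paper's exact definitions, so the sketch below assumes the standard coefficient of determination and the mean absolute percentage error. The function names and the sample concentrations are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_percentage_error(
    observed: Sequence[float], predicted: Sequence[float]
) -> float:
    """Mean of |o - p| / |o| as a fraction (0.15 means 15%).

    Assumes no observed value is zero.
    """
    total = sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted))
    return total / len(observed)


# Hypothetical concentrations for illustration only (not data from the paper).
observed = [50.0, 80.0, 100.0, 120.0]
predicted = [55.0, 76.0, 95.0, 126.0]

r2 = r_squared(observed, predicted)
mape = mean_absolute_percentage_error(observed, predicted)

# Check against the thresholds quoted in the abstract (R² > 0.90, error < 15%).
meets_reported_thresholds = r2 > 0.90 and mape < 0.15
```

Under these assumptions, a forecast meets the reported level of accuracy only when both conditions hold for the pollutant in question.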

CLC Number:

  • TPX513