Computer Science, 2018, Vol. 45, Issue (6): 222-227. doi: 10.11896/j.issn.1002-137X.2018.06.040

• Artificial Intelligence •

Short-term Traffic Flow Prediction Model Based on Gradient Boosting Regression Tree

SHEN Xia-jiong1,2, ZHANG Jun-tao2, HAN Dao-jun1,2

  1. Institute of Data and Knowledge Engineering, Henan University, Kaifeng, Henan 475004, China
  2. School of Computer and Information Engineering, Henan University, Kaifeng, Henan 475004, China
  • Received: 2017-04-12 Online: 2018-06-15 Published: 2018-07-24
  • About the authors: SHEN Xia-jiong (born 1963), male, Ph.D., professor, CCF member; research interests: spatial data processing and formal concept analysis; E-mail: 77230497@qq.com. ZHANG Jun-tao (born 1989), male, master's student; research interests: spatial data processing, data mining and analysis; E-mail: zhangjuntao1990@126.com (corresponding author). HAN Dao-jun (born 1979), male, Ph.D., associate professor, CCF member; research interests: spatial data processing and information security.
  • Supported by:
    the National Natural Science Foundation of China (61272545, 61402149), the Science and Technology Research Program of Henan Province (142102210390), the Science and Technology Research Program of the Education Department of Henan Province (14A520026), and the Postdoctoral Research Project of Henan Province (2015036).

Abstract: Short-term traffic flow prediction is an important part of traffic flow modeling and plays an important role in urban road traffic management and control. However, the prediction accuracy of common time series models (e.g., ARIMA) and of the random forest (RF) model is limited by the residuals produced when the model is built and by the choice of input variables. To address this problem, a short-term traffic flow prediction model based on the gradient boosting regression tree (GBRT) is proposed to predict traffic speed. First, the model introduces the Huber loss function to handle the model residuals. Second, the input variables account for the spatial and temporal correlations between the predicted road section and its adjacent sections. During training, the model corrects its residuals by continually adjusting the weights of the weak learners, which improves prediction accuracy. Experiments were conducted on traffic speed data from an urban expressway, and the proposed model was compared with the ARIMA and random forest models using MSE, MAPE, and other indicators. The results show that the proposed model achieves the best prediction accuracy, which verifies its effectiveness for short-term traffic flow prediction.
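For reference, the Huber loss named in the abstract is usually written as follows. This is the standard textbook definition, not a formula quoted from the paper. The symbols y (observed speed), F(x) (model prediction), and the threshold δ are notation assumed here, and the abstract does not state how δ is chosen.

```latex
L_\delta\bigl(y, F(x)\bigr) =
\begin{cases}
\tfrac{1}{2}\,\bigl(y - F(x)\bigr)^{2}, & \lvert y - F(x)\rvert \le \delta,\\[4pt]
\delta\,\bigl(\lvert y - F(x)\rvert - \tfrac{1}{2}\,\delta\bigr), & \text{otherwise.}
\end{cases}
```

The loss is quadratic for small residuals and linear for large ones, so outlying observations influence the fit less than they would under squared error.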

Key words: Gradient boosting regression tree, Loss function, Short-term traffic flow prediction, Spatial-temporal correlations

CLC Number: TP181
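The pipeline summarized in the abstract (lagged speeds from the predicted section and its neighbours as inputs, boosting on Huber-loss residuals, evaluation with MSE and MAPE) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the synthetic speed series, the use of depth-1 trees (stumps), and the values of `n_lags`, `n_rounds`, `learning_rate`, and `delta` are all assumptions made for illustration. Each stump's leaf value is simply the mean clipped residual, which is a simplification of the usual per-leaf line search.

```python
from statistics import median

def make_features(target, upstream, downstream, n_lags=2):
    """Build rows of lagged speeds from the target section and two neighbours."""
    X, y = [], []
    for t in range(n_lags, len(target)):
        row = []
        for series in (target, upstream, downstream):
            row.extend(series[t - n_lags:t])  # temporal lags of each section
        X.append(row)
        y.append(target[t])
    return X, y

def huber_gradient(residual, delta):
    """Negative gradient of the Huber loss: the residual clipped to [-delta, delta]."""
    return max(-delta, min(delta, residual))

def fit_stump(X, targets):
    """Depth-1 regression tree: the (feature, threshold) split with least squared error."""
    n, best = len(X), None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left = [targets[i] for i in range(n) if X[i][j] <= thr]
            right = [targets[i] for i in range(n) if X[i][j] > thr]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    if best is None:  # no valid split: fall back to a constant prediction
        m = sum(targets) / n
        return (0, float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value

def fit_gbrt(X, y, n_rounds=50, learning_rate=0.1, delta=5.0):
    """Each round fits a stump to the Huber pseudo-residuals of the current model."""
    init = median(y)
    preds = [init] * len(y)
    stumps = []
    for _ in range(n_rounds):
        pseudo = [huber_gradient(yi - pi, delta) for yi, pi in zip(y, preds)]
        stump = fit_stump(X, pseudo)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return init, learning_rate, stumps

def predict_gbrt(model, row):
    init, learning_rate, stumps = model
    return init + learning_rate * sum(predict_stump(s, row) for s in stumps)

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic speeds (km/h), invented for this sketch: the target section
# repeats the upstream section's speed one time step later.
upstream = [60.0, 60.0, 40.0, 40.0] * 6
target = [60.0] + upstream[:-1]
downstream = [50.0] * len(upstream)

X, y = make_features(target, upstream, downstream)
model = fit_gbrt(X, y)
fitted = [predict_gbrt(model, row) for row in X]
baseline = [median(y)] * len(y)
print("GBRT     MSE / MAPE:", mse(y, fitted), mape(y, fitted))
print("Baseline MSE / MAPE:", mse(y, baseline), mape(y, baseline))
```

The sketch evaluates on the training data only, which is enough to show the mechanics. A real comparison against ARIMA or a random forest would need a held-out test period.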