Computer Science ›› 2026, Vol. 53 ›› Issue (4): 155-162. doi: 10.11896/jsjkx.250600047

• Database & Big Data & Data Science •

Pre-trained Spatio-Temporal Decoupling-based Traffic Flow Prediction Model

LI Jing, DU Shengdong, SHI Haochen, HU Jie, YANG Yan, LI Tianrui

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
  • Received: 2025-06-08 Revised: 2025-09-15 Published: 2026-04-15 Online: 2026-04-08
  • Corresponding author: DU Shengdong (sddu@swjtu.edu.cn)
  • Author contact: 1322601861@my.swjtu.edu.cn
  • About author: LI Jing, born in 2000, postgraduate. Her main research interests include artificial intelligence, deep learning and traffic flow prediction.
    DU Shengdong, born in 1981, Ph.D., associate professor, Ph.D. supervisor, is a member of CCF (No. 73290M). His main research interests include artificial intelligence, machine learning and knowledge engineering.
  • Supported by:
    Major Science and Technology Special Project of Sichuan Province (2024ZDZX0012), General Program of the National Natural Science Foundation of China (62276215), and Joint Fund of the National Natural Science Foundation of China (U2468207).

Abstract: Traffic flow prediction is a core technology for dynamic decision-making in smart cities, and its accuracy is critical to traffic signal control, route planning, and emergency management. As urban road networks expand and traffic data grow rapidly, traditional methods struggle to model the complex spatio-temporal interactions among road network nodes accurately. Pre-trained models can transfer knowledge across domains, but when applied to traffic flow prediction they face two problems: a modeling bottleneck caused by coupled spatio-temporal features, and a mismatch between pre-trained representations and traffic-specific characteristics. To address these issues, this paper proposes a pre-trained spatio-temporal decoupling-based traffic flow prediction model (PT-STD). The method uses a spatio-temporal decomposition module to decouple the deep feature learning of spatial topological relationships from that of multi-granularity temporal patterns. It further introduces a hierarchical adaptive fine-tuning strategy that unfreezes the normalization layers and attention parameters of the pre-trained model in stages, gradually transferring the general knowledge learned by the pre-trained model to spatio-temporal feature modeling. Experimental results show that PT-STD achieves significant improvements on benchmark datasets and reduces the mean absolute error (MAE) by 3.89% in data-scarce scenarios.

Key words: Traffic flow prediction, Spatio-temporal decomposition, Hierarchical fine-tuning, Pretrained model, Urban computing
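
The staged unfreezing described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the parameter names (`backbone.h.0.ln_1.weight` and so on), the epoch boundaries (5 and 10), the order in which normalization and attention parameters are released, and the choice to keep task-specific modules trainable throughout. The sketch only computes which parameters would be trainable at each stage.

```python
# Hypothetical sketch of a staged unfreezing schedule.
# Names, stages, and epoch boundaries are illustrative assumptions,
# not the settings used by PT-STD.

BACKBONE_PREFIX = "backbone."

# Name fragments unfrozen at each stage (cumulative).
STAGE_PATTERNS = [
    (),                # stage 0: pre-trained backbone fully frozen
    ("ln_",),          # stage 1: normalization layers unfrozen
    ("ln_", "attn."),  # stage 2: normalization + attention unfrozen
]


def stage_for_epoch(epoch, boundaries=(5, 10)):
    """Return the stage index: the number of boundaries already reached."""
    return sum(epoch >= b for b in boundaries)


def trainable_flags(param_names, stage):
    """Map each parameter name to True (trainable) or False (frozen)."""
    patterns = STAGE_PATTERNS[stage]
    flags = {}
    for name in param_names:
        if not name.startswith(BACKBONE_PREFIX):
            # Assumption: task-specific modules always train.
            flags[name] = True
        else:
            flags[name] = any(p in name for p in patterns)
    return flags


# Hypothetical parameter names for a small model.
names = [
    "st_decouple.spatial.weight",
    "backbone.h.0.ln_1.weight",
    "backbone.h.0.attn.q_proj.weight",
    "backbone.h.0.mlp.fc.weight",
    "head.weight",
]

for epoch in (0, 5, 10):
    stage = stage_for_epoch(epoch)
    unfrozen = [n for n, t in trainable_flags(names, stage).items() if t]
    print(f"epoch {epoch}: stage {stage}, trainable = {unfrozen}")
```

In a framework such as PyTorch, flags like these would typically be applied by setting `requires_grad` on each named parameter. In this sketch the feed-forward weights of the backbone stay frozen at every stage, which is one possible reading of the strategy rather than a confirmed detail.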

CLC Number:

  • TP391