A comparative study of deep learning and ensemble learning to extend the horizon of traffic forecasting
- URL: http://arxiv.org/abs/2504.21358v1
- Date: Wed, 30 Apr 2025 06:31:21 GMT
- Title: A comparative study of deep learning and ensemble learning to extend the horizon of traffic forecasting
- Authors: Xiao Zheng, Saeed Asadi Bagloee, Majid Sarvi
- Abstract summary: This paper presents a comparative study on large-scale real-world signalized arterials and freeway traffic flow datasets. We develop one ensemble ML method, eXtreme Gradient Boosting (XGBoost), and a range of Deep Learning (DL) methods. Time embedding is particularly effective in this context, helping naive RNN outperform Informer by 31.1% for 30-day-ahead forecasting.
- Score: 8.600212364887964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traffic forecasting is vital for Intelligent Transportation Systems, for which Machine Learning (ML) methods have been extensively explored to develop data-driven Artificial Intelligence (AI) solutions. Recent research focuses on modelling spatial-temporal correlations for short-term traffic prediction, leaving long-term forecasting a challenging and open issue. This paper presents a comparative study on large-scale real-world signalized arterials and freeway traffic flow datasets, aiming to evaluate promising ML methods in the context of large forecasting horizons up to 30 days. Focusing on modelling capacity for temporal dynamics, we develop one ensemble ML method, eXtreme Gradient Boosting (XGBoost), and a range of Deep Learning (DL) methods, including Recurrent Neural Network (RNN)-based methods and the state-of-the-art Transformer-based method. Time embedding is leveraged to enhance their understanding of seasonality and event factors. Experimental results highlight that while the attention mechanism/Transformer framework is effective for capturing long-range dependencies in sequential data, as the forecasting horizon extends, the key to effective traffic forecasting gradually shifts from temporal dependency capturing to periodicity modelling. Time embedding is particularly effective in this context, helping naive RNN outperform Informer by 31.1% for 30-day-ahead forecasting. Meanwhile, as an efficient and robust model, XGBoost, while learning solely from time features, performs competitively with DL methods. Moreover, we investigate the impacts of various factors like input sequence length, holiday traffic, data granularity, and training data size. The findings offer valuable insights and serve as a reference for future long-term traffic forecasting research and the improvement of AI's corresponding learning capabilities.
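The time-embedding idea above — giving each model explicit seasonality signals so that even XGBoost can learn solely from time features — can be illustrated with a minimal sketch. The cyclical sin/cos encoding below is one common choice, not necessarily the exact feature set used in the paper:

```python
import numpy as np
import pandas as pd

def time_embedding(timestamps: pd.DatetimeIndex) -> pd.DataFrame:
    """Encode seasonality as cyclical (sin/cos) features plus a calendar flag."""
    hours = timestamps.hour + timestamps.minute / 60.0
    dow = timestamps.dayofweek          # 0 = Monday
    doy = timestamps.dayofyear
    return pd.DataFrame({
        # sin/cos pairs keep cyclic continuity (23:00 is close to 00:00)
        "hour_sin": np.sin(2 * np.pi * hours / 24.0),
        "hour_cos": np.cos(2 * np.pi * hours / 24.0),
        "dow_sin":  np.sin(2 * np.pi * dow / 7.0),
        "dow_cos":  np.cos(2 * np.pi * dow / 7.0),
        "doy_sin":  np.sin(2 * np.pi * doy / 365.25),
        "doy_cos":  np.cos(2 * np.pi * doy / 365.25),
        "is_weekend": (dow >= 5).astype(int),
    }, index=timestamps)

# Four 6-hour steps starting on a Monday
idx = pd.date_range("2025-01-06", periods=4, freq="6h")
features = time_embedding(idx)
print(features.shape)  # (4, 7)
```

A table like this can be fed directly to a gradient-boosted model alongside (or, as in the XGBoost-only setting above, instead of) lagged traffic values.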
Related papers
- Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
We investigate how model size, training data scale, and inference-time compute jointly influence generative retrieval performance.
Our experiments show that n-gram-based methods demonstrate strong alignment with both training and inference scaling laws.
We find that LLaMA models consistently outperform T5 models, suggesting a particular advantage for larger decoder-only models in generative retrieval.
arXiv Detail & Related papers (2025-03-24T17:59:03Z) - FRTP: Federating Route Search Records to Enhance Long-term Traffic Prediction [1.5728609542259502]
We propose a federated architecture capable of learning from raw data with varying features and time granularities or lengths.
Our experiments focus on federating route search records and begin by processing raw data within the model framework.
The accuracy of the proposed model is demonstrated through evaluations using diverse learning patterns and parameter settings.
arXiv Detail & Related papers (2024-12-23T08:14:20Z) - Deep End-to-End Survival Analysis with Temporal Consistency [49.77103348208835]
We present a novel Survival Analysis algorithm designed to efficiently handle large-scale longitudinal data.
A central idea in our method is temporal consistency, a hypothesis that past and future outcomes in the data evolve smoothly over time.
Our framework uniquely incorporates temporal consistency into large datasets by providing a stable training signal.
arXiv Detail & Related papers (2024-10-09T11:37:09Z) - Physics-guided Active Sample Reweighting for Urban Flow Prediction [75.24539704456791]
Urban flow prediction is a spatio-temporal modeling task that estimates the throughput of transportation services like buses, taxis, and ride-hailing services.
Some recent prediction solutions bring remedies with the notion of physics-guided machine learning (PGML).
We develop a physics-guided network (PN) and propose a data-aware framework, Physics-guided Active Sample Reweighting (P-GASR).
arXiv Detail & Related papers (2024-07-18T15:44:23Z) - TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models [27.306180426294784]
We introduce TPLLM, a novel traffic prediction framework leveraging Large Language Models (LLMs)
In this framework, we construct a sequence embedding layer based on Convolutional Neural Networks (CNNs) and a graph embedding layer based on Graph Convolutional Networks (GCNs) to extract sequence features and spatial features.
Experiments on two real-world datasets demonstrate commendable performance in both full-sample and few-shot prediction scenarios.
arXiv Detail & Related papers (2024-03-04T17:08:57Z) - Predicting the Skies: A Novel Model for Flight-Level Passenger Traffic Forecasting [0.0]
This study introduces a novel, multimodal deep learning approach to the challenge of predicting flight-level passenger traffic.
Our model ingests historical traffic data, fare closure information, and seasonality attributes specific to each flight.
Our model demonstrates an approximate 33% improvement in Mean Squared Error compared to traditional benchmarks.
arXiv Detail & Related papers (2024-01-07T06:51:26Z) - To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
Second, we examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z) - PDFormer: Propagation Delay-Aware Dynamic Long-Range Transformer for Traffic Flow Prediction [78.05103666987655]
Spatial-temporal Graph Neural Network (GNN) models have emerged as one of the most promising methods to solve this problem.
We propose a novel propagation delay-aware dynamic long-range transFormer, namely PDFormer, for accurate traffic flow prediction.
Our method can not only achieve state-of-the-art performance but also exhibit competitive computational efficiency.
arXiv Detail & Related papers (2023-01-19T08:42:40Z) - Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain the predictive accuracy over time.
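The concept-drift point above can be made concrete with a toy error-based detector: compare a short window of recent prediction errors against a longer baseline window, and flag drift when the recent mean error jumps. This is an illustrative simplification, not one of the surveyed AutoML tools; the window sizes and threshold factor are arbitrary choices:

```python
from collections import deque

class DriftMonitor:
    """Flag drift when recent mean error exceeds a multiple of baseline mean error."""
    def __init__(self, baseline_size=200, recent_size=30, factor=2.0):
        self.baseline = deque(maxlen=baseline_size)  # long-run error history
        self.recent = deque(maxlen=recent_size)      # short-run error window
        self.factor = factor

    def update(self, error: float) -> bool:
        self.recent.append(error)
        drift = False
        if (len(self.baseline) == self.baseline.maxlen
                and len(self.recent) == self.recent.maxlen):
            base = sum(self.baseline) / len(self.baseline)
            rec = sum(self.recent) / len(self.recent)
            drift = rec > self.factor * max(base, 1e-12)
        # Append after the check so the jump pollutes the baseline slowly
        self.baseline.append(error)
        return drift

monitor = DriftMonitor(baseline_size=50, recent_size=10, factor=2.0)
flags = [monitor.update(0.1) for _ in range(60)]   # stable error stream
flags += [monitor.update(1.0) for _ in range(10)]  # errors jump: drift
print(any(flags[-5:]))  # True
```

On drift, a stream pipeline would typically retrain or adapt the model, which is the adaptation step the survey refers to.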
arXiv Detail & Related papers (2021-06-14T11:42:46Z) - Traffic congestion anomaly detection and prediction using deep learning [6.370406399003785]
Congestion prediction is a major priority for traffic management centres around the world to ensure timely incident response handling.
The increasing amounts of generated traffic data have been used to train machine learning predictors for traffic, but this is a challenging task due to inter-dependencies of traffic flow both in time and space.
We show that our deep learning models consistently outperform traditional methods, and we conduct a comparative analysis of the optimal time horizon of historical data required to predict traffic flow at different time points in the future.
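The question of how much historical data a predictor needs, raised above, can be explored with a small synthetic experiment. Ordinary least squares on lagged values stands in for the paper's deep models, and the series and lookback lengths are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(2000)
# Synthetic flow: daily cycle at 15-minute resolution (period 96) plus noise
series = np.sin(2 * np.pi * t / 96) + 0.1 * rng.standard_normal(t.size)

def lagged_matrix(y, lookback):
    """Each row holds `lookback` past values; the target is the next value."""
    X = np.stack([y[i:i + lookback] for i in range(y.size - lookback)])
    return X, y[lookback:]

def mse_for_lookback(y, lookback, train_frac=0.8):
    X, target = lagged_matrix(y, lookback)
    split = int(train_frac * X.shape[0])
    # Least-squares autoregression as a simple stand-in predictor
    coef, *_ = np.linalg.lstsq(X[:split], target[:split], rcond=None)
    pred = X[split:] @ coef
    return float(np.mean((pred - target[split:]) ** 2))

for lb in (4, 24, 96):
    print(lb, round(mse_for_lookback(series, lb), 4))
```

Sweeping the lookback this way mirrors the comparative analysis in the paper, where the optimal window depends on the forecast horizon.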
arXiv Detail & Related papers (2020-06-23T08:49:46Z) - Deep Echo State Networks for Short-Term Traffic Forecasting: Performance Comparison and Statistical Assessment [8.586891288891263]
In short-term traffic forecasting, the goal is to accurately predict future values of a traffic parameter of interest.
Deep Echo State Networks achieve more accurate traffic forecasts than the rest of the considered modeling counterparts.
arXiv Detail & Related papers (2020-04-17T11:07:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.