A comparative study of deep learning and ensemble learning to extend the horizon of traffic forecasting
- URL: http://arxiv.org/abs/2504.21358v1
- Date: Wed, 30 Apr 2025 06:31:21 GMT
- Title: A comparative study of deep learning and ensemble learning to extend the horizon of traffic forecasting
- Authors: Xiao Zheng, Saeed Asadi Bagloee, Majid Sarvi
- Abstract summary: This paper presents a comparative study on large-scale real-world signalized arterials and freeway traffic flow datasets. We develop one ensemble ML method, eXtreme Gradient Boosting (XGBoost), and a range of Deep Learning (DL) methods. Time embedding is particularly effective in this context, helping naive RNN outperform Informer by 31.1% for 30-day-ahead forecasting.
- Score: 8.600212364887964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traffic forecasting is vital for Intelligent Transportation Systems, for which Machine Learning (ML) methods have been extensively explored to develop data-driven Artificial Intelligence (AI) solutions. Recent research focuses on modelling spatial-temporal correlations for short-term traffic prediction, leaving long-term forecasting a challenging and open issue. This paper presents a comparative study on large-scale real-world signalized arterials and freeway traffic flow datasets, aiming to evaluate promising ML methods in the context of large forecasting horizons up to 30 days. Focusing on modelling capacity for temporal dynamics, we develop one ensemble ML method, eXtreme Gradient Boosting (XGBoost), and a range of Deep Learning (DL) methods, including Recurrent Neural Network (RNN)-based methods and the state-of-the-art Transformer-based method. Time embedding is leveraged to enhance their understanding of seasonality and event factors. Experimental results highlight that while the attention mechanism/Transformer framework is effective for capturing long-range dependencies in sequential data, as the forecasting horizon extends, the key to effective traffic forecasting gradually shifts from temporal dependency capturing to periodicity modelling. Time embedding is particularly effective in this context, helping naive RNN outperform Informer by 31.1% for 30-day-ahead forecasting. Meanwhile, as an efficient and robust model, XGBoost, while learning solely from time features, performs competitively with DL methods. Moreover, we investigate the impacts of various factors like input sequence length, holiday traffic, data granularity, and training data size. The findings offer valuable insights and serve as a reference for future long-term traffic forecasting research and the improvement of AI's corresponding learning capabilities.
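The time-embedding idea above — giving each model explicit seasonality signals so that even XGBoost can learn solely from time features — can be illustrated with a minimal sketch. The cyclical sin/cos encoding below is one common choice, not necessarily the exact feature set used in the paper:

```python
import numpy as np
import pandas as pd

def time_embedding(timestamps: pd.DatetimeIndex) -> pd.DataFrame:
    """Encode seasonality as cyclical (sin/cos) features plus a calendar flag."""
    hours = timestamps.hour + timestamps.minute / 60.0
    dow = timestamps.dayofweek          # 0 = Monday
    doy = timestamps.dayofyear
    return pd.DataFrame({
        # sin/cos pairs keep cyclic continuity (23:00 is close to 00:00)
        "hour_sin": np.sin(2 * np.pi * hours / 24.0),
        "hour_cos": np.cos(2 * np.pi * hours / 24.0),
        "dow_sin":  np.sin(2 * np.pi * dow / 7.0),
        "dow_cos":  np.cos(2 * np.pi * dow / 7.0),
        "doy_sin":  np.sin(2 * np.pi * doy / 365.25),
        "doy_cos":  np.cos(2 * np.pi * doy / 365.25),
        "is_weekend": (dow >= 5).astype(int),
    }, index=timestamps)

# Four 6-hour steps starting on a Monday
idx = pd.date_range("2025-01-06", periods=4, freq="6h")
features = time_embedding(idx)
print(features.shape)  # (4, 7)
```

A table like this can be fed directly to a gradient-boosted model alongside (or, as in the XGBoost-only setting above, instead of) lagged traffic values.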
Related papers
- Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
We investigate how model size, training data scale, and inference-time compute jointly influence generative retrieval performance.
Our experiments show that n-gram-based methods demonstrate strong alignment with both training and inference scaling laws.
We find that LLaMA models consistently outperform T5 models, suggesting a particular advantage for larger decoder-only models in generative retrieval.
arXiv Detail & Related papers (2025-03-24T17:59:03Z) - FRTP: Federating Route Search Records to Enhance Long-term Traffic Prediction [1.5728609542259502]
We propose a federated architecture capable of learning from raw data with varying features and time granularities or lengths.
Our experiments focus on federating route search records and begin by processing raw data within the model framework.
The accuracy of the proposed model is demonstrated through evaluations using diverse learning patterns and parameter settings.
arXiv Detail & Related papers (2024-12-23T08:14:20Z) - Deep End-to-End Survival Analysis with Temporal Consistency [49.77103348208835]
We present a novel Survival Analysis algorithm designed to efficiently handle large-scale longitudinal data.
A central idea in our method is temporal consistency, a hypothesis that past and future outcomes in the data evolve smoothly over time.
Our framework uniquely incorporates temporal consistency into large datasets by providing a stable training signal.
arXiv Detail & Related papers (2024-10-09T11:37:09Z) - Physics-guided Active Sample Reweighting for Urban Flow Prediction [75.24539704456791]
Urban flow prediction is a spatio-temporal modeling task that estimates the throughput of transportation services like buses, taxis, and ride-hailing services.
Some recent prediction solutions bring remedies with the notion of physics-guided machine learning (PGML).
We develop a physics-guided network (PN) and propose a data-aware framework, Physics-guided Active Sample Reweighting (P-GASR).
arXiv Detail & Related papers (2024-07-18T15:44:23Z) - TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models [27.306180426294784]
We introduce TPLLM, a novel traffic prediction framework leveraging Large Language Models (LLMs)
In this framework, we construct a sequence embedding layer based on Convolutional Neural Networks (CNNs) and a graph embedding layer based on Graph Convolutional Networks (GCNs) to extract sequence features and spatial features.
Experiments on two real-world datasets demonstrate commendable performance in both full-sample and few-shot prediction scenarios.
arXiv Detail & Related papers (2024-03-04T17:08:57Z) - Predicting the Skies: A Novel Model for Flight-Level Passenger Traffic Forecasting [0.0]
This study introduces a novel, multimodal deep learning approach to the challenge of predicting flight-level passenger traffic.
Our model ingests historical traffic data, fare closure information, and seasonality attributes specific to each flight.
Our model demonstrates an approximate 33% improvement in Mean Squared Error compared to traditional benchmarks.
arXiv Detail & Related papers (2024-01-07T06:51:26Z) - To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
Second, we examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z) - PDFormer: Propagation Delay-Aware Dynamic Long-Range Transformer for Traffic Flow Prediction [78.05103666987655]
Spatial-temporal Graph Neural Network (GNN) models have emerged as one of the most promising methods to solve this problem.
We propose a novel propagation delay-aware dynamic long-range transFormer, namely PDFormer, for accurate traffic flow prediction.
Our method can not only achieve state-of-the-art performance but also exhibit competitive computational efficiency.
arXiv Detail & Related papers (2023-01-19T08:42:40Z) - Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain the predictive accuracy over time.
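The concept-drift point above can be made concrete with a toy error-based detector: compare a short window of recent prediction errors against a longer baseline window, and flag drift when the recent mean error jumps. This is an illustrative simplification, not one of the surveyed AutoML tools; the window sizes and threshold factor are arbitrary choices:

```python
from collections import deque

class DriftMonitor:
    """Flag drift when recent mean error exceeds a multiple of baseline mean error."""
    def __init__(self, baseline_size=200, recent_size=30, factor=2.0):
        self.baseline = deque(maxlen=baseline_size)  # long-run error history
        self.recent = deque(maxlen=recent_size)      # short-run error window
        self.factor = factor

    def update(self, error: float) -> bool:
        self.recent.append(error)
        drift = False
        if (len(self.baseline) == self.baseline.maxlen
                and len(self.recent) == self.recent.maxlen):
            base = sum(self.baseline) / len(self.baseline)
            rec = sum(self.recent) / len(self.recent)
            drift = rec > self.factor * max(base, 1e-12)
        # Append after the check so the jump pollutes the baseline slowly
        self.baseline.append(error)
        return drift

monitor = DriftMonitor(baseline_size=50, recent_size=10, factor=2.0)
flags = [monitor.update(0.1) for _ in range(60)]   # stable error stream
flags += [monitor.update(1.0) for _ in range(10)]  # errors jump: drift
print(any(flags[-5:]))  # True
```

On drift, a stream pipeline would typically retrain or adapt the model, which is the adaptation step the survey refers to.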
arXiv Detail & Related papers (2021-06-14T11:42:46Z) - Traffic congestion anomaly detection and prediction using deep learning [6.370406399003785]
Congestion prediction is a major priority for traffic management centres around the world to ensure timely incident response handling.
The increasing amounts of generated traffic data have been used to train machine learning predictors for traffic, but this is a challenging task due to inter-dependencies of traffic flow both in time and space.
We show that our deep learning models consistently outperform traditional methods, and we conduct a comparative analysis of the optimal time horizon of historical data required to predict traffic flow at different time points in the future.
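The question of how much historical data a predictor needs, raised above, can be explored with a small synthetic experiment. Ordinary least squares on lagged values stands in for the paper's deep models, and the series and lookback lengths are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(2000)
# Synthetic flow: daily cycle at 15-minute resolution (period 96) plus noise
series = np.sin(2 * np.pi * t / 96) + 0.1 * rng.standard_normal(t.size)

def lagged_matrix(y, lookback):
    """Each row holds `lookback` past values; the target is the next value."""
    X = np.stack([y[i:i + lookback] for i in range(y.size - lookback)])
    return X, y[lookback:]

def mse_for_lookback(y, lookback, train_frac=0.8):
    X, target = lagged_matrix(y, lookback)
    split = int(train_frac * X.shape[0])
    # Least-squares autoregression as a simple stand-in predictor
    coef, *_ = np.linalg.lstsq(X[:split], target[:split], rcond=None)
    pred = X[split:] @ coef
    return float(np.mean((pred - target[split:]) ** 2))

for lb in (4, 24, 96):
    print(lb, round(mse_for_lookback(series, lb), 4))
```

Sweeping the lookback this way mirrors the comparative analysis in the paper, where the optimal window depends on the forecast horizon.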
arXiv Detail & Related papers (2020-06-23T08:49:46Z) - Deep Echo State Networks for Short-Term Traffic Forecasting: Performance Comparison and Statistical Assessment [8.586891288891263]
In short-term traffic forecasting, the goal is to accurately predict future values of a traffic parameter of interest.
Deep Echo State Networks achieve more accurate traffic forecasts than the rest of the considered modeling counterparts.
arXiv Detail & Related papers (2020-04-17T11:07:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.