Forecasting MBTA Transit Dynamics: A Performance Benchmarking of Statistical and Machine Learning Models
- URL: http://arxiv.org/abs/2512.02336v1
- Date: Tue, 02 Dec 2025 02:15:04 GMT
- Title: Forecasting MBTA Transit Dynamics: A Performance Benchmarking of Statistical and Machine Learning Models
- Authors: Sai Siddharth Nalamalpu, Kaining Yuan, Aiden Zhou, Eugene Pinsky
- Abstract summary: The Massachusetts Bay Transportation Authority (MBTA) is the main public transit provider in Boston. This paper compares the performance of existing and unique methods to determine the best approach in predicting gated station entries in the subway system. It is found that providing either day of week or season data has a more substantial benefit to predictive accuracy compared to weather data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Massachusetts Bay Transportation Authority (MBTA) is the main public transit provider in Boston, operating multiple means of transport, including trains, subways, and buses. However, the system often faces delays and fluctuations in ridership volume, which negatively affect efficiency and passenger satisfaction. To further understand this phenomenon, this paper compares the performance of existing and unique methods to determine the best approach in predicting gated station entries in the subway system (a proxy for subway usage) and the number of delays in the overall MBTA system. To do so, this research considers factors that tend to affect public transportation, such as day of week, season, pressure, wind speed, average temperature, and precipitation. This paper evaluates the performance of 10 statistical and machine learning models on predicting next-day subway usage. For predicting the daily delay count, the number of models is extended to 11 by introducing a self-exciting point process model, representing a unique application of a point-process framework for MBTA delay modeling. This research involves experimenting with the selective inclusion of features to determine feature importance, testing model accuracy via Root Mean Squared Error (RMSE). Remarkably, it is found that providing either day of week or season data has a more substantial benefit to predictive accuracy compared to weather data; in fact, providing weather data generally worsens performance, suggesting a tendency of models to overfit.
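The selective-inclusion experiment described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the group-mean predictor stands in for the paper's 10 or 11 models, the names `fit_group_means`, `predict`, and `ablation` are hypothetical, and the numbers are synthetic toy values rather than MBTA data. The sketch fits one model per feature subset and compares the subsets by RMSE on held-out rows.

```python
import math
from collections import defaultdict
from itertools import combinations


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def fit_group_means(rows, features, target):
    """Stand-in model (an assumption): mean target per combination of feature values."""
    sums = defaultdict(lambda: [0.0, 0])
    for row in rows:
        key = tuple(row[f] for f in features)
        sums[key][0] += row[target]
        sums[key][1] += 1
    means = {key: total / count for key, (total, count) in sums.items()}
    overall = sum(row[target] for row in rows) / len(rows)
    return means, overall


def predict(model, rows, features):
    """Predict with the group mean, falling back to the overall mean for unseen groups."""
    means, overall = model
    return [means.get(tuple(row[f] for f in features), overall) for row in rows]


def ablation(train, test, candidate_features, target):
    """Return {feature subset: test RMSE} for every subset of the candidate features."""
    actual = [row[target] for row in test]
    results = {}
    for size in range(len(candidate_features) + 1):
        for subset in combinations(candidate_features, size):
            model = fit_group_means(train, subset, target)
            results[subset] = rmse(actual, predict(model, test, subset))
    return results


# Synthetic toy rows (NOT MBTA data): entries depend on day type, not on rain.
train = [
    {"day_type": "weekday", "rainy": False, "entries": 100},
    {"day_type": "weekday", "rainy": True, "entries": 102},
    {"day_type": "weekend", "rainy": False, "entries": 50},
    {"day_type": "weekend", "rainy": True, "entries": 48},
]
test = [
    {"day_type": "weekday", "rainy": True, "entries": 101},
    {"day_type": "weekend", "rainy": False, "entries": 49},
]

results = ablation(train, test, ["day_type", "rainy"], "entries")
for subset, score in sorted(results.items(), key=lambda item: item[1]):
    print(subset, round(score, 3))
```

In this toy setup the calendar feature alone gives the lowest test RMSE, and adding the uninformative weather flag makes it slightly worse because the model fits noise in the training rows. That is the kind of pattern the abstract reports, though the toy result says nothing about the paper's actual numbers.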
Related papers
- Valeo4Cast: A Modular Approach to End-to-End Forecasting [93.86257326005726]
Our solution ranks first in the Argoverse 2 End-to-end Forecasting Challenge, with 63.82 mAPf.
We depart from the current trend of tackling this task via end-to-end training from perception to forecasting, and instead use a modular approach.
We surpass forecasting results by +17.1 points over last year's winner and by +13.3 points over this year's runner-up.
arXiv Detail & Related papers (2024-06-12T11:50:51Z) - ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast [57.6987191099507]
We introduce Exloss, a novel loss function that performs asymmetric optimization and highlights extreme values to obtain accurate extreme weather forecast.
We also introduce ExBooster, which captures the uncertainty in prediction outcomes by employing multiple random samples.
Our solution can achieve state-of-the-art performance in extreme weather prediction, while maintaining the overall forecast accuracy comparable to the top medium-range forecast models.
arXiv Detail & Related papers (2024-02-02T10:34:13Z) - How to Estimate Model Transferability of Pre-Trained Speech Models? [84.11085139766108]
"Score-based assessment" framework for estimating transferability of pre-trained speech models.
We leverage two representation theories, Bayesian likelihood estimation and optimal transport, to generate rank scores for the PSM candidates.
Our framework efficiently computes transferability scores without actual fine-tuning of candidate models or layers.
arXiv Detail & Related papers (2023-06-01T04:52:26Z) - Exploring the impact of weather on Metro demand forecasting using machine learning method [1.602570550027996]
This study uses real passenger flow data of an Asian subway system from April to June of 2018.
It analyzes the space-time distribution of the passenger flow using short-term traffic flow prediction.
arXiv Detail & Related papers (2022-10-24T13:01:47Z) - On Designing Day Ahead and Same Day Ridership Level Prediction Models for City-Scale Transit Networks Using Noisy APC Data [0.0]
We propose the use and fusion of data from multiple sources, cleaned, processed, and merged together, for use in training machine learning models to predict transit ridership.
We evaluate our approach on real-world transit data provided by the public transit agency of Nashville, TN.
arXiv Detail & Related papers (2022-10-10T19:50:59Z) - Lidar Light Scattering Augmentation (LISA): Physics-based Simulation of Adverse Weather Conditions for 3D Object Detection [60.89616629421904]
Lidar-based object detectors are critical parts of the 3D perception pipeline in autonomous navigation systems such as self-driving cars.
They are sensitive to adverse weather conditions such as rain, snow and fog due to reduced signal-to-noise ratio (SNR) and signal-to-background ratio (SBR).
arXiv Detail & Related papers (2021-07-14T21:10:47Z) - Public Transit for Special Events: Ridership Prediction and Train Optimization [10.531110013870792]
It is important for transit providers to understand their impact on disruptions, delays, and fare revenues.
This paper proposes a suite of data-driven techniques for evaluating, anticipating, and managing the performance of transit systems during recurring congestion peaks due to special events.
arXiv Detail & Related papers (2021-06-09T19:52:18Z) - A model for traffic incident prediction using emergency braking data [77.34726150561087]
We address the fundamental problem of data scarcity in road traffic accident prediction by training our model on emergency braking events instead of accidents.
We present a prototype implementing a traffic incident prediction model for Germany based on emergency braking data from Mercedes-Benz vehicles.
arXiv Detail & Related papers (2021-02-12T18:17:12Z) - Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z) - Crowding Prediction of In-Situ Metro Passengers Using Smart Card Data [11.781685156308475]
We propose a statistical model to predict in-situ passenger density inside a closed metro system.
Based on the prediction results, we are able to provide accurate prediction of in-situ passenger density for a future time point.
arXiv Detail & Related papers (2020-09-07T04:07:37Z) - BusTime: Which is the Right Prediction Model for My Bus Arrival Time? [3.1761486589684975]
This paper tries to fill this gap by proposing a general and practical evaluation framework for analysing various widely used prediction models.
In particular, this framework contains a raw bus GPS data pre-processing method that needs far fewer input data points.
We also present preliminary results for city managers by analysing the practical strengths and weaknesses in both training and predicting stages of commonly used prediction models.
arXiv Detail & Related papers (2020-03-20T17:03:36Z)