When Simpler Wins: Facebook's Prophet vs LSTM for Air Pollution Forecasting in Data-Constrained Northern Nigeria
- URL: http://arxiv.org/abs/2508.16244v1
- Date: Fri, 22 Aug 2025 09:23:59 GMT
- Title: When Simpler Wins: Facebook's Prophet vs LSTM for Air Pollution Forecasting in Data-Constrained Northern Nigeria
- Authors: Habeeb Balogun, Yahaya Zakari
- Abstract summary: This study evaluates Long Short-Term Memory (LSTM) networks and the Facebook Prophet model for forecasting multiple pollutants. Results show that Prophet often matches or exceeds LSTM's accuracy, particularly in series dominated by seasonal and long-term trends.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Air pollution forecasting is critical for proactive environmental management, yet data irregularities and scarcity remain major challenges in low-resource regions. Northern Nigeria faces high levels of air pollutants, but few studies have systematically compared the performance of advanced machine learning models under such constraints. This study evaluates Long Short-Term Memory (LSTM) networks and the Facebook Prophet model for forecasting multiple pollutants (CO, SO2, SO4) using monthly observational data from 2018 to 2023 across 19 states. Results show that Prophet often matches or exceeds LSTM's accuracy, particularly in series dominated by seasonal and long-term trends, while LSTM performs better in datasets with abrupt structural changes. These findings challenge the assumption that deep learning models inherently outperform simpler approaches, highlighting the importance of model-data alignment. For policymakers and practitioners in resource-constrained settings, this work supports adopting context-sensitive, computationally efficient forecasting methods over complexity for its own sake.
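The abstract's central claim is that a simple additive decomposition (trend plus seasonality, as Prophet uses) can be sufficient for seasonally dominated monthly series. The following is an illustrative sketch only, not the paper's code: it fits a Prophet-style additive model (linear trend plus one annual Fourier pair) by ordinary least squares on synthetic monthly data; the 72-month length mirrors the 2018-2023 window, but all values are invented.

```python
import numpy as np

# Synthetic monthly series with a trend and an annual cycle (values invented).
rng = np.random.default_rng(0)
t = np.arange(72)                          # 72 months, as in 2018-2023
y = 0.02 * t + 1.5 * np.sin(2 * np.pi * t / 12) + 3.0 + rng.normal(0, 0.1, t.size)

# Design matrix: intercept, linear trend, one Fourier pair for annual seasonality.
X = np.column_stack([
    np.ones_like(t, dtype=float),
    t.astype(float),
    np.sin(2 * np.pi * t / 12),
    np.cos(2 * np.pi * t / 12),
])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # ordinary least squares fit
fitted = X @ coef
rmse = np.sqrt(np.mean((y - fitted) ** 2))
print(f"RMSE of additive fit: {rmse:.3f}")
```

When the series really is dominated by trend and seasonality, a four-parameter model like this recovers it almost exactly, which is the intuition behind the paper's "model-data alignment" argument.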
Related papers
- Echo State Networks for Time Series Forecasting: Hyperparameter Sweep and Benchmarking [51.56484100374058]
We evaluate whether a fully automatic, purely feedback-driven ESN can serve as a competitive alternative to widely used statistical forecasting methods. Forecast accuracy is measured using MASE and sMAPE and benchmarked against simple baselines such as drift and seasonal naive, as well as statistical models.
arXiv Detail & Related papers (2026-02-03T16:01:22Z) - Predictive Modeling of Power Outages during Extreme Events: Integrating Weather and Socio-Economic Factors [0.4640835690336653]
This paper presents a novel learning-based framework for predicting power outages caused by extreme events. It targets low-probability, high-consequence outage scenarios and leverages a comprehensive set of features derived from publicly available data sources. Four machine learning models (Random Forest (RF), Support Vector Machine (SVM), Adaptive Boosting (AdaBoost), and Long Short-Term Memory (LSTM)) are evaluated.
arXiv Detail & Related papers (2025-12-27T20:30:07Z) - Beyond the Hype: Comparing Lightweight and Deep Learning Models for Air Quality Forecasting [1.2744523252873352]
This study investigates whether lightweight additive models -- Facebook Prophet (FBP) and NeuralProphet (NP) -- can deliver competitive forecasts for particulate matter in Beijing, China. Using multi-year pollutant and meteorological data, we applied systematic feature selection (correlation, mutual information, mRMR), leakage-safe scaling, and chronological data splits. Results show that FBP consistently outperformed NP, SARIMAX, and the learning-based baselines, achieving test $R^2$ above 0.94 for both pollutants.
arXiv Detail & Related papers (2025-12-09T19:39:45Z) - Synergistic Neural Forecasting of Air Pollution with Stochastic Sampling [50.3911487821783]
Air pollution remains a leading global health and environmental risk, particularly in regions vulnerable to episodic air pollution spikes due to wildfires, urban haze and dust storms. Here, we present SynCast, a high-resolution neural forecasting model that integrates meteorological and air composition data to improve predictions of both average and extreme pollution levels.
arXiv Detail & Related papers (2025-10-28T01:18:00Z) - Extreme value forecasting using relevance-based data augmentation with deep learning models [3.503370263836711]
In this study, we present a data augmentation framework for extreme value forecasting. We use deep learning models in combination with data augmentation methods such as GANs and the synthetic minority oversampling technique (SMOTE). Our results indicate that the SMOTE-based strategy consistently demonstrated superior adaptability, leading to improved performance across both short- and long-horizon forecasts.
arXiv Detail & Related papers (2025-10-02T06:10:27Z) - Utilizing Strategic Pre-training to Reduce Overfitting: Baguan -- A Pre-trained Weather Forecasting Model [20.98899316909536]
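The SMOTE idea referenced above synthesizes new minority-class samples by interpolating between an existing sample and one of its nearest neighbours. A minimal sketch of that interpolation step, with invented feature values (real applications would use the `imbalanced-learn` implementation rather than this toy):

```python
import numpy as np

# Tiny invented minority set: three samples with two features each.
rng = np.random.default_rng(1)
minority = np.array([[1.0, 2.0], [1.1, 2.1], [0.5, 1.5]])

i = 0                                            # anchor sample
dists = np.linalg.norm(minority - minority[i], axis=1)
j = np.argsort(dists)[1]                         # nearest neighbour, excluding self
lam = rng.uniform()                              # interpolation weight in [0, 1)
synthetic = minority[i] + lam * (minority[j] - minority[i])
print(synthetic)
```

The synthetic point lies on the segment between the anchor and its neighbour, densifying the minority region instead of duplicating samples.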
We introduce Baguan, a novel data-driven model for medium-range weather forecasting built on a Siamese Autoencoder pre-trained in a self-supervised manner. Experimental results show that Baguan outperforms traditional methods, delivering more accurate forecasts.
arXiv Detail & Related papers (2025-05-20T03:29:23Z) - Accurate Prediction of Temperature Indicators in Eastern China Using a Multi-Scale CNN-LSTM-Attention model [0.0]
We propose a weather prediction model based on a multi-scale CNN-LSTM-Attention architecture. The model integrates Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and attention mechanisms. Experimental results show that the model predicts temperature trends with high accuracy.
arXiv Detail & Related papers (2024-12-11T00:42:31Z) - PIAD-SRNN: Physics-Informed Adaptive Decomposition in State-Space RNN [1.3654846342364306]
Time series forecasting often demands a trade-off between accuracy and efficiency. We propose PIAD-SRNN, a physics-informed adaptive decomposition state-space RNN. We evaluate PIAD-SRNN's performance on indoor air quality datasets.
arXiv Detail & Related papers (2024-12-01T22:55:58Z) - Deep Learning for Weather Forecasting: A CNN-LSTM Hybrid Model for Predicting Historical Temperature Data [7.559331742876793]
This study introduces a hybrid model combining Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks to predict historical temperature data.
CNNs are utilized for spatial feature extraction, while LSTMs handle temporal dependencies, resulting in significantly improved prediction accuracy and stability.
arXiv Detail & Related papers (2024-10-19T03:38:53Z) - ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast [57.6987191099507]
We introduce Exloss, a novel loss function that performs asymmetric optimization and highlights extreme values to obtain accurate extreme weather forecast.
We also introduce ExBooster, which captures the uncertainty in prediction outcomes by employing multiple random samples.
Our solution can achieve state-of-the-art performance in extreme weather prediction, while maintaining the overall forecast accuracy comparable to the top medium-range forecast models.
arXiv Detail & Related papers (2024-02-02T10:34:13Z) - Learning Robust Precipitation Forecaster by Temporal Frame Interpolation [65.5045412005064]
We develop a robust precipitation forecasting model that demonstrates resilience against spatial-temporal discrepancies.
Our approach has led to significant improvements in forecasting precision, culminating in our model securing 1st place in the transfer learning leaderboard of the Weather4cast'23 competition.
arXiv Detail & Related papers (2023-11-30T08:22:08Z) - Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance.
We tackle drought prediction by introducing a systematic end-to-end deep learning approach.
Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z) - To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
First, we investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
Second, we examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z) - Stream-Flow Forecasting of Small Rivers Based on LSTM [3.921808417990452]
This paper proposes a new forecasting method based on the Long Short-Term Memory (LSTM) deep learning model.
We collected stream flow data from one hydrologic station in Tunxi, China, and precipitation data from 11 nearby rainfall stations to forecast the stream flow.
We evaluated the prediction results using three criteria: root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination ($R^2$).
arXiv Detail & Related papers (2020-01-16T07:14:32Z)
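The three criteria named in that entry (RMSE, MAE, $R^2$) recur across the forecasting papers listed here. A short sketch of the standard formulas on toy arrays; the numbers are invented purely to exercise the computation:

```python
import numpy as np

# Toy truth/prediction pairs (values invented for illustration).
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.1, 2.7, 6.8])

rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))        # root mean square error
mae = np.mean(np.abs(y_true - y_pred))                 # mean absolute error
ss_res = np.sum((y_true - y_pred) ** 2)                # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)         # total sum of squares
r2 = 1 - ss_res / ss_tot                               # coefficient of determination
print(round(rmse, 3), round(mae, 3), round(r2, 3))
```

RMSE penalizes large errors more heavily than MAE, which is why the two are usually reported together; $R^2$ normalizes the residual error against the variance of the observations.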
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.