Utilizing Strategic Pre-training to Reduce Overfitting: Baguan -- A Pre-trained Weather Forecasting Model
- URL: http://arxiv.org/abs/2505.13873v1
- Date: Tue, 20 May 2025 03:29:23 GMT
- Authors: Peisong Niu, Ziqing Ma, Tian Zhou, Weiqi Chen, Lefei Shen, Rong Jin, Liang Sun
- Abstract summary: We introduce Baguan, a novel data-driven model for medium-range weather forecasting built on a Siamese Autoencoder pre-trained in a self-supervised manner. Experimental results show that Baguan outperforms traditional methods, delivering more accurate forecasts.
- Score: 20.98899316909536
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Weather forecasting has long posed a significant challenge for humanity. While recent AI-based models have surpassed traditional numerical weather prediction (NWP) methods in global forecasting tasks, overfitting remains a critical issue due to the limited availability of real-world weather data spanning only a few decades. Unlike fields like computer vision or natural language processing, where data abundance can mitigate overfitting, weather forecasting demands innovative strategies to address this challenge with existing data. In this paper, we explore pre-training methods for weather forecasting, finding that selecting an appropriately challenging pre-training task introduces locality bias, effectively mitigating overfitting and enhancing performance. We introduce Baguan, a novel data-driven model for medium-range weather forecasting, built on a Siamese Autoencoder pre-trained in a self-supervised manner and fine-tuned for different lead times. Experimental results show that Baguan outperforms traditional methods, delivering more accurate forecasts. Additionally, the pre-trained Baguan demonstrates robust overfitting control and excels in downstream tasks, such as subseasonal-to-seasonal (S2S) modeling and regional forecasting, after fine-tuning.
Related papers
- Generative assimilation and prediction for weather and climate [9.319028023682494]
We introduce Generative Assimilation and Prediction (GAP), a unified framework for assimilation and prediction of both weather and climate.
It excels in a broad range of weather-climate related tasks, including data assimilation, seamless prediction, and climate simulation.
arXiv Detail & Related papers (2025-03-04T22:36:29Z) - Data driven weather forecasts trained and initialised directly from observations [1.44556167750856]
Skilful Machine Learned weather forecasts have challenged our approach to numerical weather prediction.
Data-driven systems have been trained to forecast future weather by learning from long historical records of past weather.
We propose a new approach, training a neural network to predict future weather purely from historical observations.
arXiv Detail & Related papers (2024-07-22T12:23:26Z) - VarteX: Enhancing Weather Forecast through Distributed Variable Representation [5.2980803808373516]
Recent data-driven models that utilize deep learning have outperformed numerical weather prediction in forecasting performance.
This study proposes a new variable aggregation scheme and an efficient learning framework for that challenge.
arXiv Detail & Related papers (2024-06-28T02:42:30Z) - Uncertainty quantification for data-driven weather models [0.0]
We study and compare uncertainty quantification methods to generate probabilistic weather forecasts from a state-of-the-art deterministic data-driven weather model, Pangu-Weather.
Specifically, we compare approaches for quantifying forecast uncertainty based on generating ensemble forecasts via perturbations to the initial conditions.
In a case study on medium-range forecasts of selected weather variables over Europe, the probabilistic forecasts obtained by using the Pangu-Weather model in concert with uncertainty quantification methods show promising results.
arXiv Detail & Related papers (2024-03-20T10:07:51Z) - Weather Prediction with Diffusion Guided by Realistic Forecast Processes [49.07556359513563]
We introduce a novel method that applies diffusion models (DM) for weather forecasting.
Our method can achieve both direct and iterative forecasting with the same modeling framework.
The flexibility and controllability of our model empower a more trustworthy DL system for the general weather community.
arXiv Detail & Related papers (2024-02-06T21:28:42Z) - ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast [57.6987191099507]
We introduce Exloss, a novel loss function that performs asymmetric optimization and highlights extreme values to obtain accurate extreme weather forecasts.
We also introduce ExBooster, which captures the uncertainty in prediction outcomes by employing multiple random samples.
Our solution can achieve state-of-the-art performance in extreme weather prediction, while maintaining the overall forecast accuracy comparable to the top medium-range forecast models.
arXiv Detail & Related papers (2024-02-02T10:34:13Z) - FengWu-GHR: Learning the Kilometer-scale Medium-range Global Weather Forecasting [56.73502043159699]
This work presents FengWu-GHR, the first data-driven global weather forecasting model running at the 0.09$^{\circ}$ horizontal resolution.
It introduces a novel approach that opens the door for operating ML-based high-resolution forecasts by inheriting prior knowledge from a low-resolution model.
The hindcast of weather prediction in 2022 indicates that FengWu-GHR is superior to the IFS-HRES.
arXiv Detail & Related papers (2024-01-28T13:23:25Z) - Scaling transformer neural networks for skillful and reliable medium-range weather forecasting [23.249955524044392]
We introduce Stormer, a transformer model that achieves state-of-the-art performance on weather forecasting with minimal changes to the standard transformer backbone.
At the core of Stormer is a randomized forecasting objective that trains the model to forecast the weather dynamics over varying time intervals.
On WeatherBench 2, Stormer performs competitively at short to medium-range forecasts and outperforms current methods beyond 7 days.
arXiv Detail & Related papers (2023-12-06T19:46:06Z) - Learning Robust Precipitation Forecaster by Temporal Frame Interpolation [65.5045412005064]
We develop a robust precipitation forecasting model that demonstrates resilience against spatial-temporal discrepancies.
Our approach has led to significant improvements in forecasting precision, culminating in our model securing 1st place in the transfer learning leaderboard of the Weather4cast'23 competition.
arXiv Detail & Related papers (2023-11-30T08:22:08Z) - Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance.
We tackle drought data with a systematic end-to-end approach.
Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z) - A case study of spatiotemporal forecasting techniques for weather forecasting [4.347494885647007]
The correlations of real-world processes are spatiotemporal, and the data generated by them exhibits both spatial and temporal evolution.
Time series-based models are a viable alternative to numerical forecasts.
We show that decomposition-based spatiotemporal prediction models reduced computational costs while improving accuracy.
arXiv Detail & Related papers (2022-09-29T13:47:02Z) - Forecasting large-scale circulation regimes using deformable convolutional neural networks and global spatiotemporal climate data [86.1450118623908]
We investigate a supervised machine learning approach based on deformable convolutional neural networks (deCNNs).
We forecast the North Atlantic-European weather regimes during extended boreal winter for 1 to 15 days into the future.
Due to its wider field of view, we also observe deCNN achieving considerably better performance than regular convolutional neural networks at lead times beyond 5-6 days.
arXiv Detail & Related papers (2022-02-10T11:37:00Z)