Long-Range Distillation: Distilling 10,000 Years of Simulated Climate into Long Timestep AI Weather Models
- URL: http://arxiv.org/abs/2512.22814v1
- Date: Sun, 28 Dec 2025 07:03:20 GMT
- Title: Long-Range Distillation: Distilling 10,000 Years of Simulated Climate into Long Timestep AI Weather Models
- Authors: Scott A. Martin, Noah Brenowitz, Dale Durran, Michael Pritchard
- Abstract summary: Long-range distillation is a method that trains a long-timestep probabilistic "student" model to forecast directly at long range. We generate over 10,000 years of simulated climate to train models for forecasting across a range of timescales. In perfect-model experiments, the distilled models outperform climatology and approach the skill of their autoregressive teacher.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate long-range weather forecasting remains a major challenge for AI models, both because errors accumulate over autoregressive rollouts and because reanalysis datasets used for training offer a limited sample of the slow modes of climate variability underpinning predictability. Most AI weather models are autoregressive, producing short lead forecasts that must be repeatedly applied to reach subseasonal-to-seasonal (S2S) or seasonal lead times, often resulting in instability and calibration issues. Long-timestep probabilistic models that generate long-range forecasts in a single step offer an attractive alternative, but training on the 40-year reanalysis record leads to overfitting, suggesting orders of magnitude more training data are required. We introduce long-range distillation, a method that trains a long-timestep probabilistic "student" model to forecast directly at long-range using a huge synthetic training dataset generated by a short-timestep autoregressive "teacher" model. Using the Deep Learning Earth System Model (DLESyM) as the teacher, we generate over 10,000 years of simulated climate to train distilled student models for forecasting across a range of timescales. In perfect-model experiments, the distilled models outperform climatology and approach the skill of their autoregressive teacher while replacing hundreds of autoregressive steps with a single timestep. In the real world, they achieve S2S forecast skill comparable to the ECMWF ensemble forecast after ERA5 fine-tuning. The skill of our distilled models scales with increasing synthetic training data, even when that data is orders of magnitude larger than ERA5. This represents the first demonstration that AI-generated synthetic training data can be used to scale long-range forecast skill.
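The distillation pipeline described in the abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: a damped linear map with noise stands in for the short-timestep autoregressive teacher (the real teacher is DLESyM), and a linear least-squares map stands in for the long-timestep probabilistic student. All names, dimensions, and step counts are illustrative.

```python
"""Toy sketch of long-range distillation: a short-timestep 'teacher'
generates synthetic trajectories, and a long-timestep 'student' is fit
to map the initial state directly to the state at a long lead time."""
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 8      # toy climate state vector (illustrative)
STUDENT_LEAD = 240 # student jumps 240 teacher steps in one timestep


def teacher_step(state: np.ndarray) -> np.ndarray:
    """One short teacher timestep: damped linear dynamics plus noise
    (a stand-in for the autoregressive teacher model)."""
    A = 0.99 * np.eye(STATE_DIM)
    return A @ state + 0.01 * rng.standard_normal(STATE_DIM)


def generate_synthetic_pairs(n_trajectories: int):
    """Roll the teacher out and harvest (x_0, x_L) training pairs,
    replacing the limited reanalysis record with simulated climate."""
    pairs = []
    for _ in range(n_trajectories):
        x0 = rng.standard_normal(STATE_DIM)
        x = x0.copy()
        for _ in range(STUDENT_LEAD):
            x = teacher_step(x)
        pairs.append((x0, x))
    return pairs


# Fit the student on the synthetic pairs. Here a linear least-squares
# map plays the role of the probabilistic student: it forecasts the
# long lead directly, replacing hundreds of autoregressive steps
# with a single timestep.
pairs = generate_synthetic_pairs(200)
X0 = np.stack([p[0] for p in pairs])  # initial states, (200, 8)
XL = np.stack([p[1] for p in pairs])  # states at lead L, (200, 8)
W, *_ = np.linalg.lstsq(X0, XL, rcond=None)  # student: x_L ~= x_0 @ W
```

Because the teacher is cheap to sample, the synthetic dataset can be made arbitrarily large, which is the property the paper exploits when scaling training data well beyond the 40-year reanalysis record.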
Related papers
- AtmosMJ: Revisiting Gating Mechanism for AI Weather Forecasting Beyond the Year Scale [4.8951183832371]
We introduce a deep convolutional network that operates directly on ERA5 data without any spherical remapping. Our results demonstrate that AtmosMJ produces stable and physically plausible forecasts for about 500 days.
arXiv Detail & Related papers (2025-06-11T13:38:56Z) - A Hybrid Deep-Learning Model for El NiƱo Southern Oscillation in the Low-Data Regime [0.0]
El NiƱo Southern Oscillation (ENSO) forecasts can be made up to one year in advance. Deep-learning models are predominantly trained on climate model simulations that provide thousands of years of training data. This motivates a hybrid approach, combining the modest data needs of a linear inverse model (LIM) with a deep-learning non-Markovian correction of the LIM. For O(100 yr) datasets, the resulting hybrid model is more skillful than the LIM while also exceeding the skill of a full deep-learning model.
arXiv Detail & Related papers (2024-12-04T22:23:17Z) - On conditional diffusion models for PDE simulations [53.01911265639582]
We study score-based diffusion models for forecasting and assimilation of sparse observations.
We propose an autoregressive sampling approach that significantly improves performance in forecasting.
We also propose a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths.
arXiv Detail & Related papers (2024-10-21T18:31:04Z) - Robustness of AI-based weather forecasts in a changing climate [1.4779266690741741]
We show that current state-of-the-art machine learning models trained for weather forecasting in present-day climate produce skillful forecasts across different climate states.
Despite current limitations, our results suggest that data-driven machine learning models will provide powerful tools for climate science.
arXiv Detail & Related papers (2024-09-27T08:11:49Z) - Ensemble data assimilation to diagnose AI-based weather prediction model: A case with ClimaX version 0.3.1 [0.0]
This study proposes using ensemble data assimilation for diagnosing AI-based weather prediction models. Experiments with the AI-based model ClimaX demonstrated that the ensemble data assimilation cycled stably for the AI-based weather prediction model.
arXiv Detail & Related papers (2024-07-25T05:22:08Z) - Weather Prediction with Diffusion Guided by Realistic Forecast Processes [49.07556359513563]
We introduce a novel method that applies diffusion models (DM) for weather forecasting.
Our method can achieve both direct and iterative forecasting with the same modeling framework.
The flexibility and controllability of our model empowers a more trustworthy DL system for the general weather community.
arXiv Detail & Related papers (2024-02-06T21:28:42Z) - FengWu-4DVar: Coupling the Data-driven Weather Forecasting Model with 4D Variational Assimilation [67.20588721130623]
We develop an AI-based cyclic weather forecasting system, FengWu-4DVar.
FengWu-4DVar can incorporate observational data into the data-driven weather forecasting model.
Experiments on the simulated observational dataset demonstrate that FengWu-4DVar is capable of generating reasonable analysis fields.
arXiv Detail & Related papers (2023-12-16T02:07:56Z) - Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance.
We tackle drought forecasting by introducing a systematic end-to-end approach.
Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z) - ClimaX: A foundation model for weather and climate [51.208269971019504]
ClimaX is a deep learning model for weather and climate science.
It can be pre-trained with a self-supervised learning objective on climate datasets.
It can be fine-tuned to address a breadth of climate and weather tasks.
arXiv Detail & Related papers (2023-01-24T23:19:01Z) - Long-term stability and generalization of observationally-constrained stochastic data-driven models for geophysical turbulence [0.19686770963118383]
Deep learning models can mitigate certain biases in current state-of-the-art weather models.
Data-driven models require large amounts of training data, which may not be available from reanalysis (observational data) products. Deterministic data-driven forecasting models suffer from issues with long-term stability and unphysical climate drift.
We propose a convolutional variational autoencoder-based data-driven model that is pre-trained on an imperfect climate model simulation.
arXiv Detail & Related papers (2022-05-09T23:52:37Z) - A generative adversarial network approach to (ensemble) weather prediction [91.3755431537592]
We use a conditional deep convolutional generative adversarial network to predict the geopotential height of the 500 hPa pressure level, the two-meter temperature and the total precipitation for the next 24 hours over Europe.
The proposed models are trained on 4 years of ERA5 reanalysis data (2015-2018) with the goal of predicting the associated meteorological fields in 2019.
arXiv Detail & Related papers (2020-06-13T20:53:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.