Forecasting the Future with Yesterday's Climate: Temperature Bias in AI Weather and Climate Models
- URL: http://arxiv.org/abs/2509.22359v1
- Date: Fri, 26 Sep 2025 13:55:29 GMT
- Title: Forecasting the Future with Yesterday's Climate: Temperature Bias in AI Weather and Climate Models
- Authors: Jacob B. Landsberg, Elizabeth A. Barnes
- Abstract summary: We analyze boreal winter land temperature biases in AI weather and climate models. We find that all three models produce cold-biased mean temperatures, resembling climates from 15-20 years earlier than the period they are predicting. In some regions, like the Eastern U.S., the predictions resemble climates from as much as 20-30 years earlier.
- Score: 0.31486469212981216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI-based climate and weather models have rapidly gained popularity, providing faster forecasts with skill that can match or even surpass that of traditional dynamical models. Despite this success, these models face a key challenge: predicting future climates while being trained only with historical data. In this study, we investigate this issue by analyzing boreal winter land temperature biases in AI weather and climate models. We examine two weather models, FourCastNet V2 Small (FourCastNet) and Pangu Weather (Pangu), evaluating their predictions for 2020-2025 and Ai2 Climate Emulator version 2 (ACE2) for 1996-2010. These time periods lie outside of the respective models' training sets and are significantly more recent than the bulk of their training data, allowing us to assess how well the models generalize to new, i.e. more modern, conditions. We find that all three models produce cold-biased mean temperatures, resembling climates from 15-20 years earlier than the period they are predicting. In some regions, like the Eastern U.S., the predictions resemble climates from as much as 20-30 years earlier. Further analysis shows that FourCastNet's and Pangu's cold bias is strongest in the hottest predicted temperatures, indicating limited training exposure to modern extreme heat events. In contrast, ACE2's bias is more evenly distributed but largest in regions, seasons, and parts of the temperature distribution where climate change has been most pronounced. These findings underscore the challenge of training AI models exclusively on historical data and highlight the need to account for such biases when applying them to future climate prediction.
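The abstract expresses the cold bias as a number of years by which the predicted climate lags the target period. One simple way to make such a conversion concrete is to fit a linear warming trend to observed temperatures and ask which year on that trend matches the predicted mean. The sketch below does this on synthetic data. It is not the authors' method: the function names, the linear-trend assumption, and all numbers (a 0.03 °C/yr trend and a 0.5 °C cold bias) are hypothetical and chosen only for illustration.

```python
def fit_linear_trend(years, temps):
    """Ordinary least-squares fit of temps ~ slope * year + intercept."""
    n = len(years)
    mean_year = sum(years) / n
    mean_temp = sum(temps) / n
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum((y - mean_year) * (t - mean_temp) for y, t in zip(years, temps))
    slope = sxy / sxx
    intercept = mean_temp - slope * mean_year
    return slope, intercept


def equivalent_climate_year(predicted_mean, slope, intercept):
    """Year at which the fitted trend equals the predicted mean temperature."""
    return (predicted_mean - intercept) / slope


def years_behind(target_year, predicted_mean, slope, intercept):
    """How many years the predicted climate lags the year being predicted."""
    return target_year - equivalent_climate_year(predicted_mean, slope, intercept)


# Synthetic "observed" winter-mean temperature anomalies (illustrative only):
# a perfectly linear warming of 0.03 degC per year starting from 0 in 1980.
obs_years = list(range(1980, 2021))
obs_temps = [0.03 * (y - 1980) for y in obs_years]

slope, intercept = fit_linear_trend(obs_years, obs_temps)

# Hypothetical model prediction for 2022 that is 0.5 degC colder than the trend.
target_year = 2022
predicted_mean = 0.03 * (target_year - 1980) - 0.5

lag = years_behind(target_year, predicted_mean, slope, intercept)
print(f"Predicted climate lags the target year by about {lag:.1f} years")
```

With these made-up numbers the lag is the bias divided by the trend, 0.5 / 0.03, or roughly 16.7 years. A real analysis would need observed or reanalysis temperatures, a region and season definition, and a check that a linear trend is a reasonable description of the warming.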
Related papers
- Numerical models outperform AI weather forecasts of record-breaking extremes [0.18749305679160366]
We show that for record-breaking weather extremes, the numerical model High RESolution forecast still consistently outperforms state-of-the-art AI models. We demonstrate that forecast errors in AI models are consistently larger for record-breaking heat, cold, and wind than in HRES across nearly all lead times.
arXiv Detail & Related papers (2025-08-21T17:07:16Z) - Turning Up the Heat: Assessing 2-m Temperature Forecast Errors in AI Weather Prediction Models During Heat Waves [0.732482777758295]
Extreme heat is the deadliest weather-related hazard in the United States. Traditional numerical weather prediction models struggle with extreme heat for medium-range and subseasonal-to-seasonal timescales. It is largely unknown how well artificial intelligence-based weather prediction models forecast extremes.
arXiv Detail & Related papers (2025-04-29T22:02:32Z) - ClimateBench-M: A Multi-Modal Climate Data Benchmark with a Simple Generative Method [61.76389719956301]
We contribute a multi-modal climate benchmark, i.e., ClimateBench-M, which aligns time series climate data from ERA5, extreme weather events data from NOAA, and satellite image data from NASA. Under each data modality, we also propose a simple but strong generative method that could produce competitive performance in weather forecasting, thunderstorm alerts, and crop segmentation tasks.
arXiv Detail & Related papers (2025-04-10T02:22:23Z) - Data-driven Seasonal Climate Predictions via Variational Inference and Transformers [31.98107454758077]
We train generative models on climate model output for seasonal predictions. We analyse the method's performance in predicting interannual anomalies beyond the climate change-induced trend.
arXiv Detail & Related papers (2025-03-26T11:51:23Z) - FengWu-W2S: A deep learning model for seamless weather-to-subseasonal forecast of global atmosphere [53.22497376154084]
We propose FengWu-Weather to Subseasonal (FengWu-W2S), which builds on the FengWu global weather forecast model and incorporates an ocean-atmosphere-land coupling structure along with a diverse perturbation strategy.
Our hindcast results demonstrate that FengWu-W2S reliably predicts atmospheric conditions out to 3-6 weeks ahead, enhancing predictive capabilities for global surface air temperature, precipitation, geopotential height and intraseasonal signals such as the Madden-Julian Oscillation (MJO) and North Atlantic Oscillation (NAO).
Our ablation experiments on forecast error growth from daily to seasonal timescales reveal potential
arXiv Detail & Related papers (2024-11-15T13:44:37Z) - Robustness of AI-based weather forecasts in a changing climate [1.4779266690741741]
We show that current state-of-the-art machine learning models trained for weather forecasting in present-day climate produce skillful forecasts across different climate states.
Despite current limitations, our results suggest that data-driven machine learning models will provide powerful tools for climate science.
arXiv Detail & Related papers (2024-09-27T08:11:49Z) - Validating Deep Learning Weather Forecast Models on Recent High-Impact Extreme Events [0.1747623282473278]
We compare machine learning weather prediction models and ECMWF's high-resolution forecast system. We find that ML weather prediction models locally achieve similar accuracy to HRES on the record-shattering Pacific Northwest heatwave. We also highlight structural differences in how the errors of HRES and the ML models build up to that event.
arXiv Detail & Related papers (2024-04-26T18:18:25Z) - ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast [57.6987191099507]
We introduce Exloss, a novel loss function that performs asymmetric optimization and highlights extreme values to obtain accurate extreme weather forecast.
We also introduce ExBooster, which captures the uncertainty in prediction outcomes by employing multiple random samples.
Our solution can achieve state-of-the-art performance in extreme weather prediction, while maintaining the overall forecast accuracy comparable to the top medium-range forecast models.
arXiv Detail & Related papers (2024-02-02T10:34:13Z) - FengWu-GHR: Learning the Kilometer-scale Medium-range Global Weather
Forecasting [56.73502043159699]
This work presents FengWu-GHR, the first data-driven global weather forecasting model running at the 0.09$^{\circ}$ horizontal resolution.
It introduces a novel approach that opens the door for operating ML-based high-resolution forecasts by inheriting prior knowledge from a low-resolution model.
The hindcast of weather prediction in 2022 indicates that FengWu-GHR is superior to the IFS-HRES.
arXiv Detail & Related papers (2024-01-28T13:23:25Z) - Residual Corrective Diffusion Modeling for Km-scale Atmospheric Downscaling [58.456404022536425]
State of the art for physical hazard prediction from weather and climate requires expensive km-scale numerical simulations driven by coarser resolution global inputs.
Here, a generative diffusion architecture is explored for downscaling such global inputs to km-scale, as a cost-effective machine learning alternative.
The model is trained to predict 2km data from a regional weather model over Taiwan, conditioned on a 25km global reanalysis.
arXiv Detail & Related papers (2023-09-24T19:57:22Z) - GraphCast: Learning skillful medium-range global weather forecasting [107.40054095223779]
We introduce a machine learning-based method called "GraphCast", which can be trained directly from reanalysis data.
It predicts hundreds of weather variables, over 10 days at 0.25 degree resolution globally, in under one minute.
We show that GraphCast significantly outperforms the most accurate operational deterministic systems on 90% of 1380 verification targets.
arXiv Detail & Related papers (2022-12-24T18:15:39Z) - MLRM: A Multiple Linear Regression based Model for Average Temperature Prediction of A Day [3.6704226968275258]
We aim to predict the weather of an area using past meteorological data and features using the Multiple Linear Regression Model.
The model is successfully able to predict the average temperature of a day with an error of 2.8 degrees Celsius.
arXiv Detail & Related papers (2022-03-11T10:22:57Z) - A generative adversarial network approach to (ensemble) weather prediction [91.3755431537592]
We use a conditional deep convolutional generative adversarial network to predict the geopotential height of the 500 hPa pressure level, the two-meter temperature and the total precipitation for the next 24 hours over Europe.
The proposed models are trained on 4 years of ERA5 reanalysis data from 2015-2018 with the goal to predict the associated meteorological fields in 2019.
arXiv Detail & Related papers (2020-06-13T20:53:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.