Benchmarking atmospheric circulation variability in an AI emulator, ACE2, and a hybrid model, NeuralGCM
- URL: http://arxiv.org/abs/2510.04466v1
- Date: Mon, 06 Oct 2025 03:42:18 GMT
- Title: Benchmarking atmospheric circulation variability in an AI emulator, ACE2, and a hybrid model, NeuralGCM
- Authors: Ian Baxter, Hamid Pahlavan, Pedram Hassanzadeh, Katharine Rucker, Tiffany Shaw,
- Abstract summary: Physics-based atmosphere-land models with prescribed sea surface temperature have notable successes but also biases in their ability to represent atmospheric variability compared to observations. Recently, AI emulators and hybrid models have emerged with the potential to overcome these biases, but still require systematic evaluation against metrics grounded in fundamental atmospheric dynamics.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Physics-based atmosphere-land models with prescribed sea surface temperature have notable successes but also biases in their ability to represent atmospheric variability compared to observations. Recently, AI emulators and hybrid models have emerged with the potential to overcome these biases, but still require systematic evaluation against metrics grounded in fundamental atmospheric dynamics. Here, we evaluate the representation of four atmospheric variability benchmarking metrics in a fully data-driven AI emulator (ACE2-ERA5) and hybrid model (NeuralGCM). The hybrid model and emulator can capture the spectra of large-scale tropical waves and extratropical eddy-mean flow interactions, including critical levels. However, both struggle to capture the timescales associated with quasi-biennial oscillation (QBO, $\sim 28$ months) and Southern annular mode propagation ($\sim 150$ days). These dynamical metrics serve as an initial benchmarking tool to inform AI model development and understand their limitations, which may be essential for out-of-distribution applications (e.g., extrapolating to unseen climates).
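The timescales quoted in the abstract, such as the roughly 28-month QBO period, are the kind of quantity that can be read off a power spectrum. The following is a minimal illustrative sketch, not the authors' benchmarking code: it estimates the dominant period of a synthetic monthly index with a plain FFT periodogram. The function name `dominant_period`, the synthetic sinusoid, and the noise level are all assumptions made for illustration.

```python
import numpy as np

def dominant_period(series, dt=1.0):
    """Return the period (in units of dt) of the strongest spectral peak."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                        # remove the mean so the zero-frequency bin does not dominate
    power = np.abs(np.fft.rfft(x)) ** 2     # raw periodogram
    freqs = np.fft.rfftfreq(x.size, d=dt)   # frequencies in cycles per unit of dt
    k = 1 + np.argmax(power[1:])            # skip the zero-frequency bin
    return 1.0 / freqs[k]

# Synthetic stand-in for a monthly zonal-wind index (hypothetical data):
# a 28-month sinusoid plus white noise, spanning 20 full cycles.
months = np.arange(28 * 20)
rng = np.random.default_rng(0)
index = np.sin(2 * np.pi * months / 28.0) + 0.3 * rng.standard_normal(months.size)

period = dominant_period(index)
print(f"dominant period: {period:.1f} months")
```

Applied to an index from a model run and to the same index from a reanalysis, a comparison of the two estimated periods gives a rough first check on whether the model reproduces the timescale. A real evaluation would need a longer record and a more careful spectral estimate than this raw periodogram.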
Related papers
- Hierarchical Testing of a Hybrid Machine Learning-Physics Global Atmosphere Model [9.985969370583426]
Machine learning (ML)-based models have demonstrated high skill and computational efficiency, often outperforming conventional physics-based models in weather and subseasonal predictions. Here, we design three sets of experiments targeting synoptic-scale phenomena, interannual variability, and out-of-distribution uniform-warming forcings. We evaluate the Neural General Circulation Model (NeuralGCM), a hybrid model integrating a dynamical core with ML-based components, against observations and physics-based Earth system models (ESMs).
arXiv Detail & Related papers (2026-02-11T19:34:50Z)
- Scalable Spatio-Temporal SE(3) Diffusion for Long-Horizon Protein Dynamics [51.85385061275941]
Molecular dynamics (MD) simulations remain the gold standard for studying protein dynamics. Recent generative models have shown promise in accelerating simulations, yet they struggle with long-horizon generation. We present STAR-MD, a scalable diffusion model that generates physically plausible protein trajectories over micro-scale timescales.
arXiv Detail & Related papers (2026-02-02T14:13:28Z)
- AgroFlux: A Spatial-Temporal Benchmark for Carbon and Nitrogen Flux Prediction in Agricultural Ecosystems [32.91715282741263]
We introduce a first-of-its-kind spatial-temporal agroecosystem GHG benchmark dataset. We evaluate the performance of various sequential deep learning models on carbon and nitrogen flux prediction. Our benchmark dataset and evaluation framework contribute to the development of more accurate and scalable AI-driven agroecosystem models.
arXiv Detail & Related papers (2026-02-02T04:04:07Z)
- Score-based generative emulation of impact-relevant Earth system model outputs [2.2940141855172036]
Policy targets evolve faster than the Coupled Model Intercomparison Project cycles. We show that deep generative models have the potential to model jointly the distribution of variables relevant for impacts. We evaluate performance across three distinct ESMs in both pre-industrial and forced regimes.
arXiv Detail & Related papers (2025-10-05T20:54:19Z)
- Atmospheric Transport Modeling of CO$_2$ with Neural Networks [46.26819563674888]
Accurately describing the distribution of CO$_2$ in the atmosphere with atmospheric tracer transport models is essential for greenhouse gas monitoring and verification support systems.
Large deep neural networks are poised to revolutionize weather prediction, which requires 3D modeling of the atmosphere.
In this study we explore four different deep neural networks that have proven state-of-the-art in weather prediction to assess their usefulness for atmospheric tracer transport modeling.
arXiv Detail & Related papers (2024-08-20T17:33:20Z)
- The impact of internal variability on benchmarking deep learning climate emulators [2.3342885570554652]
Full-complexity Earth system models (ESMs) are computationally very expensive, limiting their use in exploring the climate outcomes of multiple emission pathways. More efficient emulators that approximate ESMs can directly map emissions onto climate outcomes, and benchmarks are being used to evaluate their accuracy on standardized tasks and datasets. We investigate a popular benchmark in data-driven climate emulation, ClimateBench, on which deep learning-based emulators are currently achieving the best performance.
arXiv Detail & Related papers (2024-08-09T18:17:17Z)
- A conditional latent autoregressive recurrent model for generation and forecasting of beam dynamics in particle accelerators [46.348283638884425]
We propose a two-step unsupervised deep learning framework named the Conditional Latent Autoregressive Recurrent Model (CLARM) for learning the dynamics of charged particles in accelerators.
The CLARM can generate projections at various accelerator sampling modules by capturing and decoding the latent space representation.
The results demonstrate that the generative and forecasting ability of the proposed approach is promising when tested against a variety of evaluation metrics.
arXiv Detail & Related papers (2024-03-19T22:05:17Z)
- FaIRGP: A Bayesian Energy Balance Model for Surface Temperatures Emulation [13.745581787463962]
We introduce FaIRGP, a data-driven emulator that satisfies the physical temperature response equations of an energy balance model.
We show how FaIRGP can be used to obtain estimates of top-of-atmosphere radiative forcing.
We hope that this work will contribute to widening the adoption of data-driven methods in climate emulation.
arXiv Detail & Related papers (2023-07-14T08:43:36Z)
- ClimSim-Online: A Large Multi-scale Dataset and Framework for Hybrid ML-physics Climate Emulation [45.201929285600606]
We present ClimSim-Online, which includes an end-to-end workflow for developing hybrid ML-physics simulators.
The dataset is global and spans ten years at a high sampling frequency.
We provide a cross-platform, containerized pipeline to integrate ML models into operational climate simulators.
arXiv Detail & Related papers (2023-06-14T21:26:31Z)
- Towards Learned Emulation of Interannual Water Isotopologue Variations in General Circulation Models [2.161227459325287]
We investigate the possibility of replacing the explicit physics-based simulation of oxygen isotopic composition in precipitation using machine learning methods.
We implement convolutional neural networks (CNNs) based on the successful UNet architecture and test whether a spherical network architecture outperforms the naive approach of treating Earth's latitude-longitude grid as a flat image.
arXiv Detail & Related papers (2023-01-31T07:54:52Z)
- ClimaX: A foundation model for weather and climate [51.208269971019504]
ClimaX is a deep learning model for weather and climate science.
It can be pre-trained with a self-supervised learning objective on climate datasets.
It can be fine-tuned to address a breadth of climate and weather tasks.
arXiv Detail & Related papers (2023-01-24T23:19:01Z)
- Learning Large-scale Subsurface Simulations with a Hybrid Graph Network Simulator [57.57321628587564]
We introduce Hybrid Graph Network Simulator (HGNS) for learning reservoir simulations of 3D subsurface fluid flows.
HGNS consists of a subsurface graph neural network (SGNN) to model the evolution of fluid flows, and a 3D-U-Net to model the evolution of pressure.
Using an industry-standard subsurface flow dataset (SPE-10) with 1.1 million cells, we demonstrate that HGNS is able to reduce the inference time up to 18 times compared to standard subsurface simulators.
arXiv Detail & Related papers (2022-06-15T17:29:57Z)
- DeepClimGAN: A High-Resolution Climate Data Generator [60.59639064716545]
Earth system models (ESMs) are often used to generate future projections of climate change scenarios.
As a compromise, emulators are substantially less expensive but may not have all of the complexity of an ESM.
Here we demonstrate the use of a conditional generative adversarial network (GAN) to act as an ESM emulator.
arXiv Detail & Related papers (2020-11-23T20:13:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.