AtmosMJ: Revisiting Gating Mechanism for AI Weather Forecasting Beyond the Year Scale
- URL: http://arxiv.org/abs/2506.09733v2
- Date: Wed, 06 Aug 2025 03:40:48 GMT
- Title: AtmosMJ: Revisiting Gating Mechanism for AI Weather Forecasting Beyond the Year Scale
- Authors: Minjong Cheon
- Abstract summary: We introduce a deep convolutional network that operates directly on ERA5 data without any spherical remapping. Our results demonstrate that AtmosMJ produces stable and physically plausible forecasts for about 500 days.
- Score: 4.8951183832371
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advent of Large Weather Models (LWMs) has marked a turning point in data-driven forecasting, with many models now outperforming traditional numerical systems in the medium range. However, achieving stable, long-range autoregressive forecasts beyond a few weeks remains a significant challenge. Prevailing state-of-the-art models that achieve year-long stability, such as SFNO and DLWP-HPX, have relied on transforming input data onto non-standard spatial domains like spherical harmonics or HEALPix meshes. This has led to the prevailing assumption that such representations are necessary to enforce physical consistency and long-term stability. This paper challenges that assumption by investigating whether comparable long-range performance can be achieved on the standard latitude-longitude grid. We introduce AtmosMJ, a deep convolutional network that operates directly on ERA5 data without any spherical remapping. The model's stability is enabled by a novel Gated Residual Fusion (GRF) mechanism, which adaptively moderates feature updates to prevent error accumulation over long recursive simulations. Our results demonstrate that AtmosMJ produces stable and physically plausible forecasts for about 500 days. In quantitative evaluations, it achieves competitive 10-day forecast accuracy against models like Pangu-Weather and GraphCast, all while requiring a remarkably low training budget of 5.7 days on a V100 GPU. Our findings suggest that efficient architectural design, rather than non-standard data representation, can be the key to unlocking stable and computationally efficient long-range weather prediction.
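The abstract does not give the exact form of the Gated Residual Fusion mechanism, so the following is only a minimal sketch of the general idea it describes: a gate with values in (0, 1) scales each proposed feature update before it is added to the state, and the model is applied recursively to its own output. The sigmoid gate, the function names (`gated_residual_update`, `rollout`, `toy_step`), and the toy update rule are all assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def gated_residual_update(x, update, gate_logits):
    """Add a proposed update to the state x, scaled elementwise by a gate.

    Assumed generic form: x + sigmoid(gate_logits) * update.
    A gate near 0 leaves x almost unchanged; a gate near 1 applies
    the full update.
    """
    gate = sigmoid(gate_logits)
    return x + gate * update


def rollout(x0, step_fn, n_steps):
    """Autoregressive rollout: each prediction becomes the next input."""
    states = [x0]
    for _ in range(n_steps):
        states.append(step_fn(states[-1]))
    return np.stack(states)


# Toy field on a small latitude-longitude grid (hypothetical 4 x 8 grid).
rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 8))


def toy_step(x):
    """Hypothetical stand-in for one forward pass of a learned network."""
    update = -0.1 * x                # proposed feature update
    gate_logits = np.zeros_like(x)   # sigmoid(0) = 0.5 everywhere
    return gated_residual_update(x, update, gate_logits)


trajectory = rollout(x0, toy_step, n_steps=10)
print(trajectory.shape)
```

In this toy setting each step multiplies the field by 0.95, so the rollout stays bounded. In a trained model the gate logits and the update would both be produced by learned layers.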
Related papers
- Exploring Design Choices for Autoregressive Deep Learning Climate Models [2.401696775092447]
This study quantitatively compares the long-term stability of three prominent DL-MWP architectures trained on ERA5 reanalysis data at 5.625° resolution. We identify configurations that enable stable 10-year rollouts while preserving the statistical properties of the reference dataset.
arXiv Detail & Related papers (2025-05-05T09:37:58Z) - Appa: Bending Weather Dynamics with Latent Diffusion Models for Global Data Assimilation [4.430758443755128]
Appa is a score-based data assimilation model producing global atmospheric trajectories at 0.25-degree resolution and 1-hour intervals. Our results establish latent score-based data assimilation as a promising foundation for future global atmospheric modeling systems.
arXiv Detail & Related papers (2025-04-25T22:14:29Z) - How far are today's time-series models from real-world weather forecasting applications? [22.68937280154092]
WEATHER-5K is a comprehensive collection of observational weather data that better reflects real-world scenarios.
It enables a better training of models and a more accurate assessment of the real-world forecasting capabilities of TSF models.
We provide researchers with a clear assessment of the gap between academic TSF models and real-world weather forecasting applications.
arXiv Detail & Related papers (2024-06-20T15:18:52Z) - Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling [55.13352174687475]
This paper proposes a physics-AI hybrid model (i.e., WeatherGFT) which generalizes weather forecasts to finer-grained temporal scales beyond the training dataset. Specifically, we employ a carefully designed PDE kernel to simulate physical evolution on a small time scale. We also introduce a lead time-aware training framework to promote the generalization of the model at different lead times.
arXiv Detail & Related papers (2024-05-22T16:21:02Z) - CaFA: Global Weather Forecasting with Factorized Attention on Sphere [7.687215328455751]
We propose a factorized-attention-based model tailored for spherical geometries to mitigate this issue.
The deterministic forecasting accuracy of the proposed model on $1.5^\circ$ and 0-7 days' lead time is on par with state-of-the-art purely data-driven machine learning weather prediction models.
arXiv Detail & Related papers (2024-05-12T23:18:14Z) - FuXi-ENS: A machine learning model for medium-range ensemble weather forecasting [16.562512279873577]
We introduce FuXi-ENS, an advanced ML model designed to deliver 6-hourly global ensemble weather forecasts up to 15 days.
FuXi-ENS runs at a significantly increased spatial resolution of 0.25°, incorporating 5 atmospheric variables at 13 pressure levels, along with 13 surface variables.
Results demonstrate that FuXi-ENS outperforms ensemble forecasts from the ECMWF, a world leading NWP model, in the CRPS of 98.1% of 360 variable and forecast lead time combinations.
arXiv Detail & Related papers (2024-05-09T17:15:09Z) - Weather Prediction with Diffusion Guided by Realistic Forecast Processes [49.07556359513563]
We introduce a novel method that applies diffusion models (DM) for weather forecasting.
Our method can achieve both direct and iterative forecasting with the same modeling framework.
The flexibility and controllability of our model empowers a more trustworthy DL system for the general weather community.
arXiv Detail & Related papers (2024-02-06T21:28:42Z) - FengWu-GHR: Learning the Kilometer-scale Medium-range Global Weather Forecasting [56.73502043159699]
This work presents FengWu-GHR, the first data-driven global weather forecasting model running at the 0.09$^\circ$ horizontal resolution.
It introduces a novel approach that opens the door for operating ML-based high-resolution forecasts by inheriting prior knowledge from a low-resolution model.
The hindcast of weather prediction in 2022 indicates that FengWu-GHR is superior to the IFS-HRES.
arXiv Detail & Related papers (2024-01-28T13:23:25Z) - Learning Robust Precipitation Forecaster by Temporal Frame Interpolation [65.5045412005064]
We develop a robust precipitation forecasting model that demonstrates resilience against spatial-temporal discrepancies.
Our approach has led to significant improvements in forecasting precision, culminating in our model securing 1st place in the transfer learning leaderboard of the Weather4cast'23 competition.
arXiv Detail & Related papers (2023-11-30T08:22:08Z) - Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance.
We tackle drought data by introducing a systematic end-to-end approach.
Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z) - Advancing Parsimonious Deep Learning Weather Prediction using the HEALPix Mesh [3.2785715577154595]
We present a parsimonious deep learning weather prediction model to forecast seven atmospheric variables with 3-h time resolution for up to one-year lead times on a 110-km global mesh.
In comparison to state-of-the-art (SOTA) machine learning (ML) weather forecast models, such as Pangu-Weather and GraphCast, our DLWP-HPX model uses coarser resolution and far fewer prognostic variables.
arXiv Detail & Related papers (2023-09-11T16:25:48Z) - Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z) - Improving data-driven global weather prediction using deep convolutional neural networks on a cubed sphere [7.918783985810551]
We present a significantly improved data-driven global weather forecasting framework using a deep convolutional neural network (CNN).
New developments in this framework include an offline volume-conservative mapping to a cubed-sphere grid.
Our model is able to learn to forecast complex surface temperature patterns from few input atmospheric state variables.
arXiv Detail & Related papers (2020-03-15T19:57:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.