Challenges of learning multi-scale dynamics with AI weather models: Implications for stability and one solution
- URL: http://arxiv.org/abs/2304.07029v2
- Date: Sat, 07 Dec 2024 10:52:38 GMT
- Title: Challenges of learning multi-scale dynamics with AI weather models: Implications for stability and one solution
- Authors: Ashesh Chattopadhyay, Y. Qiang Sun, Pedram Hassanzadeh
- Abstract summary: Current AI-based weather models provide accurate forecasts only in the short term and become unstable or physically inconsistent when time-integrated beyond a few weeks or a few months. The cause of the instabilities is unknown, and the methods that are used to improve their stability horizons are ad-hoc and lack rigorous theory. We develop long-term physically-consistent data-driven models for the climate system and demonstrate accurate short-term forecasts.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Long-term stability and physical consistency are critical properties for AI-based weather models if they are going to be used for subseasonal-to-seasonal forecasts or beyond, e.g., climate change projection. However, current AI-based weather models can only provide short-term forecasts accurately since they become unstable or physically inconsistent when time-integrated beyond a few weeks or a few months. Either they exhibit numerical blow-up or hallucinate unrealistic dynamics of the atmospheric variables, akin to the current class of autoregressive large language models. The cause of the instabilities is unknown, and the methods that are used to improve their stability horizons are ad-hoc and lack rigorous theory. In this paper, we reveal that the universal causal mechanism for these instabilities in any turbulent flow is due to spectral bias, wherein any deep learning architecture is biased to learn only the large-scale dynamics and ignores the small scales completely. We further elucidate how turbulence physics and the absence of convergence in deep learning-based time-integrators amplify this bias, leading to unstable error propagation. Finally, using the quasi-geostrophic flow and European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis data as test cases, we bridge the gap between deep learning theory and numerical analysis to propose one mitigative solution to such unphysical behavior. We develop long-term physically-consistent data-driven models for the climate system and demonstrate accurate short-term forecasts, and hundreds of years of time-integration with accurate mean and variability.
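The spectral bias described in the abstract can be made concrete with a small diagnostic. The following is a minimal sketch, not the authors' method: it assumes a hypothetical 1-D periodic toy signal and uses a stand-in "prediction" that reproduces only the large-scale wave, then measures the relative error in Fourier amplitude at each wavenumber. The function name `spectral_relative_error` and all data are illustrative assumptions.

```python
import numpy as np

def spectral_relative_error(truth, pred):
    """Relative error in Fourier amplitude at each wavenumber of a 1-D periodic signal."""
    truth_hat = np.abs(np.fft.rfft(truth))
    pred_hat = np.abs(np.fft.rfft(pred))
    eps = 1e-12  # avoids division by zero at wavenumbers that carry no energy
    return np.abs(pred_hat - truth_hat) / (truth_hat + eps)

# Hypothetical toy data: a large-scale wave (k=2) plus a weak small-scale wave (k=40).
n = 256
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
truth = np.sin(2 * x) + 0.05 * np.sin(40 * x)

# Stand-in for a spectrally biased model: it reproduces only the large scale.
pred = np.sin(2 * x)

err = spectral_relative_error(truth, pred)
mse = np.mean((pred - truth) ** 2)
print(f"pointwise MSE:          {mse:.5f}")
print(f"relative error at k=2:  {err[2]:.3f}")
print(f"relative error at k=40: {err[40]:.3f}")
```

In this toy setup the pointwise mean squared error is small because the missing wave has low amplitude, while the relative error at the small-scale wavenumber is close to 100%. A per-wavenumber view of this kind is one way to expose an error that a grid-point loss barely registers.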
Related papers
- A Generative Framework for Probabilistic, Spatiotemporally Coherent Downscaling of Climate Simulation [23.504915709396204]
We present a novel generative framework that uses a score-based diffusion model trained on high-resolution reanalysis data to capture the statistical properties of local weather dynamics.
We demonstrate that the model generates spatially and temporally coherent weather dynamics that align with global climate output.
arXiv Detail & Related papers (2024-12-19T19:47:35Z)
- Deep End-to-End Survival Analysis with Temporal Consistency [49.77103348208835]
We present a novel Survival Analysis algorithm designed to efficiently handle large-scale longitudinal data.
A central idea in our method is temporal consistency, a hypothesis that past and future outcomes in the data evolve smoothly over time.
Our framework uniquely incorporates temporal consistency into large datasets by providing a stable training signal.
arXiv Detail & Related papers (2024-10-09T11:37:09Z)
- Mitigating Time Discretization Challenges with WeatherODE: A Sandwich Physics-Driven Neural ODE for Weather Forecasting [20.135470301151727]
We present WeatherODE, a novel one-stage, physics-driven ordinary differential equation (ODE) model designed to enhance weather forecasting accuracy.
By leveraging wave equation theory and integrating a time-dependent source model, WeatherODE effectively addresses the challenges associated with time-discretization error and dynamic atmospheric processes.
WeatherODE demonstrates superior performance in both global and regional weather forecasting tasks, outperforming recent state-of-the-art approaches.
arXiv Detail & Related papers (2024-10-09T05:41:24Z)
- A probabilistic framework for learning non-intrusive corrections to long-time climate simulations from short-time training data [12.566163525039558]
We present a strategy for training neural network models to non-intrusively correct under-resolved long-time simulations of chaotic systems.
We demonstrate its ability to accurately predict the anisotropic statistics over time horizons more than 30 times longer than the data seen in training.
arXiv Detail & Related papers (2024-08-02T18:34:30Z)
- Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling [55.13352174687475]
This paper proposes a physics-AI hybrid model (i.e., WeatherGFT) which generalizes weather forecasts to finer-grained temporal scales beyond the training dataset.
Specifically, we employ a carefully designed PDE kernel to simulate physical evolution on a small time scale.
We also introduce a lead time-aware training framework to promote the generalization of the model at different lead times.
arXiv Detail & Related papers (2024-05-22T16:21:02Z)
- ClimODE: Climate and Weather Forecasting with Physics-informed Neural ODEs [14.095897879222676]
We present ClimODE, a continuous-time process that implements a key principle of statistical mechanics.
ClimODE models precise weather evolution with value-conserving dynamics, learning global weather transport as a neural flow.
Our approach outperforms existing data-driven methods in global and regional forecasting with an order of magnitude smaller parameterization.
arXiv Detail & Related papers (2024-04-15T06:38:21Z)
- Weather Prediction with Diffusion Guided by Realistic Forecast Processes [49.07556359513563]
We introduce a novel method that applies diffusion models (DM) for weather forecasting.
Our method can achieve both direct and iterative forecasting with the same modeling framework.
The flexibility and controllability of our model empower a more trustworthy DL system for the general weather community.
arXiv Detail & Related papers (2024-02-06T21:28:42Z)
- ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast [57.6987191099507]
We introduce Exloss, a novel loss function that performs asymmetric optimization and highlights extreme values to obtain accurate extreme weather forecasts.
We also introduce ExBooster, which captures the uncertainty in prediction outcomes by employing multiple random samples.
Our solution can achieve state-of-the-art performance in extreme weather prediction, while maintaining the overall forecast accuracy comparable to the top medium-range forecast models.
arXiv Detail & Related papers (2024-02-02T10:34:13Z)
- Learning Robust Precipitation Forecaster by Temporal Frame Interpolation [65.5045412005064]
We develop a robust precipitation forecasting model that demonstrates resilience against spatial-temporal discrepancies.
Our approach has led to significant improvements in forecasting precision, culminating in our model securing 1st place in the transfer learning leaderboard of the Weather4cast'23 competition.
arXiv Detail & Related papers (2023-11-30T08:22:08Z)
- Residual Corrective Diffusion Modeling for Km-scale Atmospheric Downscaling [58.456404022536425]
State of the art for physical hazard prediction from weather and climate requires expensive km-scale numerical simulations driven by coarser resolution global inputs.
Here, a generative diffusion architecture is explored for downscaling such global inputs to km-scale, as a cost-effective machine learning alternative.
The model is trained to predict 2km data from a regional weather model over Taiwan, conditioned on a 25km global reanalysis.
arXiv Detail & Related papers (2023-09-24T19:57:22Z)
- Dynamical Tests of a Deep-Learning Weather Prediction Model [0.0]
Deep-learning weather prediction models have been shown to produce forecasts that rival those from physics-based models run at operational centers.
It is unclear whether these models have encoded atmospheric dynamics or are simply performing pattern matching that produces the smallest forecast error.
Here we subject one such model, Pangu-weather, to a set of four classical dynamical experiments that do not resemble the model training data.
We conclude that the model encodes realistic physics in all experiments, and suggest it can be used as a tool for rapidly testing ideas before using expensive physics-based models.
arXiv Detail & Related papers (2023-09-19T18:26:41Z)
- Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance.
We tackle drought data with a systematic end-to-end approach.
Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z)
- Benchmarking Autoregressive Conditional Diffusion Models for Turbulent Flow Simulation [29.806100463356906]
We analyze if fully data-driven fluid solvers that utilize an autoregressive rollout based on conditional diffusion models are a viable option.
We investigate accuracy, posterior sampling, spectral behavior, and temporal stability, while requiring that methods generalize to flow parameters beyond the training regime.
We find that even simple diffusion-based approaches can outperform multiple established flow prediction methods in terms of accuracy and temporal stability, while being on par with state-of-the-art stabilization techniques like unrolling at training time.
arXiv Detail & Related papers (2023-09-04T18:01:42Z)
- Discovering Predictable Latent Factors for Time Series Forecasting [39.08011991308137]
We develop a novel framework for inferring the intrinsic latent factors implied by the observable time series.
We introduce three characteristics, i.e., predictability, sufficiency, and identifiability, and model these characteristics via the powerful deep latent dynamics models.
Empirical results on multiple real datasets show the efficiency of our method for different kinds of time series forecasting.
arXiv Detail & Related papers (2023-03-18T14:37:37Z)
- Long-term stability and generalization of observationally-constrained stochastic data-driven models for geophysical turbulence [0.19686770963118383]
Deep learning models can mitigate certain biases in current state-of-the-art weather models.
Data-driven models require a lot of training data, which may not be available from reanalysis (observational data) products.
Deterministic data-driven forecasting models suffer from issues with long-term stability and unphysical climate drift.
We propose a convolutional variational autoencoder-based data-driven model that is pre-trained on an imperfect climate model simulation.
arXiv Detail & Related papers (2022-05-09T23:52:37Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
- Physics-aware, probabilistic model order reduction with guaranteed stability [0.0]
We propose a generative framework for learning an effective, lower-dimensional, coarse-grained dynamical model.
We demonstrate its efficacy and accuracy in multiscale physical systems of particle dynamics.
arXiv Detail & Related papers (2021-01-14T19:16:51Z)
- From Goals, Waypoints & Paths To Long Term Human Trajectory Forecasting [54.273455592965355]
Uncertainty in future trajectories stems from two sources: (a) sources known to the agent but unknown to the model, such as long term goals, and (b) sources that are unknown to both the agent & the model, such as intent of other agents & irreducible randomness in decisions.
We model the epistemic uncertainty through multimodality in long term goals and the aleatoric uncertainty through multimodality in waypoints & paths.
To exemplify this dichotomy, we also propose a novel long term trajectory forecasting setting, with prediction horizons up to a minute, an order of magnitude longer than prior works.
arXiv Detail & Related papers (2020-12-02T21:01:29Z)
- Modeling Atmospheric Data and Identifying Dynamics: Temporal Data-Driven Modeling of Air Pollutants [2.578242050187029]
We present an empirical approach using data-driven techniques to study air quality in Madrid.
We find parsimonious systems of ordinary differential equations that model the concentration of pollutants and their changes over time.
Our results show that Akaike's Information Criterion can work well in conjunction with best subset regression to find an equilibrium between sparsity and goodness of fit.
arXiv Detail & Related papers (2020-10-13T16:46:07Z)
- Stochastically forced ensemble dynamic mode decomposition for forecasting and analysis of near-periodic systems [65.44033635330604]
We introduce a novel load forecasting method in which observed dynamics are modeled as a forced linear system.
We show that its use of intrinsic linear dynamics offers a number of desirable properties in terms of interpretability and parsimony.
Results are presented for a test case using load data from an electrical grid.
arXiv Detail & Related papers (2020-10-08T20:25:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.