Adjoint-based online learning of two-layer quasi-geostrophic baroclinic turbulence
- URL: http://arxiv.org/abs/2411.14106v1
- Date: Thu, 21 Nov 2024 13:15:01 GMT
- Title: Adjoint-based online learning of two-layer quasi-geostrophic baroclinic turbulence
- Authors: Fei Er Yan, Hugo Frezat, Julien Le Sommer, Julian Mak, Karl Otness,
- Abstract summary: An increasingly popular approach is to leverage machine learning approaches for parameterizations, regressing for a map between the resolved state and missing feedbacks in a fluid system as a supervised learning task.
Here, we explore the `online' approach that involves the fluid dynamical model during the training stage for the learning of baroclinic turbulence and its parameterization.
Two online approaches are considered: a full adjoint-based online approach, related to traditional adjoint optimization approaches that require a `differentiable' dynamical model, and an approximately online approach that approximates the adjoint calculation and does not require a different
- Score: 1.0985060632689176
- Abstract: For reasons of computational constraint, most global ocean circulation models used for Earth System Modeling still rely on parameterizations of sub-grid processes, and limitations in these parameterizations affect the modeled ocean circulation and impact predictive skill. An increasingly popular approach is to leverage machine learning for parameterizations, regressing for a map between the resolved state and missing feedbacks in a fluid system as a supervised learning task. However, the learning is often performed in an `offline' fashion, without involving the underlying fluid dynamical model during the training stage. Here, we explore the `online' approach that involves the fluid dynamical model during the training stage for the learning of baroclinic turbulence and its parameterization, with reference to ocean eddy parameterization. Two online approaches are considered: a full adjoint-based online approach, related to traditional adjoint optimization approaches that require a `differentiable' dynamical model, and an approximately online approach that approximates the adjoint calculation and does not require a differentiable dynamical model. The online approaches are found to be generally more skillful and numerically stable than offline approaches. Other details relating to online training, such as window size, machine learning model setup, and design of the loss functions, are detailed to aid further exploration of the online training methodology for Earth System Modeling.
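To make the adjoint-based online idea concrete, here is a minimal sketch on a toy problem, not the paper's two-layer quasi-geostrophic model. It assumes a scalar model du/dt = -a*u + theta*u, where theta*u stands in for a learned parameterization, and a loss measured over a rollout window of the model itself. The gradient with respect to theta is obtained by a hand-written reverse (adjoint) sweep through the time steps. All names (`step`, `rollout`, `loss_and_grad`, `train`) and the constants `A` and `DT` are hypothetical and chosen only for illustration.

```python
A = 1.0    # damping rate of the toy resolved dynamics (assumed)
DT = 0.05  # forward-Euler time step (assumed)


def step(u, theta):
    """One forward-Euler step of du/dt = -A*u + theta*u (theta*u is the toy closure)."""
    return u + DT * (-A * u + theta * u)


def rollout(u0, theta, n_steps):
    """Integrate the model forward and return the states u_0 .. u_n."""
    traj = [u0]
    for _ in range(n_steps):
        traj.append(step(traj[-1], theta))
    return traj


def loss_and_grad(theta, u0, ref):
    """Mean squared error over the window and its exact gradient via a discrete adjoint."""
    n = len(ref) - 1
    traj = rollout(u0, theta, n)
    loss = sum((traj[k] - ref[k]) ** 2 for k in range(1, n + 1)) / n

    lam = 0.0   # adjoint variable: sensitivity of the loss to the state u_k
    grad = 0.0
    for k in range(n, 0, -1):
        lam += 2.0 * (traj[k] - ref[k]) / n   # direct dependence of the loss on u_k
        grad += lam * DT * traj[k - 1]        # partial of u_k w.r.t. theta at fixed u_{k-1}
        lam *= 1.0 + DT * (theta - A)         # pull sensitivity back to u_{k-1}
    return loss, grad


def train(theta, u0, ref, lr=1.0, n_iters=500):
    """Plain gradient descent on theta, with the model in the training loop."""
    for _ in range(n_iters):
        _, grad = loss_and_grad(theta, u0, ref)
        theta -= lr * grad
    return theta


if __name__ == "__main__":
    reference = rollout(1.0, 0.6, 20)   # "truth" generated with theta = 0.6
    print(train(0.0, 1.0, reference))
```

The window length here (20 steps) plays the role of the window size mentioned in the abstract. An offline variant would instead regress theta*u against the missing tendency diagnosed at individual snapshots, without ever stepping the model during training.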
Related papers
- A parametric framework for kernel-based dynamic mode decomposition using deep learning [0.0]
The proposed framework consists of two stages, offline and online.
The online stage leverages those LANDO models to generate new data at a desired time instant.
A dimensionality reduction technique is applied to high-dimensional dynamical systems to reduce the computational cost of training.
arXiv Detail & Related papers (2024-09-25T11:13:50Z)
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning [113.89327264634984]
Few-shot class-incremental learning (FSCIL) confronts the challenge of integrating new classes into a model with minimal training samples.
Traditional methods widely adopt static adaptation relying on a fixed parameter space to learn from data that arrive sequentially.
We propose a dual selective SSM projector that dynamically adjusts the projection parameters based on the intermediate features for dynamic adaptation.
arXiv Detail & Related papers (2024-07-08T17:09:39Z)
- Decomposing weather forecasting into advection and convection with neural networks [6.78786601630176]
We propose a simple yet effective machine learning model that learns the horizontal movement in the dynamical core and vertical movement in the physical parameterization separately.
Our model provides a new and efficient perspective to simulate the transition of variables in atmospheric models.
arXiv Detail & Related papers (2024-05-10T16:46:32Z)
- MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning [52.101643259906915]
We study the problem of offline pre-training and online fine-tuning for reinforcement learning from high-dimensional observations.
Existing model-based offline RL methods are not suitable for offline-to-online fine-tuning in high-dimensional domains.
We propose an on-policy model-based method that can efficiently reuse prior data through model-based value expansion and policy regularization.
arXiv Detail & Related papers (2024-01-06T21:04:31Z)
- Online Calibration of Deep Learning Sub-Models for Hybrid Numerical Modeling Systems [34.50407690251862]
We present an efficient and practical online learning approach for hybrid systems.
We demonstrate that the method, called EGA for Euler Gradient Approximation, converges to the exact gradients in the limit of infinitely small time steps.
Results show significant improvements over offline learning, highlighting the potential of end-to-end online learning for hybrid modeling.
arXiv Detail & Related papers (2023-11-17T17:36:26Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Gradient-Based Trajectory Optimization With Learned Dynamics [80.41791191022139]
We use machine learning techniques to learn a differentiable dynamics model of the system from data.
We show that a neural network can model highly nonlinear behaviors accurately for large time horizons.
In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot and Radio-controlled (RC) car.
arXiv Detail & Related papers (2022-04-09T22:07:34Z)
- Variational Beam Search for Learning with Distribution Shifts [26.345665980534374]
We propose a new Bayesian meta-algorithm that can both (i) make inferences about subtle distribution shifts based on minimal sequential observations and (ii) accordingly adapt a model in an online fashion.
Our proposed approach is model-agnostic, applicable to both supervised and unsupervised learning, and yields significant improvements over state-of-the-art Bayesian online learning approaches.
arXiv Detail & Related papers (2020-12-15T05:28:47Z)
- An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the group O(d).
This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.