Making Tunable Parameters State-Dependent in Weather and Climate Models with Reinforcement Learning
- URL: http://arxiv.org/abs/2601.04268v1
- Date: Wed, 07 Jan 2026 11:19:16 GMT
- Title: Making Tunable Parameters State-Dependent in Weather and Climate Models with Reinforcement Learning
- Authors: Pritthijit Nath, Sebastian Schemm, Henry Moss, Peter Haynes, Emily Shuckburgh, Mark J. Webb
- Abstract summary: This study presents a framework that learns components of parametrisation schemes online. It evaluates the resulting RL-driven parameter updates across a hierarchy of idealised testbeds. The results show that RL can deliver skilful, state-dependent, and regime-aware parametrisations.
- Score: 0.5131152350448099
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Weather and climate models rely on parametrisations to represent unresolved sub-grid processes. Traditional schemes rely on fixed coefficients that are weakly constrained and tuned offline, contributing to persistent biases that limit their ability to adapt to the underlying physics. This study presents a framework that learns components of parametrisation schemes online as a function of the evolving model state using reinforcement learning (RL) and evaluates the resulting RL-driven parameter updates across a hierarchy of idealised testbeds spanning a simple climate bias correction (SCBC), a radiative-convective equilibrium (RCE), and a zonal mean energy balance model (EBM) with both single-agent and federated multi-agent settings. Across nine RL algorithms, Truncated Quantile Critics (TQC), Deep Deterministic Policy Gradient (DDPG), and Twin Delayed DDPG (TD3) achieved the highest skill and the most stable convergence across configurations, with performance assessed against a static baseline using area-weighted RMSE, temperature profile and pressure-level diagnostics. For the EBM, single-agent RL outperformed static parameter tuning with the strongest gains in tropical and mid-latitude bands, while federated RL on multi-agent setups enabled geographically specialised control and faster convergence, with a six-agent DDPG configuration using frequent aggregation yielding the lowest area-weighted RMSE across the tropics and mid-latitudes. The learnt corrections were also physically meaningful as agents modulated EBM radiative parameters to reduce meridional biases, adjusted RCE lapse rates to match vertical temperature errors, and stabilised SCBC heating increments to limit drift. Overall, the results show that RL can deliver skilful, state-dependent, and regime-aware parametrisations, offering a scalable pathway for online learning within numerical models.
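The abstract scores the EBM against a static baseline with an area-weighted RMSE. The sketch below illustrates one common form of that metric for zonal-mean data, assuming cosine-of-latitude weights; the function name, the six band centres, and the temperature values are hypothetical and are not taken from the paper.

```python
import numpy as np


def area_weighted_rmse(pred, target, lat_deg):
    """RMSE over latitude bands, weighting each band by cos(latitude).

    Assumption: cos(latitude) approximates the relative surface area of a
    band on a sphere. The paper may use a different weighting.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    weights = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    return float(np.sqrt(np.sum(weights * (pred - target) ** 2)))


# Hypothetical zonal-mean temperatures (K) on six latitude band centres.
lat = np.array([-75.0, -45.0, -15.0, 15.0, 45.0, 75.0])
obs = np.array([250.0, 275.0, 298.0, 298.0, 275.0, 250.0])

# The same 2 K error placed in the polar bands or in the tropical bands.
polar_err = obs + np.array([2.0, 0.0, 0.0, 0.0, 0.0, 2.0])
tropical_err = obs + np.array([0.0, 0.0, 2.0, 2.0, 0.0, 0.0])

print(area_weighted_rmse(polar_err, obs, lat))
print(area_weighted_rmse(tropical_err, obs, lat))
```

With this weighting, an error of a given size counts for more in the tropical bands than in the polar bands, because low-latitude bands cover more surface area.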
Related papers
- ODELoRA: Training Low-Rank Adaptation by Solving Ordinary Differential Equations [54.886931928255564]
Low-rank adaptation (LoRA) has emerged as a widely adopted parameter-efficient fine-tuning method in deep transfer learning. We propose a novel continuous-time optimization dynamic for LoRA factor matrices in the form of an ordinary differential equation (ODE). We show that ODELoRA achieves stable feature learning, a property that is crucial for training deep neural networks at different scales of problem dimensionality.
arXiv Detail & Related papers (2026-02-07T10:19:36Z)
- High-Rank Structured Modulation for Parameter-Efficient Fine-Tuning [57.85676271833619]
Low-rank Adaptation (LoRA) uses a low-rank update method to simulate full parameter fine-tuning. We present SMoA, a high-rank Structured MOdulation Adapter that uses fewer trainable parameters while maintaining a higher rank.
arXiv Detail & Related papers (2026-01-12T13:06:17Z)
- Consolidation or Adaptation? PRISM: Disentangling SFT and RL Data via Gradient Concentration [56.074760766965085]
PRISM achieves a dynamics-aware framework that arbitrates data based on its degree of cognitive conflict with the model's existing knowledge. Our findings suggest that disentangling data based on internal optimization regimes is crucial for scalable and robust agent alignment.
arXiv Detail & Related papers (2026-01-12T05:43:20Z)
- The Path Not Taken: RLVR Provably Learns Off the Principals [85.41043469428365]
We show that sparsity is a surface artifact of a model-conditioned optimization bias. We mechanistically explain these dynamics with a Three-Gate Theory. We provide a parameter-level characterization of RLVR's learning dynamics.
arXiv Detail & Related papers (2025-11-11T18:49:45Z)
- CO-PFL: Contribution-Oriented Personalized Federated Learning for Heterogeneous Networks [51.43780477302533]
Contribution-Oriented PFL (CO-PFL) is a novel algorithm that dynamically estimates each client's contribution for global aggregation. CO-PFL consistently surpasses state-of-the-art methods in personalization accuracy, robustness, scalability, and convergence stability.
arXiv Detail & Related papers (2025-10-23T05:10:06Z)
- ADARL: Adaptive Low-Rank Structures for Robust Policy Learning under Uncertainty [28.291179179647795]
We propose Adaptive Rank Representation (AdaRL), a bi-level optimization framework that improves robustness. At the lower level, AdaRL performs policy optimization under fixed-rank constraints with dynamics sampled from a Wasserstein ball around a centroid model. At the upper level, it adaptively adjusts the rank to balance the bias-variance trade-off, projecting policy parameters onto a low-rank manifold.
arXiv Detail & Related papers (2025-10-13T20:05:34Z)
- FedRAIN-Lite: Federated Reinforcement Algorithms for Improving Idealised Numerical Weather and Climate Models [0.5131152350448099]
Sub-grid parameterisations in climate models are traditionally static and tuned offline. FedRAIN-Lite is a framework that mirrors the spatial decomposition used in general circulation models. Deep Deterministic Policy Gradient consistently outperforms both static and single-agent baselines.
arXiv Detail & Related papers (2025-08-19T23:54:13Z)
- Aggregation of Published Non-Uniform Axial Power Data for Phase II of the OECD/NEA AI/ML Critical Heat Flux Benchmark [0.0]
Critical heat flux (CHF) marks the onset of boiling crisis in light-water reactors. This work compiles and digitizes a broad CHF dataset covering both uniform and non-uniform axial heating conditions.
arXiv Detail & Related papers (2025-06-18T16:01:44Z)
- PVBF: A Framework for Mitigating Parameter Variation Imbalance in Online Continual Learning [19.18078967631654]
Online continual learning (OCL) enables AI systems to adaptively learn from non-stationary data streams. This paper identifies parameter variation imbalance as a critical factor contributing to prediction bias in ER-based OCL.
arXiv Detail & Related papers (2025-02-25T02:56:42Z)
- RAIN: Reinforcement Algorithms for Improving Numerical Weather and Climate Models [0.0]
Current climate models rely on complex mathematical parameterisations to represent sub-grid scale processes. This study explores integrating reinforcement learning with idealised climate models to address key parameterisation challenges.
arXiv Detail & Related papers (2024-08-28T20:10:46Z)
- Towards Physically Consistent Deep Learning For Climate Model Parameterizations [46.07009109585047]
Parameterizations are a major source of systematic errors and large uncertainties in climate projections.
Deep learning (DL)-based parameterizations, trained on data from computationally expensive short, high-resolution simulations, have shown great promise for improving climate models.
We propose an efficient supervised learning framework for DL-based parameterizations that leads to physically consistent models.
arXiv Detail & Related papers (2024-06-06T10:02:49Z)
- Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training [58.20089993899729]
This paper proposes TempBalance, a straightforward yet effective layerwise learning rate method.
We show that TempBalance significantly outperforms ordinary SGD and carefully-tuned spectral norm regularization.
We also show that TempBalance outperforms a number of state-of-the-art metrics and schedulers.
arXiv Detail & Related papers (2023-12-01T05:38:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.