Related papers: Retrofitting Earth System Models with Cadence-Limited Neural Operator Updates

URL: http://arxiv.org/abs/2512.03309v1
Date: Tue, 02 Dec 2025 23:44:49 GMT
Title: Retrofitting Earth System Models with Cadence-Limited Neural Operator Updates
Authors: Aniruddha Bora, Shixuan Zhang, Khemraj Shukla, Bryce Harrop, George Em. Karniadakis, L. Ruby Leung
Abstract summary: We introduce an operator-learning framework that maps instantaneous model states to bias-correction tendencies. We train on two years of E3SM simulations nudged toward ERA5 reanalysis, and the operators generalize across height levels and seasons. Our framework emphasizes long-term stability, portability, and cadence-limited updates, demonstrating the utility of expressive ML operators.
Score: 3.9578288463123
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Coarse resolution, imperfect parameterizations, and uncertain initial states and forcings limit Earth-system model (ESM) predictions. Traditional bias correction via data assimilation improves constrained simulations but offers limited benefit once models run freely. We introduce an operator-learning framework that maps instantaneous model states to bias-correction tendencies and applies them online during integration. Building on a U-Net backbone, we develop two operator architectures, Inception U-Net (IUNet) and a multi-scale network (M&M), that combine diverse upsampling and receptive fields to capture multiscale nonlinear features under Energy Exascale Earth System Model (E3SM) runtime constraints. Trained on two years of E3SM simulations nudged toward ERA5 reanalysis, the operators generalize across height levels and seasons. Both architectures outperform standard U-Net baselines in offline tests, indicating that functional richness rather than parameter count drives performance. In online hybrid E3SM runs, M&M delivers the most consistent bias reductions across variables and vertical levels. The ML-augmented configurations remain stable and computationally feasible in multi-year simulations, providing a practical pathway for scalable hybrid modeling. Our framework emphasizes long-term stability, portability, and cadence-limited updates, demonstrating the utility of expressive ML operators for learning structured, cross-scale relationships and retrofitting legacy ESMs.
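
The idea of applying a learned bias-correction tendency online, refreshed only at a limited cadence, can be illustrated with a toy sketch. This is not the paper's implementation: the scalar "model", the hand-written stand-in for a trained operator, the forward-Euler stepping, and every name below (`integrate_with_correction`, `cadence`, and so on) are assumptions made purely for illustration.

```python
from typing import Callable, List

State = List[float]


def integrate_with_correction(
    state: State,
    model_tendency: Callable[[State], State],
    correction_operator: Callable[[State], State],
    dt: float,
    n_steps: int,
    cadence: int,
) -> State:
    """Forward-Euler integration with an additive correction tendency.

    The correction is recomputed from the current state only every
    `cadence` steps and held fixed in between (hypothetical scheme).
    """
    correction = [0.0] * len(state)
    for step in range(n_steps):
        if step % cadence == 0:
            # Refresh the correction from the instantaneous state.
            correction = correction_operator(state)
        tendency = model_tendency(state)
        state = [x + dt * (f + c) for x, f, c in zip(state, tendency, correction)]
    return state


# Toy setup (assumed): the "true" dynamics are dx/dt = -x, while the
# biased model damps too slowly, dx/dt = -0.5 * x.
def biased_model(state: State) -> State:
    return [-0.5 * x for x in state]


# Stand-in for a trained operator: returns the missing tendency -0.5 * x.
def toy_operator(state: State) -> State:
    return [-0.5 * x for x in state]


def no_correction(state: State) -> State:
    return [0.0 for _ in state]


free = integrate_with_correction([2.0], biased_model, no_correction, 0.1, 200, 5)
corrected = integrate_with_correction([2.0], biased_model, toy_operator, 0.1, 200, 5)
print("free run:", free[0], "corrected run:", corrected[0])
```

In this toy case the corrected run decays toward the true steady state (zero) faster than the free run, even though the correction is only refreshed every five steps. A real hybrid ESM would replace `toy_operator` with a trained network acting on full model fields.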

Related papers

Agentic World Modeling for 6G: Near-Real-Time Generative State-Space Reasoning [70.56067503630486]
We argue that sixth-generation (6G) intelligence is not fluent token prediction but the calibrated capacity to imagine and choose. We show that WM-MS3M cuts mean absolute error (MAE) by 1.69% versus MS3M with 32% fewer parameters and similar latency, and achieves 35-80% lower root mean squared error (RMSE) than attention/hybrid baselines with 2.3-4.1x faster inference.
arXiv Detail & Related papers (2025-11-04T17:22:22Z)
URDF-Anything: Constructing Articulated Objects with 3D Multimodal Language Model [76.08429266631823]
We propose an end-to-end automatic reconstruction framework based on a 3D multimodal large language model (MLLM). URDF-Anything utilizes an autoregressive prediction framework based on point-cloud and text multimodal input to jointly optimize geometric segmentation and kinematic parameter prediction. Experiments on both simulated and real-world datasets demonstrate that our method significantly outperforms existing approaches.
arXiv Detail & Related papers (2025-11-02T13:45:51Z)
A Mixture of Experts Gating Network for Enhanced Surrogate Modeling in External Aerodynamics [0.28647133890966997]
A Mixture of Experts (MoE) model combines predictions from three heterogeneous, state-of-the-art surrogate models. The entire system is trained and validated on the DrivAerML dataset, a large-scale, public benchmark of high-fidelity CFD simulations for automotive aerodynamics.
arXiv Detail & Related papers (2025-08-28T22:34:10Z)
World Model-Based Learning for Long-Term Age of Information Minimization in Vehicular Networks [53.98633183204453]
In this paper, a novel world model-based learning framework is proposed to minimize packet-completeness-aware age of information (CAoI) in a vehicular network. A world model framework is proposed to jointly learn a dynamic model of the mmWave V2X environment and use it to imagine trajectories for learning how to perform link scheduling. In particular, the long-term policy is learned in differentiable imagined trajectories instead of environment interactions.
arXiv Detail & Related papers (2025-05-03T06:23:18Z)
Time Marching Neural Operator FE Coupling: AI Accelerated Physics Modeling [3.0635300721402228]
This work introduces a novel hybrid framework that integrates a physics-informed deep operator network with FEM through domain decomposition. To address the challenges of dynamic systems, we embed a time stepping scheme directly into the DeepONet, substantially reducing long-term error propagation. Our framework shows accelerated convergence (up to a 20% improvement in convergence rate compared to conventional FE coupling approaches) while preserving solution fidelity, with error margins consistently below 3%.
arXiv Detail & Related papers (2025-04-15T16:54:04Z)
DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [86.76714527437383]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks. We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge. Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
arXiv Detail & Related papers (2025-02-18T02:37:26Z)
Longhorn: State Space Models are Amortized Online Learners [51.10124201221601]
State-space models (SSMs) offer linear decoding efficiency while maintaining parallelism during training. In this work, we explore SSM design through the lens of online learning, conceptualizing SSMs as meta-modules for specific online learning problems. We introduce a novel deep SSM architecture, Longhorn, whose update resembles the closed-form solution for solving the online associative recall problem.
arXiv Detail & Related papers (2024-07-19T11:12:08Z)
Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning. As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers. We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
arXiv Detail & Related papers (2024-03-27T09:14:36Z)
SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget [20.33693233516486]
We introduce SwapMoE, a framework for efficient serving of MoE-based large language models with tunable memory budgets. Experiments have shown that SwapMoE can reduce the memory footprint while maintaining reasonable accuracy.
arXiv Detail & Related papers (2023-08-29T05:25:21Z)
Asynchronous Multi-Model Dynamic Federated Learning over Wireless Networks: Theory, Modeling, and Optimization [20.741776617129208]
Federated learning (FL) has emerged as a key technique for distributed machine learning (ML). We first formulate rectangular scheduling steps and functions to capture the impact of system parameters on learning performance. Our analysis sheds light on the joint impact of device training variables and asynchronous scheduling decisions.
arXiv Detail & Related papers (2023-05-22T21:39:38Z)
ES-dRNN: A Hybrid Exponential Smoothing and Dilated Recurrent Neural Network Model for Short-Term Load Forecasting [1.4502611532302039]
Short-term load forecasting (STLF) is challenging due to complex time series (TS). This paper proposes a novel hybrid hierarchical deep learning model that deals with multiple seasonality. It combines exponential smoothing (ES) and a recurrent neural network (RNN).
arXiv Detail & Related papers (2021-12-05T19:38:42Z)
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization [60.73540999409032]
We show that expressive autoregressive dynamics models generate different dimensions of the next state and reward sequentially conditioned on previous dimensions. We also show that autoregressive dynamics models are useful for offline policy optimization by serving as a way to enrich the replay buffer.
arXiv Detail & Related papers (2021-04-28T16:48:44Z)
A Hybrid Residual Dilated LSTM and Exponential Smoothing Model for Mid-Term Electric Load Forecasting [1.1602089225841632]
The model combines exponential smoothing (ETS), advanced Long Short-Term Memory (LSTM) and ensembling. A simulation study performed on the monthly electricity demand time series for 35 European countries confirmed the high performance of the proposed model.
arXiv Detail & Related papers (2020-03-29T10:53:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.