Jacobian-Enforced Neural Networks (JENN) for Improved Data Assimilation Consistency in Dynamical Models
- URL: http://arxiv.org/abs/2412.01013v1
- Date: Mon, 02 Dec 2024 00:12:51 GMT
- Title: Jacobian-Enforced Neural Networks (JENN) for Improved Data Assimilation Consistency in Dynamical Models
- Authors: Xiaoxu Tian
- Abstract summary: Machine learning-based weather models have shown great promise in producing accurate forecasts but have struggled when applied to data assimilation tasks.
This study introduces the Jacobian-Enforced Neural Network (JENN) framework, designed to enhance data assimilation (DA) consistency in neural network (NN)-emulated dynamical systems.
- Abstract: Machine learning-based weather models have shown great promise in producing accurate forecasts but have struggled when applied to data assimilation (DA) tasks, unlike traditional numerical weather prediction (NWP) models. This study introduces the Jacobian-Enforced Neural Network (JENN) framework, designed to enhance DA consistency in neural network (NN)-emulated dynamical systems. Using the Lorenz 96 model as an example, the study demonstrates the improved applicability of NNs in DA through explicit enforcement of Jacobian relationships. The NN architecture includes an input layer of 40 neurons, two hidden layers with 256 units each employing hyperbolic tangent activation functions, and an output layer of 40 neurons without activation. The JENN framework employs a two-step training process: an initial phase using standard prediction-label pairs to establish baseline forecast capability, followed by a secondary phase incorporating a customized loss function to enforce accurate Jacobian relationships. This loss function combines the root mean square error (RMSE) between predicted and true state values with additional RMSE terms for the tangent linear (TL) and adjoint (AD) emulation results, weighted to balance forecast accuracy against Jacobian sensitivity. To ensure consistency, the secondary training phase uses additional pairs of TL/AD inputs and labels calculated from the physical models. Notably, this approach requires neither training from scratch nor structural modifications to the NN, making it readily applicable to pretrained models such as GraphCast, NeuralGCM, Pangu, or FuXi and facilitating their adaptation for DA tasks with minimal reconfiguration. Experimental results demonstrate that the JENN framework preserves nonlinear forecast performance while significantly reducing noise in the TL and AD components, as well as in the overall Jacobian matrix.
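The phase-2 objective is straightforward to express in a differentiable framework. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the weights `w_tl`/`w_ad`, and the use of `jax.jvp`/`jax.vjp` to emulate the NN's TL and AD operators are all assumptions, with the TL/AD labels taken from the physical Lorenz 96 tangent linear and adjoint models as the abstract describes.

```python
import jax
import jax.numpy as jnp

def init_params(key, sizes=(40, 256, 256, 40)):
    """The 40-256-256-40 MLP described in the abstract."""
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def forward(params, x):
    """Two tanh hidden layers, linear output layer."""
    for w, b in params[:-1]:
        x = jnp.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def rmse(a, b):
    return jnp.sqrt(jnp.mean((a - b) ** 2))

def jenn_loss(params, x, y_true, dx, tl_label, dy, ad_label,
              w_tl=1.0, w_ad=1.0):
    """Phase-2 loss: forecast RMSE plus TL and AD consistency terms.
    tl_label = M @ dx and ad_label = M^T @ dy are assumed to come from
    the physical model's tangent linear and adjoint codes."""
    f = lambda s: forward(params, s)
    y_pred, tl_pred = jax.jvp(f, (x,), (dx,))   # NN tangent linear: J dx
    _, vjp_fn = jax.vjp(f, x)
    (ad_pred,) = vjp_fn(dy)                     # NN adjoint: J^T dy
    return (rmse(y_pred, y_true)
            + w_tl * rmse(tl_pred, tl_label)
            + w_ad * rmse(ad_pred, ad_label))
```

Phase 1 would minimize only the first RMSE term; phase 2 then continues from the phase-1 weights with the full `jenn_loss` (e.g., via `jax.grad(jenn_loss)`), which is what allows a pretrained emulator to be adapted without structural changes.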
Related papers
- Neural Conformal Control for Time Series Forecasting [54.96087475179419]
We introduce a neural network conformal prediction method for time series that enhances adaptivity in non-stationary environments.
Our approach acts as a neural controller designed to achieve desired target coverage, leveraging auxiliary multi-view data with neural network encoders.
We empirically demonstrate significant improvements in coverage and probabilistic accuracy, and find that our method is the only one that combines good calibration with consistency in prediction intervals.
arXiv Detail & Related papers (2024-12-24T03:56:25Z)
- Positional Encoder Graph Quantile Neural Networks for Geographic Data [4.277516034244117]
We introduce the Positional Encoder Graph Quantile Neural Network (PE-GQNN), a novel method that integrates PE-GNNs, Quantile Neural Networks, and recalibration techniques in a fully nonparametric framework.
Experiments on benchmark datasets demonstrate that PE-GQNN significantly outperforms existing state-of-the-art methods in both predictive accuracy and uncertainty quantification.
arXiv Detail & Related papers (2024-09-27T16:02:12Z)
- Empowering Bayesian Neural Networks with Functional Priors through Anchored Ensembling for Mechanics Surrogate Modeling Applications [0.0]
We present a novel BNN training scheme based on anchored ensembling that can integrate a priori information available in the function space.
The anchoring scheme makes use of low-rank correlations between NN parameters, learned by pre-training the NN on realizations of the functional prior.
We also perform a study demonstrating that correlations between NN weights, which are often neglected in existing BNN implementations, are critical to appropriately transfer knowledge between the function-space and parameter-space priors.
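For orientation, the basic anchored-ensembling objective, in which each ensemble member is pulled toward its own draw from the parameter prior, can be sketched as follows. This is the standard diagonal-prior form only, under assumed names (`anchored_loss`, `predict`, `prior_var`, `noise_var`); the paper's contribution is precisely to replace the diagonal assumption with low-rank parameter correlations learned from functional-prior realizations.

```python
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def anchored_loss(params, anchor, x, y, predict,
                  prior_var=1.0, noise_var=0.1):
    """MSE data term plus a penalty tying the weights to `anchor`,
    a fixed sample from a diagonal Gaussian prior. Training K members,
    each with its own anchor, yields an approximate posterior ensemble."""
    theta, _ = ravel_pytree(params)
    theta0, _ = ravel_pytree(anchor)
    data_term = jnp.mean((predict(params, x) - y) ** 2)
    reg = noise_var / (prior_var * x.shape[0]) * jnp.sum((theta - theta0) ** 2)
    return data_term + reg
```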
arXiv Detail & Related papers (2024-09-08T22:27:50Z)
- Bayesian Entropy Neural Networks for Physics-Aware Prediction [14.705526856205454]
We introduce BENN, a framework designed to impose constraints on Bayesian Neural Network (BNN) predictions.
BENN is capable of constraining not only the predicted values but also their derivatives and variances, ensuring a more robust and reliable model output.
Results highlight significant improvements over traditional BNNs and showcase competitive performance relative to contemporary constrained deep learning methods.
arXiv Detail & Related papers (2024-07-01T07:00:44Z)
- Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors.
LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task.
Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- An Adaptive and Stability-Promoting Layerwise Training Approach for Sparse Deep Neural Network Architecture [0.0]
This work presents a two-stage adaptive framework for developing deep neural network (DNN) architectures that generalize well for a given training data set.
In the first stage, a layerwise training approach is adopted where a new layer is added each time and trained independently by freezing parameters in the previous layers.
We introduce an epsilon-delta stability-promoting concept as a desirable property of a learning algorithm and show that employing manifold regularization yields an epsilon-delta stability-promoting algorithm.
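A skeletal version of the layerwise stage might look like the sketch below, in which only the newly added layer and the output head receive gradients; the helper names, tanh activations, and squared-error objective are illustrative assumptions, and the manifold regularization that yields the stability-promoting property is omitted.

```python
import jax
import jax.numpy as jnp

def features(frozen, new_layer, x):
    """Frozen stack of trained layers followed by the new trainable layer."""
    for w, b in frozen:
        x = jnp.tanh(x @ w + b)
    w, b = new_layer
    return jnp.tanh(x @ w + b)

def train_new_layer(frozen, new_layer, head, x, y, lr=1e-2, steps=500):
    """Fit only (new_layer, head); `frozen` never receives gradients."""
    def loss(trainable):
        layer, (w, b) = trainable
        return jnp.mean((features(frozen, layer, x) @ w + b - y) ** 2)
    trainable = (new_layer, head)
    for _ in range(steps):
        grads = jax.grad(loss)(trainable)
        trainable = jax.tree_util.tree_map(
            lambda p, g: p - lr * g, trainable, grads)
    return trainable

# Outer loop (sketch): append and train one layer at a time, freezing
# each trained layer into `frozen`, until validation loss stops improving.
```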
arXiv Detail & Related papers (2022-11-13T09:51:16Z)
- On the adaptation of recurrent neural networks for system identification [2.5234156040689237]
This paper presents a transfer learning approach which enables fast and efficient adaptation of Recurrent Neural Network (RNN) models of dynamical systems.
The system dynamics are then assumed to change, leading to an unacceptable degradation of the nominal model performance on the perturbed system.
To cope with the mismatch, the model is augmented with an additive correction term trained on fresh data from the new dynamic regime.
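The adaptation pattern described here reduces to a residual model: the nominal network is kept frozen and only an additive correction is fitted to fresh data. The sketch below illustrates that pattern under assumed names (`nominal_fn`, `corr_fn`); it is not the paper's exact parameterization.

```python
import jax
import jax.numpy as jnp

def adapted_output(corr_params, nominal_params, nominal_fn, corr_fn, u):
    """Frozen nominal model plus a trainable additive correction term."""
    return nominal_fn(nominal_params, u) + corr_fn(corr_params, u)

def adaptation_loss(corr_params, nominal_params, nominal_fn, corr_fn, u, y):
    """Squared error on fresh data from the perturbed system; gradients
    are taken w.r.t. corr_params alone, so the nominal weights stay put."""
    y_hat = adapted_output(corr_params, nominal_params, nominal_fn, corr_fn, u)
    return jnp.mean((y_hat - y) ** 2)

# Only the correction term is updated:
# grads = jax.grad(adaptation_loss)(corr_params, nominal_params,
#                                   nominal_fn, corr_fn, u, y)
```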
arXiv Detail & Related papers (2022-01-21T12:04:17Z)
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
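The optimization implied here is ordinary simultaneous gradient descent-ascent on a saddle-point objective. A generic step is sketched below; `game_value` is a placeholder for the paper's adversarial reformulation of the linear operator equation, not its actual form.

```python
import jax

def gda_step(min_params, max_params, game_value, batch, lr=1e-3):
    """One simultaneous step: the min player descends and the max player
    ascends the same objective game_value(min_params, max_params, batch)."""
    g_min = jax.grad(game_value, argnums=0)(min_params, max_params, batch)
    g_max = jax.grad(game_value, argnums=1)(min_params, max_params, batch)
    descend = jax.tree_util.tree_map(lambda p, g: p - lr * g, min_params, g_min)
    ascend = jax.tree_util.tree_map(lambda p, g: p + lr * g, max_params, g_max)
    return descend, ascend
```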
arXiv Detail & Related papers (2020-07-02T17:55:47Z)