Neural variational Data Assimilation with Uncertainty Quantification using SPDE priors
- URL: http://arxiv.org/abs/2402.01855v3
- Date: Tue, 28 Jan 2025 20:47:23 GMT
- Title: Neural variational Data Assimilation with Uncertainty Quantification using SPDE priors
- Authors: Maxime Beauchamp, Ronan Fablet, Simon Benaichouche, Pierre Tandeo, Nicolas Desassis, Bertrand Chapron
- Abstract summary: Recent advances in the deep learning community enable addressing the interpolation problem through a neural architecture embedding a variational data assimilation framework. In this work, we use the theory of Stochastic Partial Differential Equations (SPDE) and Gaussian Processes (GP) to estimate both space- and time-varying covariance of the state.
- Score: 28.804041716140194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The spatio-temporal interpolation of large geophysical datasets has historically been addressed by Optimal Interpolation (OI) and more sophisticated equation-based or data-driven Data Assimilation (DA) techniques. Recent advances in the deep learning community enable addressing the interpolation problem through a neural architecture incorporating a variational data assimilation framework. The reconstruction task is cast as a joint learning problem over the prior involved in the variational inner cost, viewed as a projection operator of the state, and the gradient-based minimization of that cost. Both prior models and solvers are stated as neural networks with automatic differentiation and can be trained by minimizing a loss function, typically the mean squared error between some ground truth and the reconstruction. Such a strategy turns out to be very efficient at improving the mean state estimation, but still needs complementary developments to quantify its related uncertainty. In this work, we use the theory of Stochastic Partial Differential Equations (SPDE) and Gaussian Processes (GP) to estimate both space- and time-varying covariance of the state. Our neural variational scheme is modified to embed an augmented state formulation, with both the state and the SPDE parametrization to be estimated. We demonstrate the potential of the proposed framework on a spatio-temporal GP driven by diffusion-based anisotropies and on realistic Sea Surface Height (SSH) datasets. We show how our solution matches the OI baseline in the Gaussian case. For nonlinear dynamics, as is almost always the case in DA, our solution outperforms OI while allowing for fast and interpretable online parameter estimation.
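The inner step of this scheme lends itself to a compact illustration. Below is a minimal PyTorch sketch of a gradient-based minimization of a variational cost over an augmented state (state plus SPDE parameters); `PriorNet`, `variational_cost`, and the plain Adam inner loop are hypothetical stand-ins for exposition, not the authors' 4DVarNet-based implementation.

```python
import torch

# Hypothetical stand-in for the learned prior (projection) operator Phi;
# NOT the authors' actual architecture.
class PriorNet(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, 64), torch.nn.Tanh(), torch.nn.Linear(64, dim))

    def forward(self, z):
        return self.net(z)

def variational_cost(state, spde_params, obs, mask, prior, lam=1.0):
    # Augmented state = (physical state, SPDE parameters estimated online).
    z = torch.cat([state, spde_params], dim=-1)
    data_term = ((mask * (state - obs)) ** 2).sum()   # ||H x - y||^2
    prior_term = ((z - prior(z)) ** 2).sum()          # ||z - Phi(z)||^2
    return data_term + lam * prior_term

# Inner loop: gradient-based minimization of the variational cost w.r.t. the
# augmented state; plain Adam replaces the paper's learned iterative solver.
def solve_inner(obs, mask, prior, dim_theta, n_iter=200, lr=0.05):
    state = obs.clone().detach().requires_grad_(True)
    theta = torch.zeros(dim_theta, requires_grad=True)  # e.g. diffusion params
    opt = torch.optim.Adam([state, theta], lr=lr)
    for _ in range(n_iter):
        opt.zero_grad()
        variational_cost(state, theta, obs, mask, prior).backward()
        opt.step()
    return state.detach(), theta.detach()

# Usage: prior = PriorNet(obs.numel() + dim_theta) for flat 1-D states.
```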
Related papers
- Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized federated learning (DFL) eliminates reliance on the client-server architecture.
Non-smooth regularization is often incorporated into machine learning tasks.
We propose a novel DNCFL algorithm to solve these problems.
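For context, the gradient-tracking mechanism the title refers to can be sketched on a toy decentralized quadratic problem; the mixing matrix and update below follow the classic gradient-tracking scheme, not the proposed DNCFL algorithm (which additionally handles nonconvex composite objectives and momentum).

```python
import numpy as np

# Toy decentralized gradient tracking on quadratics f_i(x) = 0.5||A_i x - b_i||^2.
rng = np.random.default_rng(0)
n_nodes, dim = 4, 3
A = rng.normal(size=(n_nodes, 5, dim))
b = rng.normal(size=(n_nodes, 5))
grad = lambda i, x: A[i].T @ (A[i] @ x - b[i])

# Doubly stochastic mixing matrix for a ring of 4 nodes (illustrative choice).
W = np.array([[.5, .25, 0, .25], [.25, .5, .25, 0],
              [0, .25, .5, .25], [.25, 0, .25, .5]])

x = np.zeros((n_nodes, dim))
g = np.stack([grad(i, x[i]) for i in range(n_nodes)])
y = g.copy()                                 # gradient trackers y_i
eta = 0.02
for _ in range(500):
    x_new = W @ x - eta * y                  # consensus step + tracked descent
    g_new = np.stack([grad(i, x_new[i]) for i in range(n_nodes)])
    y = W @ y + g_new - g                    # track the average gradient
    x, g = x_new, g_new
print(np.linalg.norm(y.mean(axis=0)))        # small near a stationary point
```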
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - A Simultaneous Approach for Training Neural Differential-Algebraic Systems of Equations [0.4935512063616847]
We study neural differential-algebraic systems of equations (DAEs), where some unknown relationships are learned from data.
We apply the simultaneous approach to neural DAE problems, resulting in a fully discretized nonlinear optimization problem.
We achieve promising results in terms of accuracy, model generalizability and computational cost, across different problem settings.
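A hedged sketch of the simultaneous (all-at-once) idea: the discretized trajectory and the network weights are optimized jointly against DAE residuals and data misfit. The toy DAE, penalty weighting, and Adam optimizer below are illustrative simplifications; the paper solves a fully discretized nonlinear program.

```python
import torch

# Toy index-1 DAE:  x'(t) = f_theta(x, z),  0 = g(x, z) = z - x^2.
# Decision variables: the discretized trajectories (x, z) AND the net weights.
torch.manual_seed(0)
T, dt = 50, 0.02
f_theta = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.Tanh(),
                              torch.nn.Linear(16, 1))
g = lambda x, z: z - x**2                      # known algebraic constraint

x = torch.zeros(T, 1, requires_grad=True)      # differential state trajectory
z = torch.zeros(T, 1, requires_grad=True)      # algebraic state trajectory
data_t = torch.arange(0, T, 5)
data_x = torch.sin(0.1 * data_t.float()).unsqueeze(-1)  # synthetic observations

opt = torch.optim.Adam([x, z, *f_theta.parameters()], lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    xz = torch.cat([x, z], dim=-1)
    ode_res = x[1:] - x[:-1] - dt * f_theta(xz[1:])  # implicit-Euler defect
    alg_res = g(x, z)                                # algebraic residual
    loss = (ode_res**2).mean() + (alg_res**2).mean() \
         + ((x[data_t] - data_x)**2).mean()          # data misfit
    loss.backward()
    opt.step()
```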
arXiv Detail & Related papers (2025-04-07T01:26:55Z) - Interpretable Deep Regression Models with Interval-Censored Failure Time Data [1.2993568435938014]
Deep learning methods for interval-censored data remain underexplored and limited to specific data types or models.
This work proposes a general regression framework for interval-censored data with a broad class of partially linear transformation models.
Applying our method to the Alzheimer's Disease Neuroimaging Initiative dataset yields novel insights and improved predictive performance compared to traditional approaches.
arXiv Detail & Related papers (2025-03-25T15:27:32Z) - Efficient Transformed Gaussian Process State-Space Models for Non-Stationary High-Dimensional Dynamical Systems [49.819436680336786]
We propose an efficient transformed Gaussian process state-space model (ETGPSSM) for scalable and flexible modeling of high-dimensional, non-stationary dynamical systems.
Specifically, our ETGPSSM integrates a single shared GP with input-dependent normalizing flows, yielding an expressive implicit process prior that captures complex, non-stationary transition dynamics.
Our ETGPSSM outperforms existing GPSSMs and neural network-based SSMs in terms of computational efficiency and accuracy.
arXiv Detail & Related papers (2025-03-24T03:19:45Z) - Optimal Transport-Based Displacement Interpolation with Data Augmentation for Reduced Order Modeling of Nonlinear Dynamical Systems [0.0]
We present a novel reduced-order model (ROM) that exploits optimal transport theory and displacement interpolation to enhance the representation of nonlinear dynamics in complex systems.
We show improved accuracy and efficiency in predicting complex system behaviors, indicating the potential of this approach for a wide range of applications in computational physics and engineering.
arXiv Detail & Related papers (2024-11-13T16:29:33Z) - Variational Neural Stochastic Differential Equations with Change Points [4.692174333076032]
We explore modeling change points in time-series data using neural stochastic differential equations (neural SDEs).
We propose a novel model formulation and training procedure based on the variational autoencoder (VAE) framework for modeling time-series as a neural SDE.
We present an empirical evaluation that demonstrates the expressive power of our proposed model, showing that it can effectively model both classical parametric SDEs and some real datasets with distribution shifts.
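As background, a neural SDE can be simulated with Euler-Maruyama as below; the drift/diffusion networks are illustrative, and the paper's VAE wrapper and change-point mechanism are not reproduced here.

```python
import torch

# Euler-Maruyama simulation of dX_t = f_theta(X_t, t) dt + g_phi(X_t, t) dW_t.
drift = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                            torch.nn.Linear(32, 1))
diffusion = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                                torch.nn.Linear(32, 1), torch.nn.Softplus())

def simulate(x0, n_steps=100, dt=0.01):
    x, path = x0, [x0]
    for k in range(n_steps):
        t = torch.full_like(x, k * dt)
        inp = torch.cat([x, t], dim=-1)
        dw = torch.randn_like(x) * dt**0.5      # Brownian increment
        x = x + drift(inp) * dt + diffusion(inp) * dw
        path.append(x)
    return torch.stack(path)

paths = simulate(torch.zeros(64, 1))            # 64 sample paths
```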
arXiv Detail & Related papers (2024-11-01T14:46:17Z) - Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift [12.770658031721435]
We propose a method for adapting the weights of the last layer of a pre-trained neural regression model to perform better on input data originating from a different distribution.
We demonstrate how this lightweight spectral adaptation procedure can improve out-of-distribution performance for synthetic and real-world datasets.
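The exact adaptation procedure is specific to the paper, but a generic last-layer refit in a truncated spectral basis conveys the flavor; `adapt_last_layer` below is a plausible stand-in under that assumption, not the paper's method.

```python
import numpy as np

# Refit the final linear layer on a few labeled target-domain samples,
# regularized by truncating small singular values of the feature matrix.
def adapt_last_layer(feats, targets, rank):
    """feats: (n, d) penultimate activations on target data; targets: (n,)."""
    U, s, Vt = np.linalg.svd(feats, full_matrices=False)
    U, s, Vt = U[:, :rank], s[:rank], Vt[:rank]   # keep dominant spectrum
    return Vt.T @ ((U.T @ targets) / s)            # truncated pseudo-inverse

rng = np.random.default_rng(1)
feats = rng.normal(size=(20, 8))                   # hypothetical features
w_true = rng.normal(size=8)
targets = feats @ w_true + 0.01 * rng.normal(size=20)
w_adapted = adapt_last_layer(feats, targets, rank=6)
```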
arXiv Detail & Related papers (2023-12-29T04:15:58Z) - Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the resulting tessellation and approximate the multiple-hypothesis target distribution.
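A rough sketch of the multi-hypothesis idea: an RBF feature layer feeding several output heads, trained with a winner-takes-all loss so heads specialize on different modes. The architecture details are illustrative, not the paper's exact structured model.

```python
import torch

class MultiHypothesisRBF(torch.nn.Module):
    def __init__(self, n_centers=20, n_hyp=4):
        super().__init__()
        self.centers = torch.nn.Parameter(torch.randn(n_centers, 1))
        self.log_gamma = torch.nn.Parameter(torch.zeros(n_centers))
        self.heads = torch.nn.Linear(n_centers, n_hyp)  # one output per hypothesis

    def forward(self, x):                                # x: (B, 1)
        d2 = (x.unsqueeze(1) - self.centers).pow(2).sum(-1)  # (B, n_centers)
        phi = torch.exp(-self.log_gamma.exp() * d2)          # RBF features
        return self.heads(phi)                               # (B, n_hyp)

model = MultiHypothesisRBF()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x = torch.rand(256, 1) * 6 - 3
y = torch.sin(x) * torch.sign(torch.randn_like(x))           # bimodal targets
for _ in range(500):
    opt.zero_grad()
    err = (model(x) - y).pow(2)                              # (B, n_hyp)
    loss = err.min(dim=1).values.mean()                      # winner takes all
    loss.backward()
    opt.step()
```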
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion [66.95761172711073]
The generalization of neural networks is a central challenge in machine learning.
We propose to enhance it directly through the underlying function of neural networks, rather than focusing on adjusting input data.
We put this theoretical framework into practice as $\textbf{PDE}+$ ($\textbf{PDE}$ with $\textbf{A}$daptive $\textbf{D}$istributional $\textbf{D}$iffusion).
arXiv Detail & Related papers (2023-05-25T08:23:26Z) - Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been effectively demonstrated in solving forward and inverse differential equation problems.
However, PINNs can be trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
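Implicit SGD coincides with a proximal-point update, theta_{k+1} = argmin_theta L(theta) + ||theta - theta_k||^2 / (2*eta); a minimal sketch follows, with the inner proximal problem approximated by a few gradient steps and a toy residual standing in for a PINN loss.

```python
import torch

# Toy physics-informed residual: fit u' = cos(8x) with u = a*x + b*x^2.
def loss_fn(theta, x):
    u = theta[0] * x + theta[1] * x**2
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    return ((du - torch.cos(8 * x))**2).mean()

x = torch.linspace(0, 1, 64, requires_grad=True)
theta = torch.zeros(2, requires_grad=True)
eta = 0.1
for _ in range(100):                            # outer ISGD iterations
    theta_k = theta.detach().clone()
    inner_opt = torch.optim.SGD([theta], lr=0.05)
    for _ in range(10):                         # approximate inner proximal solve
        inner_opt.zero_grad()
        obj = loss_fn(theta, x) + ((theta - theta_k)**2).sum() / (2 * eta)
        obj.backward()
        inner_opt.step()
```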
arXiv Detail & Related papers (2023-03-03T08:17:47Z) - Variational EP with Probabilistic Backpropagation for Bayesian Neural Networks [0.0]
I propose a novel approach for nonlinear logistic regression using a two-layer neural network (NN) model structure with hierarchical priors on the network weights.
I derive a computationally efficient algorithm, whose complexity scales similarly to an ensemble of independent sparse logistic models.
arXiv Detail & Related papers (2023-03-02T19:09:47Z) - Deep Learning Aided Laplace Based Bayesian Inference for Epidemiological Systems [2.596903831934905]
We propose a hybrid approach where Laplace-based Bayesian inference is combined with an ANN architecture for obtaining approximations to the ODE trajectories.
The effectiveness of our proposed methods is demonstrated using an epidemiological system with non-analytical solutions, the Susceptible-Infectious-Removed (SIR) model for infectious diseases.
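For reference, the SIR dynamics dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I can be integrated directly; the parameter values below are illustrative, and the paper's ANN surrogate and Laplace-based inference operate on top of such trajectories.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir(t, y, beta, gamma, N):
    S, I, R = y
    return [-beta * S * I / N, beta * S * I / N - gamma * I, gamma * I]

N = 1_000.0
sol = solve_ivp(sir, (0, 160), [N - 1, 1, 0], args=(0.3, 0.1, N),
                t_eval=np.linspace(0, 160, 161))
S, I, R = sol.y   # trajectories at daily resolution
```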
arXiv Detail & Related papers (2022-10-17T09:02:41Z) - Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Lower Bounds for ME-NODE and develop efficient training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Probabilistic partition of unity networks: clustering based deep approximation [0.0]
Partition of unity networks (POU-Nets) have been shown capable of realizing algebraic convergence rates for regression and solution of PDEs.
We enrich POU-Nets with a Gaussian noise model to obtain a probabilistic generalization amenable to gradient-based minimization of a maximum likelihood loss.
We provide benchmarks quantifying performance in high/low-dimensions, demonstrating that convergence rates depend only on the latent dimension of data within high-dimensional space.
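A minimal sketch of the partition-of-unity construction with a Gaussian noise model: a softmax gate defines the partition, each part carries a local polynomial, and the fit is a Gaussian negative log-likelihood. Sizes and the degree-1 local models are illustrative choices.

```python
import torch

class POUNet(torch.nn.Module):
    def __init__(self, n_parts=8):
        super().__init__()
        self.gate = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                                        torch.nn.Linear(32, n_parts))
        self.coef = torch.nn.Parameter(torch.zeros(n_parts, 2))  # a + b*x per part
        self.log_sigma = torch.nn.Parameter(torch.zeros(()))     # noise scale

    def forward(self, x):                                        # x: (B, 1)
        w = torch.softmax(self.gate(x), dim=-1)                  # partition weights
        local = self.coef[:, 0] + self.coef[:, 1] * x            # (B, n_parts)
        return (w * local).sum(-1, keepdim=True)

    def nll(self, x, y):                                         # Gaussian NLL
        var = (2 * self.log_sigma).exp()
        return (((self(x) - y)**2) / (2 * var) + self.log_sigma).mean()
```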
arXiv Detail & Related papers (2021-07-07T08:02:00Z) - Influence Estimation and Maximization via Neural Mean-Field Dynamics [60.91291234832546]
We propose a novel learning framework using neural mean-field (NMF) dynamics for inference and estimation problems.
Our framework can simultaneously learn the structure of the diffusion network and the evolution of node infection probabilities.
arXiv Detail & Related papers (2021-06-03T00:02:05Z) - Score-Based Generative Modeling through Stochastic Differential Equations [114.39209003111723]
We present a stochastic differential equation (SDE) that transforms a complex data distribution into a known prior distribution by slowly injecting noise.
A corresponding reverse-time SDE transforms the prior distribution back into the data distribution by slowly removing the noise.
By leveraging advances in score-based generative modeling, we can accurately estimate these scores with neural networks.
We demonstrate high fidelity generation of 1024 x 1024 images for the first time from a score-based generative model.
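The reverse-time SDE, dx = [f(x,t) - g(t)^2 grad_x log p_t(x)] dt + g(t) dw̄, can be demonstrated end-to-end in a case where the score is analytic: Gaussian data under a variance-exploding forward SDE. In practice, the `score` function below is replaced by a learned network s_theta(x, t).

```python
import numpy as np

# Forward SDE dx = g dW with constant g; for data ~ N(0, 1) the marginal at
# time t is N(0, 1 + g^2 t), so the score is known in closed form.
g, T, n_steps = 1.0, 1.0, 1000
dt = T / n_steps

def score(x, t):
    return -x / (1.0 + g**2 * t)        # grad_x log N(x; 0, 1 + g^2 t)

rng = np.random.default_rng(0)
x = rng.normal(scale=np.sqrt(1.0 + g**2 * T), size=10_000)  # sample the prior
for k in range(n_steps):                # Euler-Maruyama, integrating t = T -> 0
    t = T - k * dt
    x = x + g**2 * score(x, t) * dt + g * np.sqrt(dt) * rng.normal(size=x.size)
print(x.std())                          # should approach 1.0 (the data std)
```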
arXiv Detail & Related papers (2020-11-26T19:39:10Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, the divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
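In the Gaussian special case, posterior fusion has a closed form: the product of Gaussians has summed precisions and a precision-weighted mean. The paper's mean-field KL machinery generalizes well beyond this, but the special case makes the idea concrete.

```python
import numpy as np

# Fuse scalar Gaussian posteriors: precisions add, means are precision-weighted.
def fuse_gaussians(means, variances):
    prec = 1.0 / np.asarray(variances)
    fused_var = 1.0 / prec.sum()
    fused_mean = fused_var * (prec * np.asarray(means)).sum()
    return fused_mean, fused_var

# Two "dataset posteriors" over the same scalar parameter:
mu, var = fuse_gaussians([1.0, 2.0], [0.5, 1.0])
print(mu, var)   # 1.333..., 0.333...
```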
arXiv Detail & Related papers (2020-07-13T03:27:45Z) - Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks.
Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities.
Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.