Related papers: DAISI: Data Assimilation with Inverse Sampling using Stochastic Interpolants


URL: http://arxiv.org/abs/2512.00252v1
Date: Sat, 29 Nov 2025 00:02:45 GMT
Title: DAISI: Data Assimilation with Inverse Sampling using Stochastic Interpolants
Authors: Martin Andrae, Erik Larsson, So Takao, Tomas Landelius, Fredrik Lindsten
Abstract summary: We introduce DAISI, a scalable filtering algorithm built on flow-based generative models. We show that DAISI achieves accurate filtering results in regimes with sparse, noisy, and nonlinear observations.
Score: 12.587156528707796
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Data assimilation (DA) is a cornerstone of scientific and engineering applications, combining model forecasts with sparse and noisy observations to estimate latent system states. Classical DA methods, such as the ensemble Kalman filter, rely on Gaussian approximations and heuristic tuning (e.g., inflation and localization) to scale to high dimensions. While often successful, these approximations can make the methods unstable or inaccurate when the underlying distributions of states and observations depart significantly from Gaussianity. To address this limitation, we introduce DAISI, a scalable filtering algorithm built on flow-based generative models that enables flexible probabilistic inference using data-driven priors. The core idea is to use a stationary, pre-trained generative prior to assimilate observations via guidance-based conditional sampling while incorporating forecast information through a novel inverse-sampling step. This step maps the forecast ensemble into a latent space to provide initial conditions for the conditional sampling, allowing us to encode model dynamics into the DA pipeline without having to retrain or fine-tune the generative prior at each assimilation step. Experiments on challenging nonlinear systems show that DAISI achieves accurate filtering results in regimes with sparse, noisy, and nonlinear observations where traditional methods struggle.
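
For reference, the classical baseline that the abstract contrasts DAISI with can be sketched as a single stochastic (perturbed-observation) ensemble Kalman filter analysis step with multiplicative inflation. This is not code from the paper and does not implement DAISI. The function name `enkf_analysis`, its parameters, and the toy linear-Gaussian setup are assumptions made for illustration only.

```python
import numpy as np


def enkf_analysis(ensemble, y, H, R, inflation=1.0, rng=None):
    """One stochastic EnKF analysis step (illustrative sketch, hypothetical API).

    ensemble : (n_members, n_state) forecast ensemble
    y        : (n_obs,) observation vector
    H        : (n_obs, n_state) linear observation operator
    R        : (n_obs, n_obs) observation-noise covariance
    inflation: multiplicative factor applied to the forecast anomalies
    """
    rng = np.random.default_rng() if rng is None else rng
    n_members = ensemble.shape[0]

    # Multiplicative inflation: scale the spread around the ensemble mean.
    mean = ensemble.mean(axis=0)
    anomalies = inflation * (ensemble - mean)
    inflated = mean + anomalies

    # Sample covariances in state and observation space.
    HX = inflated @ H.T                      # (n_members, n_obs)
    HA = anomalies @ H.T                     # observation-space anomalies
    P_xy = anomalies.T @ HA / (n_members - 1)
    P_yy = HA.T @ HA / (n_members - 1) + R

    # Kalman gain K = P_xy P_yy^{-1}; P_yy is symmetric, so solve for K^T.
    K_T = np.linalg.solve(P_yy, P_xy.T)      # (n_obs, n_state)

    # Perturbed observations: one noisy copy of y per ensemble member.
    noise = rng.multivariate_normal(np.zeros(len(y)), R, size=n_members)
    innovations = (y + noise) - HX

    return inflated + innovations @ K_T


# Toy linear-Gaussian check: 2D state, only the first component is observed.
rng = np.random.default_rng(0)
prior_mean = np.array([1.0, -1.0])
forecast = prior_mean + 2.0 * rng.standard_normal((20000, 2))  # prior variance 4
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])
y = np.array([3.0])

analysis = enkf_analysis(forecast, y, H, R, rng=rng)

# Exact Kalman posterior mean of the observed component:
# 1 + 4 / (4 + 1) * (3 - 1) = 2.6
print(analysis.mean(axis=0))
```

In this toy case the prior is exactly Gaussian, so the ensemble update is consistent with the exact Kalman posterior up to sampling error. The abstract's point is that this Gaussian assumption is what can break down when the distributions of states and observations depart significantly from Gaussianity.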

Related papers

Prequential posteriors [2.831395148295604]
We introduce prequential posteriors, based upon a predictive-sequential (prequential) loss function. We prove that, under mild conditions, both the prequential loss minimizer and the prequential posterior concentrate around parameters with optimal predictive performance. We validate our method on both a synthetic multi-dimensional time series and a real-world meteorological dataset.
arXiv Detail & Related papers (2025-11-21T19:18:19Z)
PnP-DA: Towards Principled Plug-and-Play Integration of Variational Data Assimilation and Generative Models [0.1052166918701117]
Earth system modeling presents a fundamental challenge in scientific computing. Even the most powerful AI- or physics-based forecast systems suffer from gradual error accumulation. We propose a Plug-and-Play algorithm that alternates a lightweight, gradient-based analysis update with a single forward pass through a pretrained prior conditioned on the background forecast.
arXiv Detail & Related papers (2025-08-01T05:19:19Z)
Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling [70.8832906871441]
We study how to steer generation toward desired rewards without retraining the models. Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement. We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity.
arXiv Detail & Related papers (2025-07-11T08:00:47Z)
Bridging the Last Mile of Prediction: Enhancing Time Series Forecasting with Conditional Guided Flow Matching [9.465542901469815]
Conditional Guided Flow Matching (CGFM) is a model-agnostic framework that extends flow matching by integrating outputs from an auxiliary predictive model. CGFM incorporates historical data as both conditions and guidance, uses two-sided conditional paths, and employs affine paths to expand the path space. Experiments across datasets and baselines show CGFM consistently outperforms state-of-the-art models, advancing forecasting.
arXiv Detail & Related papers (2025-07-09T18:03:31Z)
Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts [64.34482582690927]
We provide an efficient and principled method for sampling from a sequence of annealed, geometric-averaged, or product distributions derived from pretrained score-based models. We propose Sequential Monte Carlo (SMC) resampling algorithms that leverage inference-time scaling to improve sampling quality.
arXiv Detail & Related papers (2025-03-04T17:46:51Z)
Closed-form Filtering for Non-linear Systems [83.91296397912218]
We propose a new class of filters based on Gaussian PSD Models, which offer several advantages in terms of density approximation and computational efficiency. We show that filtering can be efficiently performed in closed form when transitions and observations are Gaussian PSD Models. Our proposed estimator enjoys strong theoretical guarantees, with estimation error that depends on the quality of the approximation and is adaptive to the regularity of the transition probabilities.
arXiv Detail & Related papers (2024-02-15T08:51:49Z)
Exact nonlinear state estimation [0.0]
The majority of data assimilation methods in the geosciences are based on Gaussian assumptions. Non-parametric, particle-based DA algorithms have superior accuracy, but their application to high-dimensional models still poses operational challenges. This article introduces a new nonlinear estimation theory which attempts to bridge the existing gap in DA methodology.
arXiv Detail & Related papers (2023-10-17T03:44:29Z)
Bayesian Gaussian Process ODEs via Double Normalizing Flows [27.015257976208737]
We introduce normalizing flows to reparameterize the ODE vector field, resulting in a data-driven prior distribution. We also apply normalizing flows to the posterior inference of GP ODEs to resolve the issue of strong mean-field assumptions. We validate the effectiveness of our approach on simulated dynamical systems and real-world human motion data.
arXiv Detail & Related papers (2023-09-17T09:28:47Z)
Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent [43.097493761380186]
Gradient algorithms are an efficient method of approximately solving linear systems. We show that gradient descent produces accurate predictions, even in cases where it does not converge quickly to the optimum. Experimentally, gradient descent achieves state-of-the-art performance on sufficiently large-scale or ill-conditioned regression tasks.
arXiv Detail & Related papers (2023-06-20T15:07:37Z)
A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE. We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z)
Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-Type Samplers [90.45898746733397]
We develop a framework for non-asymptotic analysis of deterministic samplers used for diffusion generative modeling. We show that one step along the probability flow ODE can be expressed as two steps: 1) a restoration step that runs ascent on the conditional log-likelihood at some infinitesimally previous time, and 2) a degradation step that runs the forward process using noise pointing back towards the current gradient.
arXiv Detail & Related papers (2023-03-06T18:59:19Z)
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Driven Adaptive Prior [103.00403682863427]
We propose PriorGrad to improve the efficiency of the conditional diffusion model. We show that PriorGrad achieves a faster convergence leading to data and parameter efficiency and improved quality.
arXiv Detail & Related papers (2021-06-11T14:04:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.