Related papers: Nonparametric learning of stochastic differential equations from sparse and noisy data

Nonparametric learning of stochastic differential equations from sparse and noisy data

URL: http://arxiv.org/abs/2508.11597v1
Date: Fri, 15 Aug 2025 17:01:59 GMT
Title: Nonparametric learning of stochastic differential equations from sparse and noisy data
Authors: Arnab Ganguly, Riten Mitra, Jinpu Zhou,
Abstract summary: We learn the entire drift function directly from data without strong structural assumptions.<n>We develop an Expectation-Maximization (EM) algorithm that employs a novel Sequential Monte Carlo (SMC) method.<n>The resulting EM-SMC-RKHS procedure enables accurate estimation of the drift function of dynamical systems in low-data regimes.
Score: 2.389598109913754
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The paper proposes a systematic framework for building data-driven stochastic differential equation (SDE) models from sparse, noisy observations. Unlike traditional parametric approaches, which assume a known functional form for the drift, our goal here is to learn the entire drift function directly from data without strong structural assumptions, making it especially relevant in scientific disciplines where system dynamics are partially understood or highly complex. We cast the estimation problem as minimization of the penalized negative log-likelihood functional over a reproducing kernel Hilbert space (RKHS). In the sparse observation regime, the presence of unobserved trajectory segments makes the SDE likelihood intractable. To address this, we develop an Expectation-Maximization (EM) algorithm that employs a novel Sequential Monte Carlo (SMC) method to approximate the filtering distribution and generate Monte Carlo estimates of the E-step objective. The M-step then reduces to a penalized empirical risk minimization problem in the RKHS, whose minimizer is given by a finite linear combination of kernel functions via a generalized representer theorem. To control model complexity across EM iterations, we also develop a hybrid Bayesian variant of the algorithm that uses shrinkage priors to identify significant coefficients in the kernel expansion. We establish important theoretical convergence results for both the exact and approximate EM sequences. The resulting EM-SMC-RKHS procedure enables accurate estimation of the drift function of stochastic dynamical systems in low-data regimes and is broadly applicable across domains requiring continuous-time modeling under observational constraints. We demonstrate the effectiveness of our method through a series of numerical experiments.

Related papers

A joint optimization approach to identifying sparse dynamics using least squares kernel collocation [70.13783231186183]
We develop an all-at-once modeling framework for learning systems of ordinary differential equations (ODE) from scarce, partial, and noisy observations of the states.<n>The proposed methodology amounts to a combination of sparse recovery strategies for the ODE over a function library combined with techniques from reproducing kernel Hilbert space (RKHS) theory for estimating the state and discretizing the ODE.
arXiv Detail & Related papers (2025-11-23T18:04:15Z)
Self-Supervised Coarsening of Unstructured Grid with Automatic Differentiation [55.88862563823878]
In this work, we present an original algorithm to coarsen an unstructured grid based on the concepts of differentiable physics.<n>We demonstrate performance of the algorithm on two PDEs: a linear equation which governs slightly compressible fluid flow in porous media and the wave equation.<n>Our results show that in the considered scenarios, we reduced the number of grid points up to 10 times while preserving the modeled variable dynamics in the points of interest.
arXiv Detail & Related papers (2025-07-24T11:02:13Z)
A Data-Driven Framework for Discovering Fractional Differential Equations in Complex Systems [8.206685537936078]
This study introduces a stepwise data-driven framework for discovering fractional differential equations (FDEs) directly from data.<n>Our framework applies deep neural networks as surrogate models for denoising and reconstructing sparse and noisy observations.<n>We validate the framework across various datasets, including synthetic anomalous diffusion data and experimental data on the creep behavior of frozen soils.
arXiv Detail & Related papers (2024-12-05T08:38:30Z)
Momentum Particle Maximum Likelihood [2.4561590439700076]
We propose an analogous dynamical-systems-inspired approach to minimizing the free energy functional. By discretizing the system, we obtain a practical algorithm for Maximum likelihood estimation in latent variable models. The algorithm outperforms existing particle methods in numerical experiments and compares favourably with other MLE algorithms.
arXiv Detail & Related papers (2023-12-12T14:53:18Z)
Weighted Riesz Particles [0.0]
We consider the target distribution as a mapping where the infinite-dimensional space of the parameters consists of a number of deterministic submanifolds. We study the properties of the point, called Riesz, and embed it into sequential MCMC. We find that there will be higher acceptance rates with fewer evaluations.
arXiv Detail & Related papers (2023-12-01T14:36:46Z)
Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood Estimation [2.365116842280503]
We develop a class of interacting particle systems for implementing a maximum marginal likelihood estimation procedure.<n>In particular, we prove that the parameter marginal of the stationary measure of this diffusion has the form of a Gibbs measure.<n>Using a particular rescaling, we then prove geometric ergodicity of this system and bound the discretisation error.<n>in a manner that is uniform in time and does not increase with the number of particles.
arXiv Detail & Related papers (2023-03-23T16:50:08Z)
Score-based Diffusion Models in Function Space [137.70916238028306]
Diffusion models have recently emerged as a powerful framework for generative modeling.<n>This work introduces a mathematically rigorous framework called Denoising Diffusion Operators (DDOs) for training diffusion models in function space.<n>We show that the corresponding discretized algorithm generates accurate samples at a fixed cost independent of the data resolution.
arXiv Detail & Related papers (2023-02-14T23:50:53Z)
Monte Carlo Neural PDE Solver for Learning PDEs via Probabilistic Representation [59.45669299295436]
We propose a Monte Carlo PDE solver for training unsupervised neural solvers.<n>We use the PDEs' probabilistic representation, which regards macroscopic phenomena as ensembles of random particles.<n>Our experiments on convection-diffusion, Allen-Cahn, and Navier-Stokes equations demonstrate significant improvements in accuracy and efficiency.
arXiv Detail & Related papers (2023-02-10T08:05:19Z)
Rigorous dynamical mean field theory for stochastic gradient descent methods [17.90683687731009]
We prove closed-form equations for the exact high-dimensionals of a family of first order gradient-based methods. This includes widely used algorithms such as gradient descent (SGD) or Nesterov acceleration.
arXiv Detail & Related papers (2022-10-12T21:10:55Z)
Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability. We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections. Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem [27.09339991866556]
We show that ODE searches for optimal control for an unknown computation system by directly searching over the corresponding space of controllers. We take a step towards demystifying the performance and efficiency of such methods by focusing on the gradient-flow dynamics set of stabilizing feedback gains and a similar result holds for the forward disctization of the ODE.
arXiv Detail & Related papers (2019-12-26T16:56:59Z)
A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs) We derive a second-order Wasserstein gradient flow of the global relative entropy from Fokker-Planck equation. Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.