Related papers: Uniform-in-time convergence bounds for Persistent Contrastive Divergence Algorithms

URL: http://arxiv.org/abs/2510.01944v1
Date: Thu, 02 Oct 2025 12:12:33 GMT
Title: Uniform-in-time convergence bounds for Persistent Contrastive Divergence Algorithms
Authors: Paul Felix Valsecchi Oliva, O. Deniz Akyildiz, Andrew Duncan
Abstract summary: We propose a continuous-time formulation of persistent contrastive divergence (PCD) for maximum likelihood estimation (MLE) of unnormalised densities. We are able to derive explicit bounds for the error between the PCD and the MLE solution for the model parameter.
Score: 0.29494468099506904
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: We propose a continuous-time formulation of persistent contrastive divergence (PCD) for maximum likelihood estimation (MLE) of unnormalised densities. Our approach expresses PCD as a coupled, multiscale system of stochastic differential equations (SDEs), which perform optimisation of the parameter and sampling of the associated parametrised density simultaneously. From this novel formulation, we are able to derive explicit bounds for the error between the PCD iterates and the MLE solution for the model parameter. This is made possible by deriving uniform-in-time (UiT) bounds for the difference in moments between the multiscale system and the averaged regime. An efficient implementation of the continuous-time scheme is introduced, leveraging a class of explicit, stable integrators, stochastic orthogonal Runge-Kutta Chebyshev (S-ROCK), for which we provide explicit error estimates in the long-time regime. This leads to a novel method for training energy-based models (EBMs) with explicit error guarantees.
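The coupled slow-fast structure described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy energy U(x, theta) = (x - theta)^2 / 2, the scale-separation parameter `eps`, the function name `pcd_slow_fast`, and the use of a plain Euler-Maruyama step in place of the S-ROCK integrators the abstract mentions.

```python
import numpy as np


def pcd_slow_fast(data, eps=0.05, h=0.01, n_steps=2000, n_particles=256, seed=0):
    """Toy slow-fast PCD sketch for the energy U(x, theta) = (x - theta)^2 / 2.

    Hypothetical illustration only: Euler-Maruyama is used here, not S-ROCK.
    The parameter theta evolves on the slow time scale, while the persistent
    particles follow Langevin dynamics sped up by a factor 1 / eps.
    """
    rng = np.random.default_rng(seed)
    theta = 0.0
    x = rng.standard_normal(n_particles)  # persistent particles
    data_mean = float(np.mean(data))
    for _ in range(n_steps):
        # Log-likelihood gradient estimate:
        #   -E_data[dU/dtheta] + E_particles[dU/dtheta], with dU/dtheta = -(x - theta)
        grad = (data_mean - theta) - (float(np.mean(x)) - theta)
        # Fast Langevin step targeting exp(-U(., theta)), using the current theta
        noise = rng.standard_normal(n_particles)
        x = x - (h / eps) * (x - theta) + np.sqrt(2.0 * h / eps) * noise
        # Slow gradient-ascent step on the parameter
        theta = theta + h * grad
    return theta


data = 2.0 + np.random.default_rng(1).standard_normal(500)
theta_hat = pcd_slow_fast(data)
print(theta_hat, data.mean())
```

For this Gaussian toy model the MLE is the sample mean, so `theta_hat` should end up close to `data.mean()`. The step size is chosen so that `h / eps` stays well below 2, which keeps the explicit step on the fast variable stable.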

Related papers

An Elementary Approach to Scheduling in Generative Diffusion Models [55.171367482496755]
An elementary approach to characterizing the impact of noise scheduling and time discretization in generative diffusion models is developed. Experiments across different datasets and pretrained models demonstrate that the time discretization strategy selected by our approach consistently outperforms baseline and search-based strategies.
arXiv Detail & Related papers (2026-01-20T05:06:26Z)
MultiPDENet: PDE-embedded Learning with Multi-time-stepping for Accelerated Flow Simulation [48.41289705783405]
We propose a PDE-embedded network with multiscale time stepping (MultiPDENet). In particular, we design a convolutional filter based on the structure of finite difference with a small number of parameters to optimize. A Physics Block with a 4th-order Runge-Kutta integrator at the fine time scale is established that embeds the structure of PDEs to guide the prediction.
arXiv Detail & Related papers (2025-01-27T12:15:51Z)
On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution. In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z)
Momentum Particle Maximum Likelihood [2.4561590439700076]
We propose an analogous dynamical-systems-inspired approach to minimizing the free energy functional. By discretizing the system, we obtain a practical algorithm for Maximum likelihood estimation in latent variable models. The algorithm outperforms existing particle methods in numerical experiments and compares favourably with other MLE algorithms.
arXiv Detail & Related papers (2023-12-12T14:53:18Z)
Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability [17.771354881467435]
We show that a simple algorithm with a universal and instance-independent step size is sufficient to obtain near-optimal variance and bias terms. Our proof technique is based on refined error bounds for linear approximation together with the novel stability result for the product of random matrices.
arXiv Detail & Related papers (2023-10-22T12:37:25Z)
Parallel-in-Time Probabilistic Numerical ODE Solvers [35.716255949521305]
Probabilistic numerical solvers for ordinary differential equations (ODEs) treat the numerical simulation of dynamical systems as problems of Bayesian state estimation. We build on the time-parallel formulation of iterated extended Kalman smoothers to formulate a parallel-in-time probabilistic numerical ODE solver.
arXiv Detail & Related papers (2023-10-02T12:32:21Z)
Non-Parametric Learning of Stochastic Differential Equations with Non-asymptotic Fast Rates of Convergence [65.63201894457404]
We propose a novel non-parametric learning paradigm for the identification of drift and diffusion coefficients of non-linear differential equations. The key idea essentially consists of fitting a RKHS-based approximation of the corresponding Fokker-Planck equation to such observations.
arXiv Detail & Related papers (2023-05-24T20:43:47Z)
Self-Consistent Velocity Matching of Probability Flows [22.2542921090435]
We present a discretization-free scalable framework for solving a class of partial differential equations (PDEs). The main observation is that the time-varying velocity field of the PDE solution needs to be self-consistent. We use an iterative formulation with a biased gradient estimator that bypasses significant computational obstacles with strong empirical performance.
arXiv Detail & Related papers (2023-01-31T16:17:18Z)
Probabilistic Numerical Method of Lines for Time-Dependent Partial Differential Equations [20.86460521113266]
Current state-of-the-art PDE solvers treat the space- and time-dimensions separately, serially, and with black-box algorithms. We introduce a probabilistic version of a technique called method of lines to fix this issue. Joint quantification of space- and time-uncertainty becomes possible without losing the performance benefits of well-tuned ODE solvers.
arXiv Detail & Related papers (2021-10-22T15:26:05Z)
The Connection between Discrete- and Continuous-Time Descriptions of Gaussian Continuous Processes [60.35125735474386]
We show that discretizations yielding consistent estimators have the property of invariance under coarse-graining. This result explains why combining differencing schemes for derivatives reconstruction and local-in-time inference approaches does not work for time series analysis of second or higher order differential equations.
arXiv Detail & Related papers (2021-01-16T17:11:02Z)
Stochastic Normalizing Flows [52.92110730286403]
We introduce normalizing flows for maximum likelihood estimation and variational inference (VI) using stochastic differential equations (SDEs). Using the theory of rough paths, the underlying Brownian motion is treated as a latent variable and approximated, enabling efficient training of neural SDEs. These SDEs can be used for constructing efficient chains to sample from the underlying distribution of a given dataset.
arXiv Detail & Related papers (2020-02-21T20:47:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.