Contraction and entropy production in continuous-time Sinkhorn dynamics
- URL: http://arxiv.org/abs/2510.12639v1
- Date: Tue, 14 Oct 2025 15:32:15 GMT
- Title: Contraction and entropy production in continuous-time Sinkhorn dynamics
- Authors: Anand Srinivasan, Jean-Jacques Slotine
- Abstract summary: We give an exact identity for the entropy production rate of the Sinkhorn flow, which was previously known only to be nonpositive. We show that the flow induces a reversible Markov dynamics on the target marginal as an Onsager gradient flow. For illustration, we give two immediate practical use-cases for the Sinkhorn LSI: as a design principle for the latent space in which generative models are trained, and as a stopping heuristic for discrete-time algorithms.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, the vanishing-step-size limit of the Sinkhorn algorithm at finite regularization parameter $\varepsilon$ was shown to be a mirror descent in the space of probability measures. We give $L^2$ contraction criteria in two time-dependent metrics induced by the mirror Hessian, which reduce to the coercivity of certain conditional expectation operators. We then give an exact identity for the entropy production rate of the Sinkhorn flow, which was previously known only to be nonpositive. Examining this rate shows that the standard semigroup analysis of diffusion processes extends systematically to the Sinkhorn flow. We show that the flow induces a reversible Markov dynamics on the target marginal as an Onsager gradient flow. We define the Dirichlet form associated to its (nonlocal) infinitesimal generator, prove a Poincaré inequality for it, and show that the spectral gap is strictly positive along the Sinkhorn flow whenever $\varepsilon > 0$. Lastly, we show that the entropy decay is exponential if and only if a logarithmic Sobolev inequality (LSI) holds. For illustration, we give two immediate practical use-cases for the Sinkhorn LSI: as a design principle for the latent space in which generative models are trained, and as a stopping heuristic for discrete-time algorithms.
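The "if and only if" between an LSI and exponential entropy decay follows the standard semigroup argument; the sketch below uses generic notation ($H$ for the relative entropy to the target, $I$ for the Fisher-type information of the associated Dirichlet form), not the paper's exact identity.

```latex
% Standard Gronwall argument (our sketch in generic notation, not the paper's proof).
% Entropy production along the flow:   d/dt H(rho_t) = -I(rho_t).
% LSI with constant alpha > 0:         H(rho) <= I(rho) / (2 alpha).
\[
  \frac{\mathrm{d}}{\mathrm{d}t} H(\rho_t) = -I(\rho_t) \le -2\alpha\, H(\rho_t)
  \quad\Longrightarrow\quad
  H(\rho_t) \le e^{-2\alpha t}\, H(\rho_0)
\]
% by Gronwall's lemma. Conversely, differentiating the exponential bound at
% t = 0 recovers H(rho_0) <= I(rho_0) / (2 alpha), i.e. the LSI.
```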
Related papers
- Deep Sequence Modeling with Quantum Dynamics: Language as a Wave Function [0.0]
We introduce a sequence modeling framework in which the latent state is a complex-valued wave function evolving on a finite-dimensional Hilbert space under a learned, time-dependent Hamiltonian. Token probabilities are extracted using the Born rule, a quadratic measurement operator that couples magnitudes and relative phases. We derive a continuity equation for the latent probability mass, yielding conserved pairwise currents that serve as a built-in diagnostic.
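As a toy illustration of the Born-rule readout described above (our own sketch, not the paper's model or API), the snippet below extracts probabilities as squared magnitudes; measuring in a rotated basis shows the dependence on relative phases.

```python
import numpy as np

# A normalized complex state psi on a d-dimensional Hilbert space yields
# computational-basis probabilities p_i = |<i|psi>|^2 (the Born rule).
d = 4
psi = np.array([1.0, 1.0j, -0.5, 0.5 + 0.5j])
psi = psi / np.linalg.norm(psi)        # normalize so probabilities sum to 1

p_comp = np.abs(psi) ** 2              # quadratic in the amplitudes
# In a rotated (Fourier) basis the quadratic readout couples magnitudes
# and relative phases, as the abstract describes:
p_fourier = np.abs(np.fft.fft(psi) / np.sqrt(d)) ** 2
assert np.isclose(p_comp.sum(), 1.0) and np.isclose(p_fourier.sum(), 1.0)
```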
arXiv Detail & Related papers (2026-02-24T23:42:18Z) - Stabilizing Fixed-Point Iteration for Markov Chain Poisson Equations [49.702772230127465]
We study finite-state Markov chains with $n$ states and transition matrix $P$. We show that all non-decaying modes are captured by a real peripheral invariant subspace $\mathcal{K}(P)$, and that the induced operator on the quotient space $\mathbb{R}^n/\mathcal{K}(P)$ is strictly contractive, yielding a unique quotient solution.
arXiv Detail & Related papers (2026-01-31T02:57:01Z) - A semiconcavity approach to stability of entropic plans and exponential convergence of Sinkhorn's algorithm [3.686530147760242]
We study stability bounds and convergence of Sinkhorn's algorithm for the entropic optimal transport problem. New applications include subspace elastic costs, weakly log-concave marginals, and marginals with light tails.
arXiv Detail & Related papers (2024-12-12T12:45:31Z) - Fast Convergence of $\Phi$-Divergence Along the Unadjusted Langevin Algorithm and Proximal Sampler [14.34147140416535]
We study the mixing time of two popular discrete-time Markov chains in continuous space. We show that any $\Phi$-divergence arising from a twice-differentiable strictly convex function $\Phi$ converges to $0$ exponentially fast along these Markov chains.
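For reference, the divergence class in question is the standard one; our notation, with $\pi$ the target:

```latex
% Standard Phi-divergence (our notation): for twice-differentiable,
% strictly convex \Phi with \Phi(1) = 0,
\[
  D_\Phi(\mu \,\|\, \pi) = \int \Phi\!\left(\frac{\mathrm{d}\mu}{\mathrm{d}\pi}\right) \mathrm{d}\pi,
\]
% recovering KL for \Phi(x) = x log x and chi-squared for \Phi(x) = (x - 1)^2.
```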
arXiv Detail & Related papers (2024-10-14T16:41:45Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Sampling and estimation on manifolds using the Langevin diffusion [45.57801520690309]
Two estimators of linear functionals of $\mu_\phi$ based on the discretized Markov process are considered. Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion.
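For orientation, the Euclidean special case of such a discretization is the Euler-Maruyama scheme below; a minimal sketch with our own naming, omitting the geometric (manifold) corrections the paper actually treats.

```python
import numpy as np

def ula_estimate(grad_phi, f, x0, step=1e-2, n_steps=50_000, seed=0):
    """Estimate the linear functional int f d(mu_phi), mu_phi ~ exp(-phi),
    by an ergodic average along the unadjusted Langevin chain
    x_{k+1} = x_k - h * grad_phi(x_k) + sqrt(2h) * xi_k  (Euclidean sketch)."""
    rng = np.random.default_rng(seed)
    x, total = np.asarray(x0, dtype=float), 0.0
    for _ in range(n_steps):
        x = x - step * grad_phi(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        total += f(x)
    return total / n_steps

# Example: for a standard Gaussian in R^2 (phi(x) = |x|^2 / 2),
# E[|X|^2] = 2 up to discretization and Monte Carlo error.
est = ula_estimate(lambda x: x, lambda x: float(np.dot(x, x)), np.zeros(2))
```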
arXiv Detail & Related papers (2023-12-22T18:01:11Z) - Sinkhorn Flow: A Continuous-Time Framework for Understanding and Generalizing the Sinkhorn Algorithm [49.45427072226592]
We introduce a continuous-time analogue of the Sinkhorn algorithm.
This perspective allows us to derive novel variants of Sinkhorn schemes that are robust to noise and bias.
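A minimal sketch of the discrete-to-continuous passage (our formulation, assuming the continuous-time analogue arises as a vanishing-step-size limit, as in the main paper above): damped log-domain Sinkhorn updates with step $\eta \in (0, 1]$, where $\eta = 1$ recovers the classical algorithm, plus a KL-based stopping check in the spirit of the Sinkhorn-LSI heuristic.

```python
import numpy as np
from scipy.special import logsumexp

def damped_sinkhorn(C, mu, nu, eps=0.1, eta=0.2, tol=1e-9, max_iter=20_000):
    """Log-domain Sinkhorn with damping eta (our sketch, not the paper's code).

    eta = 1 is the classical alternating update; small eta with time
    rescaled as t ~ eta * k approximates a continuous-time flow. We stop
    once the KL gap between the current second marginal and nu is < tol,
    an exponential-decay (LSI-motivated) stopping heuristic."""
    f, g = np.zeros_like(mu), np.zeros_like(nu)
    for k in range(max_iter):
        # Exact coordinate update, then relax toward it with step eta:
        f_star = eps * (np.log(mu) - logsumexp((g[None, :] - C) / eps, axis=1))
        f = (1.0 - eta) * f + eta * f_star
        g_star = eps * (np.log(nu) - logsumexp((f[:, None] - C) / eps, axis=0))
        g = (1.0 - eta) * g + eta * g_star
        # Second marginal of the coupling pi_ij ~ exp((f_i + g_j - C_ij)/eps):
        marg = np.exp(g / eps + logsumexp((f[:, None] - C) / eps, axis=0))
        if np.sum(nu * np.log(nu / marg)) < tol:   # KL(nu || marginal)
            return f, g, k
    return f, g, max_iter
```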
arXiv Detail & Related papers (2023-11-28T11:29:12Z) - Wasserstein Mirror Gradient Flow as the limit of the Sinkhorn Algorithm [2.7240657895633436]
We prove that the sequence of marginals obtained from the iterations of the Sinkhorn algorithm converges to an absolutely continuous curve on the $2$-Wasserstein space. This limit, which we call the Sinkhorn flow, is an example of a Wasserstein mirror gradient flow.
arXiv Detail & Related papers (2023-07-31T06:11:47Z) - Estimating 2-Sinkhorn Divergence between Gaussian Processes from Finite-Dimensional Marginals [4.416484585765028]
We study the convergence of estimating the 2-Sinkhorn divergence between Gaussian processes (GPs) using their finite-dimensional marginal distributions.
We show almost sure convergence of the divergence when the marginals are sampled according to some base measure.
arXiv Detail & Related papers (2021-02-05T16:17:55Z) - Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
arXiv Detail & Related papers (2020-10-19T15:23:18Z) - Debiased Sinkhorn barycenters [110.79706180350507]
Entropy regularization in optimal transport (OT) has driven much recent interest in Wasserstein metrics and barycenters in machine learning.
We show how this bias is tightly linked to the reference measure that defines the entropy regularizer.
We propose debiased Wasserstein barycenters that preserve the best of both worlds: fast Sinkhorn-like iterations without entropy smoothing.
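For context, the bias removal in this line of work is usually expressed through the Sinkhorn divergence; the standard definition (our notation, not quoted from the paper) is:

```latex
% Debiased (Sinkhorn) divergence built from entropic OT values OT_eps:
\[
  S_\varepsilon(\alpha, \beta)
  = \mathrm{OT}_\varepsilon(\alpha, \beta)
  - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\alpha, \alpha)
  - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\beta, \beta),
\]
% so that S_eps(alpha, alpha) = 0: the self-transport terms cancel the
% entropic smoothing bias that plain OT_eps(alpha, beta) carries.
```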
arXiv Detail & Related papers (2020-06-03T23:06:02Z) - Continuous-time quantum walks in the presence of a quadratic perturbation [55.41644538483948]
We address the properties of continuous-time quantum walks with Hamiltonians of the form $\mathcal{H} = L + \lambda L^2$, where $L$ is a graph Laplacian.
We consider cycle, complete, and star graphs, as these are paradigmatic models with low/high connectivity and/or symmetry.
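A small numerical sketch of this setup (our construction on a cycle graph; the parameter values are arbitrary, not from the paper):

```python
import numpy as np
from scipy.linalg import expm

d, lam = 6, 0.5
# Cycle-graph adjacency and Laplacian:
A = np.roll(np.eye(d), 1, axis=1) + np.roll(np.eye(d), -1, axis=1)
L = np.diag(A.sum(axis=1)) - A
H = L + lam * (L @ L)                  # perturbed Hamiltonian H = L + lambda L^2

# Continuous-time quantum walk started at node 0, evolved to time t:
t = 1.0
psi0 = np.zeros(d, dtype=complex)
psi0[0] = 1.0
psi_t = expm(-1j * t * H) @ psi0
site_probs = np.abs(psi_t) ** 2        # Born-rule site occupation probabilities
```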
arXiv Detail & Related papers (2020-05-13T14:53:36Z)