A semiconcavity approach to stability of entropic plans and exponential convergence of Sinkhorn's algorithm
- URL: http://arxiv.org/abs/2412.09235v2
- Date: Fri, 03 Oct 2025 09:20:09 GMT
- Title: A semiconcavity approach to stability of entropic plans and exponential convergence of Sinkhorn's algorithm
- Authors: Alberto Chiarini, Giovanni Conforti, Giacomo Greco, Luca Tamanini
- Abstract summary: We study stability of optimizers and convergence of Sinkhorn's algorithm for the entropic optimal transport problem. New applications include subspace elastic costs, weakly log-concave marginals, and marginals with light tails.
- Score: 3.686530147760242
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study stability of optimizers and convergence of Sinkhorn's algorithm for the entropic optimal transport problem. In the special case of the quadratic cost, our stability bounds imply that if one of the two entropic potentials is semiconcave, then the relative entropy between optimal plans is controlled by the squared Wasserstein distance between their marginals. When employed in the analysis of Sinkhorn's algorithm, this result gives a natural sufficient condition for its exponential convergence, which does not require the ground cost to be bounded. By controlling from above the Hessians of Sinkhorn potentials in examples of interest, we obtain new exponential convergence results. For instance, for the first time we obtain exponential convergence for log-concave marginals and quadratic costs for all values of the regularization parameter, based on semiconcavity propagation results. Moreover, the convergence rate has a linear dependence on the regularization: this behavior is sharp and had only been previously obtained for compact distributions (arXiv:2407.01202). These optimal rates are also established in situations where one of the two marginals does not have subgaussian tails. Other interesting new applications include subspace elastic costs, weakly log-concave marginals, marginals with light tails (where, under reinforced assumptions, we manage to improve the rates obtained in arXiv:2311.04041), the case of Lipschitz costs with bounded Hessian, and compact Riemannian manifolds.
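The Sinkhorn iterations analyzed in the abstract can be sketched in a few lines for discrete marginals. This is a minimal illustration under assumed toy data, not the paper's setting: the point clouds, weights, and regularization parameter below are made up, and the code encodes none of the paper's convergence-rate statements.

```python
import numpy as np

def sinkhorn(mu, nu, C, eps, n_iter=500):
    """Sinkhorn iterations for discrete entropic optimal transport.

    mu, nu : marginal weight vectors (each summing to 1)
    C      : cost matrix, e.g. squared Euclidean distances
    eps    : entropic regularization parameter
    """
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)               # scale to match the second marginal
        u = mu / (K @ v)                 # scale to match the first marginal
    return u[:, None] * K * v[None, :]   # entropic optimal plan

# toy example: two point clouds on the line, quadratic cost
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5])
C = (x[:, None] - y[None, :]) ** 2
mu = np.full(3, 1 / 3)
nu = np.full(2, 1 / 2)
P = sinkhorn(mu, nu, C, eps=0.1)
print(P.sum(axis=1), P.sum(axis=0))      # both marginals are recovered
```

For the quadratic cost used here, the paper's results concern how fast such iterations converge: exponentially, with a rate that degrades only linearly in eps, under semiconcavity of one entropic potential.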
Related papers
- Stability and Generalization of Push-Sum Based Decentralized Optimization over Directed Graphs [55.77845440440496]
Push-based decentralized communication enables optimization over communication networks where information exchange may be asymmetric. We develop a unified uniform-stability framework for the Stochastic Gradient Push (SGP) algorithm. A key technical ingredient is an imbalance-aware generalization bound through two quantities.
arXiv Detail & Related papers (2026-02-24T05:32:03Z) - From Tail Universality to Bernstein-von Mises: A Unified Statistical Theory of Semi-Implicit Variational Inference [0.12183405753834557]
Semi-implicit variational inference (SIVI) constructs approximate posteriors of the form $q(x) = \int k(x \mid z)\, r(\mathrm{d}z)$. This paper develops a unified "approximation-optimization-statistics" theory for such families.
arXiv Detail & Related papers (2025-12-05T19:26:25Z) - Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations [57.179679246370114]
We identify the distribution of random perturbations that minimizes the estimator's variance as the perturbation stepsize tends to zero. Our findings reveal that such desired perturbations can align directionally with the true gradient, instead of maintaining a fixed length.
arXiv Detail & Related papers (2025-10-22T19:06:39Z) - Quantum $f$-divergences and Their Local Behaviour: An Analysis via Relative Expansion Coefficients [4.30484058393522]
We study contraction and expansion coefficients, which can be combined into a single relative expansion coefficient. We identify new families of $f$ for which the global ($f$-divergence) and local (Riemannian) relative expansion coefficients coincide for every pair of channels. We prove a reverse quantum Markov convergence theorem, converting positive expansion coefficients into quantitative lower bounds on the convergence rate.
arXiv Detail & Related papers (2025-10-07T17:44:37Z) - Graph-based Clustering Revisited: A Relaxation of Kernel $k$-Means Perspective [73.18641268511318]
We propose a graph-based clustering algorithm that only relaxes the orthonormal constraint to derive clustering results. To handle the doubly stochastic constraint, we transform the non-negative constraint into a class-probability parameterization.
arXiv Detail & Related papers (2025-09-23T09:14:39Z) - Hessian stability and convergence rates for entropic and Sinkhorn potentials via semiconcavity [4.604003661048267]
This is the first work addressing this second-order quantitative stability estimate in general unbounded settings.
We deduce exponential convergence rates for gradients and Hessians of Sinkhorn potentials along Sinkhorn's algorithm.
arXiv Detail & Related papers (2025-04-15T12:34:09Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Convergence of Adam Under Relaxed Assumptions [72.24779199744954]
We show that Adam converges to $\epsilon$-stationary points with $O(\epsilon^{-4})$ gradient complexity under far more realistic conditions.
We also propose a variance-reduced version of Adam with an accelerated gradient complexity of $O(\epsilon^{-3})$.
arXiv Detail & Related papers (2023-04-27T06:27:37Z) - Randomized Coordinate Subgradient Method for Nonsmooth Composite
Optimization [11.017632675093628]
Coordinate-type subgradient methods for addressing nonsmooth problems are relatively underexplored due to the restrictive nature of the Lipschitz-type assumption.
arXiv Detail & Related papers (2022-06-30T02:17:11Z) - Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient steps with acceleration.
arXiv Detail & Related papers (2022-06-17T06:10:20Z) - On the Convergence of Semi-Relaxed Sinkhorn with Marginal Constraint and
OT Distance Gaps [20.661025590877774]
Semi-Relaxed Sinkhorn (SR-Sinkhorn) is an algorithm for the semi-relaxed optimal transport (SROT) problem.
This paper presents a comprehensive convergence analysis for SR-Sinkhorn.
arXiv Detail & Related papers (2022-05-27T09:19:05Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - Nearly Tight Convergence Bounds for Semi-discrete Entropic Optimal
Transport [0.483420384410068]
We derive nearly tight and non-asymptotic convergence bounds for solutions of entropic semi-discrete optimal transport.
Our results also entail a non-asymptotic and tight expansion of the difference between the entropic and the unregularized costs.
arXiv Detail & Related papers (2021-10-25T06:52:45Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the $\mathcal{O}(1/\epsilon)$ complexity lower bound for optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Optimal transport with $f$-divergence regularization and generalized
Sinkhorn algorithm [0.0]
Entropic regularization provides a generalization of the original optimal transport problem.
Replacing the Kullback-Leibler divergence with a general $f$-divergence leads to a natural generalization.
We propose a practical algorithm for computing the regularized optimal transport cost and its gradient.
arXiv Detail & Related papers (2021-05-29T16:37:31Z) - Linear Last-iterate Convergence in Constrained Saddle-point Optimization [48.44657553192801]
We significantly expand the understanding of last-iterate convergence for Optimistic Gradient Descent Ascent (OGDA) and Optimistic Multiplicative Weights Update (OMWU).
We show that when the equilibrium is unique, linear last-iterate convergence is achieved with a learning rate whose value is set to a universal constant.
We show that bilinear games over any polytope satisfy this condition and OGDA converges exponentially fast even without the unique equilibrium assumption.
arXiv Detail & Related papers (2020-06-16T20:53:04Z) - Better Theory for SGD in the Nonconvex World [2.6397379133308214]
Large-scale nonconvex optimization problems are ubiquitous in modern machine learning.
We perform experiments on the effects of a wide array of synthetic minibatch sizes on Stochastic Gradient Descent (SGD).
arXiv Detail & Related papers (2020-02-09T09:56:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.