Non-asymptotic convergence bounds for Sinkhorn iterates and their
gradients: a coupling approach
- URL: http://arxiv.org/abs/2304.06549v2
- Date: Mon, 26 Jun 2023 18:18:18 GMT
- Title: Non-asymptotic convergence bounds for Sinkhorn iterates and their
gradients: a coupling approach
- Authors: Giacomo Greco, Maxence Noble, Giovanni Conforti, Alain Durmus
- Abstract summary: We focus on a relaxation of the original OT problem, the entropic OT problem, which allows us to implement efficient and practical algorithmic solutions.
This formulation, also known as the Schrödinger Bridge problem, notably connects with Stochastic Optimal Control (SOC) and can be solved with the popular Sinkhorn algorithm.
- Score: 10.568851068989972
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational optimal transport (OT) has recently emerged as a powerful
framework with applications in various fields. In this paper we focus on a
relaxation of the original OT problem, the entropic OT problem, which allows us to
implement efficient and practical algorithmic solutions, even in high
dimensional settings. This formulation, also known as the Schrödinger Bridge
problem, notably connects with Stochastic Optimal Control (SOC) and can be
solved with the popular Sinkhorn algorithm. In the case of discrete-state
spaces, this algorithm is known to have exponential convergence; however,
achieving a similar rate of convergence in a more general setting is still an
active area of research. In this work, we analyze the convergence of the
Sinkhorn algorithm for probability measures defined on the $d$-dimensional
torus $\mathbb{T}_L^d$, that admit densities with respect to the Haar measure
of $\mathbb{T}_L^d$. In particular, we prove pointwise exponential convergence
of Sinkhorn iterates and their gradient. Our proof relies on the connection
between these iterates and the evolution along the Hamilton-Jacobi-Bellman
equations of value functions obtained from SOC-problems. Our approach is novel
in that it is purely probabilistic and relies on coupling by reflection
techniques for controlled diffusions on the torus.
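As a point of reference for the discrete-state case mentioned above, the Sinkhorn algorithm alternately rescales a Gibbs kernel so that the resulting plan matches the two prescribed marginals. A minimal illustrative sketch in NumPy follows; the function name, iteration count, and regularization value are our own choices, not taken from the paper:

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.1, n_iters=200):
    """Illustrative Sinkhorn iterations for discrete entropic OT.

    mu, nu : source/target probability vectors
    C      : cost matrix, shape (len(mu), len(nu))
    eps    : entropic regularization strength
    """
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(mu, dtype=float)
    for _ in range(n_iters):
        v = nu / (K.T @ u)               # scale to match the second marginal
        u = mu / (K @ v)                 # scale to match the first marginal
    return u[:, None] * K * v[None, :]   # approximate transport plan
```

In the continuous setting of the paper, the analogous updates act on the Schrödinger potentials, and the result above gives pointwise exponential convergence of these iterates and their gradients.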
Related papers
- Accelerating Sinkhorn Algorithm with Sparse Newton Iterations [14.094908995798757]
We present Sinkhorn-Newton-Sparse (SNS), an extension to the Sinkhorn algorithm.
SNS converges orders of magnitude faster across a wide range of practical cases.
arXiv Detail & Related papers (2024-01-20T21:23:09Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) - G(\mathbf{y}) + H(\mathbf{x},\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient...
arXiv Detail & Related papers (2022-06-17T06:10:20Z) - Hessian Averaging in Stochastic Newton Methods Achieves Superlinear
Convergence [69.65563161962245]
We consider minimizing a smooth and strongly convex objective function using a stochastic Newton method.
We show that there exists a universal weighted Hessian averaging scheme that transitions to local superlinear convergence at an optimal stage.
arXiv Detail & Related papers (2022-04-20T07:14:21Z) - Proximal denoiser for convergent plug-and-play optimization with
nonconvex regularization [7.0226402509856225]
Plug-and-Play (PnP) methods solve ill-posed inverse problems through proximal algorithms by replacing a proximal operator with a denoising operator, such as a neural network.
We show that this denoiser actually corresponds to a gradient function.
arXiv Detail & Related papers (2022-01-31T14:05:20Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - Stochastic optimization with momentum: convergence, fluctuations, and
traps avoidance [0.0]
In this paper, a general stochastic optimization procedure is studied, unifying several variants of gradient descent such as, among others, the heavy-ball method, the Stochastic Nesterov Accelerated Gradient (S-NAG), and the widely used Adam method.
Trap avoidance is studied by viewing the procedure as a noisy discretization of a non-autonomous ordinary differential equation.
arXiv Detail & Related papers (2020-12-07T19:14:49Z) - ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm [5.269633789700637]
We show that when a mild, one-point Hessian continuity condition is imposed, the rescaled last iterate of ROOT-SGD converges to a limit with the Cramér-Rao optimal covariance, for a broad range of step-size choices.
arXiv Detail & Related papers (2020-08-28T14:46:56Z) - On the Almost Sure Convergence of Stochastic Gradient Descent in
Non-Convex Problems [75.58134963501094]
This paper analyzes the trajectories of stochastic gradient descent (SGD).
We show that SGD avoids strict saddle points/manifolds with probability $1$ for a broad range of step-size policies.
arXiv Detail & Related papers (2020-06-19T14:11:26Z) - The Convergence Indicator: Improved and completely characterized
parameter bounds for actual convergence of Particle Swarm Optimization [68.8204255655161]
We introduce a new convergence indicator that can be used to determine whether the particles will eventually converge to a single point or diverge.
Using this convergence indicator we provide the actual bounds completely characterizing parameter regions that lead to a converging swarm.
arXiv Detail & Related papers (2020-06-06T19:08:05Z) - S-ADDOPT: Decentralized stochastic first-order optimization over
directed graphs [16.96562173221624]
Decentralized convex optimization is considered to minimize a sum of smooth and strongly convex cost functions distributed over the nodes of a directed network.
In particular, we propose the S-ADDOPT algorithm, which assumes a first-order stochastic oracle at each node.
For decaying step-sizes $\mathcal{O}(1/k)$, we show that S-ADDOPT reaches the exact solution sublinearly at $\mathcal{O}(1/k)$ and that its convergence is network-independent.
arXiv Detail & Related papers (2020-05-15T21:14:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.