Optimal transport with $f$-divergence regularization and generalized
Sinkhorn algorithm
- URL: http://arxiv.org/abs/2105.14337v1
- Date: Sat, 29 May 2021 16:37:31 GMT
- Title: Optimal transport with $f$-divergence regularization and generalized
Sinkhorn algorithm
- Authors: Dávid Terjék (1) and Diego González-Sánchez (1) ((1) Alfréd Rényi Institute of Mathematics)
- Abstract summary: Entropic regularization provides a generalization of the original optimal transport problem.
Replacing the Kullback-Leibler divergence with a general $f$-divergence leads to a natural generalization.
We propose a practical algorithm for computing the regularized optimal transport cost and its gradient.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Entropic regularization provides a generalization of the original optimal
transport problem. It introduces a penalty term defined by the Kullback-Leibler
divergence, making the problem more tractable via the celebrated Sinkhorn
algorithm. Replacing the Kullback-Leibler divergence with a general
$f$-divergence leads to a natural generalization. Using convex analysis, we
extend the theory developed so far to include $f$-divergences defined by
functions of Legendre type, and prove that under some mild conditions, strong
duality holds, optima in both the primal and dual problems are attained, the
generalization of the $c$-transform is well-defined, and we give sufficient
conditions for the generalized Sinkhorn algorithm to converge to an optimal
solution. We propose a practical algorithm for computing the regularized
optimal transport cost and its gradient via the generalized Sinkhorn algorithm.
Finally, we present experimental results on synthetic 2-dimensional data,
demonstrating how the choice of $f$-divergence used for regularization
influences convergence speed, numerical stability and sparsity of the optimal
coupling.
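For orientation, the entropically regularized discrete problem the abstract refers to is commonly written as $\min_{P \in \Pi(\mu,\nu)} \langle C, P \rangle + \lambda\,\mathrm{KL}(P \,\|\, \mu \otimes \nu)$, and the paper replaces the KL term with a general $f$-divergence. The following is a minimal NumPy sketch of the standard Sinkhorn iteration for the KL case only, included for illustration; the paper's generalized Sinkhorn replaces the closed-form row/column scalings below with updates derived from the convex conjugate of $f$, which is not reproduced here. All names and parameter values are placeholders, not the authors' code.

    import numpy as np

    def sinkhorn(mu, nu, C, reg=0.1, n_iters=1000, tol=1e-9):
        """Standard KL-regularized Sinkhorn between histograms mu, nu with cost matrix C."""
        K = np.exp(-C / reg)               # Gibbs kernel
        u, v = np.ones_like(mu), np.ones_like(nu)
        for _ in range(n_iters):
            u_prev = u
            u = mu / (K @ v)               # rescale rows to match mu
            v = nu / (K.T @ u)             # rescale columns to match nu
            if np.max(np.abs(u - u_prev)) < tol:
                break
        P = u[:, None] * K * v[None, :]    # approximate optimal coupling
        return P, float(np.sum(P * C))     # coupling and (unregularized) transport cost

    # Toy 2-dimensional example: two small point clouds with uniform weights.
    rng = np.random.default_rng(0)
    x, y = rng.normal(size=(5, 2)), rng.normal(size=(6, 2)) + 1.0
    C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)  # squared Euclidean cost
    mu, nu = np.full(5, 1 / 5), np.full(6, 1 / 6)
    P, cost = sinkhorn(mu, nu, C)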
Related papers
- Independently-Normalized SGD for Generalized-Smooth Nonconvex Optimization [19.000530691874516]
We show that many nonconvex machine learning problems satisfy this kind of generalized-smoothness condition, which extends beyond traditional smooth nonconvex settings.
We propose an independently-normalized gradient descent algorithm, which leverages independent sampling and normalization.
arXiv Detail & Related papers (2024-10-17T21:52:00Z)
- A Sinkhorn-type Algorithm for Constrained Optimal Transport [14.935188572016635]
This work focuses on a general class of OT problems under a combination of equality and inequality constraints.
We derive the corresponding entropy regularization formulation and introduce a Sinkhorn-type algorithm for such constrained OT problems supported by theoretical guarantees.
Overall, this work systematically combines recent theoretical and numerical advances in entropic optimal transport with the constrained case, allowing practitioners to derive approximate transport plans in complex scenarios.
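As a rough orientation for what such a constrained problem looks like (placeholder notation, not the paper's), an entropically regularized OT problem with additional equality and inequality constraints can be written schematically as
$$\min_{P \ge 0}\ \langle C, P\rangle + \lambda\,\mathrm{KL}(P \,\|\, \mu \otimes \nu)\quad \text{s.t.}\quad P\mathbf{1} = \mu,\ \ P^{\top}\mathbf{1} = \nu,\ \ \langle A_k, P\rangle = a_k,\ \ \langle B_l, P\rangle \le b_l,$$
where $A_k, a_k, B_l, b_l$ stand in for the extra constraints; the work above derives the corresponding entropy-regularized formulation and a Sinkhorn-type scheme for it.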
arXiv Detail & Related papers (2024-03-08T05:01:43Z)
- ALEXR: An Optimal Single-Loop Algorithm for Convex Finite-Sum Coupled Compositional Stochastic Optimization [53.14532968909759]
We introduce an efficient single-loop primal-dual block-coordinate algorithm, dubbed ALEXR.
We establish the convergence rates of ALEXR in both convex and strongly convex cases under smoothness and non-smoothness conditions.
We present lower complexity bounds to demonstrate that the convergence rates of ALEXR are optimal among first-order block-coordinate algorithms for the considered class of cFCCO problems.
arXiv Detail & Related papers (2023-12-04T19:00:07Z)
- Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback [75.29048190099523]
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions.
In this paper, we design a fully adaptive OGD algorithm, AdaOGD, that does not require a priori knowledge of these parameters.
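For reference (generic notation, not the paper's), the textbook projected online gradient descent update that such methods build on is
$$x_{t+1} = \Pi_{\mathcal{X}}\big(x_t - \eta_t g_t\big),$$
where $g_t$ is the gradient feedback received at round $t$ and $\Pi_{\mathcal{X}}$ is projection onto the feasible set; the contribution above is tuning $\eta_t$ adaptively without knowing the strong convexity or monotonicity parameters.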
arXiv Detail & Related papers (2023-10-21T18:38:13Z)
- Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithmic subroutine that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
- Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD [73.55632827932101]
We optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD.
We prove that, under a constraint guaranteeing low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance.
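Rendered schematically (placeholder notation, not the paper's), the claim above reads
$$\Sigma^{\star} \propto \big(\mathbb{E}\big[\nabla \ell(w)\,\nabla \ell(w)^{\top}\big]\big)^{1/2},$$
where $\Sigma^{\star}$ denotes the optimal SGLD noise covariance and $\nabla \ell(w)$ a per-sample gradient.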
arXiv Detail & Related papers (2021-10-26T15:02:27Z)
- Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process [56.55075925645864]
The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the complexity lower bound of $\mathcal{O}(1/\epsilon)$ established for convex optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z)
- Sharp global convergence guarantees for iterative nonconvex optimization: A Gaussian process perspective [30.524043513721168]
We develop a general recipe for analyzing the convergence of iterative algorithms for a class of regression models.
Deterministically, we accurately capture both the convergence rate of the algorithm and the eventual error floor in the finite-sample regime.
We show sharp convergence rates for both higher-order algorithms based on alternating updates and first-order algorithms based on subgradient descent.
arXiv Detail & Related papers (2021-09-20T21:48:19Z)
- Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
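For reference (generic notation, not the paper's), the Moreau envelope of a function $\varphi$ with parameter $\rho > 0$ is commonly defined as
$$\varphi_{\rho}(x) = \min_{y}\ \Big\{ \varphi(y) + \tfrac{1}{2\rho}\,\|x - y\|^{2} \Big\},$$
and for weakly convex problems the norm of its gradient, $\|\nabla \varphi_{\rho}(x)\|$, is the standard near-stationarity measure that the rate above refers to.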
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
- Better Theory for SGD in the Nonconvex World [2.6397379133308214]
Large-scale nonconvex optimization problems are ubiquitous in modern machine learning.
We perform experiments on synthetic problems studying the effect of a wide array of minibatch sizes on Stochastic Gradient Descent (SGD).
arXiv Detail & Related papers (2020-02-09T09:56:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.