Debiased Sinkhorn barycenters
- URL: http://arxiv.org/abs/2006.02575v1
- Date: Wed, 3 Jun 2020 23:06:02 GMT
- Title: Debiased Sinkhorn barycenters
- Authors: Hicham Janati, Marco Cuturi, Alexandre Gramfort
- Abstract summary: Entropy regularization in optimal transport (OT) has driven much of the recent interest in Wasserstein metrics and barycenters in machine learning.
We show how this bias is tightly linked to the reference measure that defines the entropy regularizer.
We propose debiased Wasserstein barycenters that preserve the best of both worlds: fast Sinkhorn-like iterations without entropy smoothing.
- Score: 110.79706180350507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Entropy regularization in optimal transport (OT) has driven much of the
recent interest in Wasserstein metrics and barycenters in machine learning.
It allows one to keep the appealing geometrical properties of the unregularized
Wasserstein distance while having a significantly lower complexity thanks to
Sinkhorn's algorithm. However, entropy brings some inherent smoothing bias,
resulting for example in blurred barycenters. This side effect has prompted an
increasing temptation in the community to settle for a slower algorithm such as
the log-domain stabilized Sinkhorn algorithm, which breaks the parallel structure that
can be leveraged on GPUs, or even to go back to unregularized OT. Here we show how this
bias is tightly linked to the reference measure that defines the entropy
regularizer and propose debiased Wasserstein barycenters that preserve the best
of both worlds: fast Sinkhorn-like iterations without entropy smoothing.
Theoretically, we prove that the entropic OT barycenter of univariate Gaussians
is a Gaussian and quantify its variance bias. This result is obtained by
extending the differentiability and convexity of entropic OT to sub-Gaussian
measures with unbounded supports. Empirically, we illustrate the reduced
blurring and the computational advantage on various applications.
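The debiasing mentioned above amounts, in essence, to averaging the input measures under the Sinkhorn divergence rather than the raw entropic OT cost. A hedged reading of the objective, with $\mathrm{OT}_\varepsilon$ the entropic OT cost and $w_k$ the barycentric weights:

$$S_\varepsilon(\alpha, \beta) = \mathrm{OT}_\varepsilon(\alpha, \beta) - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\alpha, \alpha) - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\beta, \beta), \qquad \beta^\star \in \arg\min_\beta \sum_k w_k\, S_\varepsilon(\alpha_k, \beta).$$

The sketch below illustrates the kind of "fast Sinkhorn-like iterations" the abstract refers to, for discrete histograms on a shared grid. It is a minimal NumPy illustration, not the authors' reference implementation: the variable names, stopping rule, and update order are our assumptions, and the debiasing update for `d` follows our reading of the paper's algorithm (setting `d` to ones throughout recovers the standard, entropy-biased iterative Bregman projection barycenter).

```python
import numpy as np

def debiased_sinkhorn_barycenter(A, C, eps, weights=None, n_iter=1000, tol=1e-8):
    """Sketch of a debiased Sinkhorn barycenter of discrete histograms.

    A: (n, K) array whose columns are histograms on a common grid.
    C: (n, n) pairwise cost matrix on that grid.
    eps: entropic regularization strength.
    """
    n, K_hists = A.shape
    weights = np.full(K_hists, 1.0 / K_hists) if weights is None else weights
    Kmat = np.exp(-C / eps)            # Gibbs kernel
    b = np.full(n, 1.0 / n)            # barycenter estimate
    d = np.ones(n)                     # debiasing term (d == 1 -> biased IBP barycenter)
    V = np.ones((n, K_hists))          # Sinkhorn scalings
    for _ in range(n_iter):
        b_old = b
        U = A / (Kmat @ V)                         # first Sinkhorn scaling
        KtU = Kmat.T @ U
        b = d * np.exp(np.log(KtU) @ weights)      # debiased weighted geometric mean
        V = b[:, None] / KtU                       # second Sinkhorn scaling
        d = np.sqrt(d * b / (Kmat @ d))            # debiasing fixed-point update
        if np.abs(b - b_old).max() < tol:
            break
    return b / b.sum()
```

Every update above is an elementwise operation or a product against the Gibbs kernel, which is what keeps the iteration parallel and GPU-friendly, in contrast to the log-domain stabilized variant mentioned in the abstract.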
Related papers
- Semi-Discrete Optimal Transport: Nearly Minimax Estimation With Stochastic Gradient Descent and Adaptive Entropic Regularization [38.67914746910537]
We prove an $\mathcal{O}(t^{-1})$ lower bound rate for the OT map, using the similarity between Laguerre cells estimation and density support estimation.
To nearly achieve the desired fast rate, we design an entropic regularization scheme decreasing with the number of samples.
arXiv Detail & Related papers (2024-05-23T11:46:03Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Solving High Frequency and Multi-Scale PDEs with Gaussian Processes [18.190228010565367]
PINNs often struggle to solve high-frequency and multi-scale PDEs.
We resort to the Gaussian process (GP) framework to solve this problem.
We use Kronecker product properties and multilinear algebra to promote computational efficiency and scalability.
arXiv Detail & Related papers (2023-11-08T05:26:58Z) - Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider clipped optimization problems with heavy-tailed noise with structured density.
We show that it is possible to get faster rates of convergence than $\mathcal{O}(K^{-(\alpha - 1)/\alpha})$ when the gradients have finite moments of order $\alpha$.
We prove that the resulting estimates have negligible bias and controllable variance.
arXiv Detail & Related papers (2023-11-07T17:39:17Z) - Computational Guarantees for Doubly Entropic Wasserstein Barycenters via
Damped Sinkhorn Iterations [0.0]
We propose and analyze an algorithm for computing doubly regularized Wasserstein barycenters.
An inexact variant of our algorithm, implementable using approximate Monte Carlo sampling, offers the first non-asymptotic convergence guarantees.
arXiv Detail & Related papers (2023-07-25T09:42:31Z) - Doubly Regularized Entropic Wasserstein Barycenters [0.0]
We study a general formulation of regularized Wasserstein barycenters that enjoys favorable regularity, approximation, stability and (grid-free) optimization properties.
arXiv Detail & Related papers (2023-03-21T13:42:43Z) - Robust computation of optimal transport by $\beta$-potential
regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in the machine learning field to measure the discrepancy between probability distributions.
We propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence.
We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
arXiv Detail & Related papers (2022-12-26T18:37:28Z) - Hessian Averaging in Stochastic Newton Methods Achieves Superlinear
Convergence [69.65563161962245]
We consider minimizing a smooth and strongly convex objective function using a stochastic Newton method.
We show that there exists a universal weighted averaging scheme that transitions to local convergence at an optimal stage.
arXiv Detail & Related papers (2022-04-20T07:14:21Z) - Estimating 2-Sinkhorn Divergence between Gaussian Processes from
Finite-Dimensional Marginals [4.416484585765028]
We study the convergence of estimating the 2-Sinkhorn divergence between Gaussian processes (GPs) using their finite-dimensional marginal distributions.
We show almost sure convergence of the divergence when the marginals are sampled according to some base measure.
arXiv Detail & Related papers (2021-02-05T16:17:55Z) - Sinkhorn Barycenter via Functional Gradient Descent [125.89871274202439]
We consider the problem of computing the barycenter of a set of probability distributions under the Sinkhorn divergence.
This problem has recently found applications across various domains, including graphics, learning, and vision.
We develop a novel functional gradient descent method named Sinkhorn Descent.
arXiv Detail & Related papers (2020-07-20T20:16:47Z)