Related papers: Properties of Discrete Sliced Wasserstein Losses

Properties of Discrete Sliced Wasserstein Losses

URL: http://arxiv.org/abs/2307.10352v6
Date: Mon, 8 Jul 2024 12:52:12 GMT
Title: Properties of Discrete Sliced Wasserstein Losses
Authors: Eloi Tanguy, Rémi Flamary, Julie Delon,
Abstract summary: The Sliced Wasserstein (SW) distance has become a popular alternative to the Wasserstein distance for comparing probability measures. Widespread applications include image processing, domain adaptation and generative modelling, where it is common to optimise some parameters in order to minimise SW. We investigate the regularity and optimisation properties of this energy, as well as its Monte-Carlo approximation $mathcalE_p$.
Score: 11.280151521887076
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Sliced Wasserstein (SW) distance has become a popular alternative to the Wasserstein distance for comparing probability measures. Widespread applications include image processing, domain adaptation and generative modelling, where it is common to optimise some parameters in order to minimise SW, which serves as a loss function between discrete probability measures (since measures admitting densities are numerically unattainable). All these optimisation problems bear the same sub-problem, which is minimising the Sliced Wasserstein energy. In this paper we study the properties of $\mathcal{E}: Y \longmapsto \mathrm{SW}_2^2(\gamma_Y, \gamma_Z)$, i.e. the SW distance between two uniform discrete measures with the same amount of points as a function of the support $Y \in \mathbb{R}^{n \times d}$ of one of the measures. We investigate the regularity and optimisation properties of this energy, as well as its Monte-Carlo approximation $\mathcal{E}_p$ (estimating the expectation in SW using only $p$ samples) and show convergence results on the critical points of $\mathcal{E}_p$ to those of $\mathcal{E}$, as well as an almost-sure uniform convergence and a uniform Central Limit result on the process $\mathcal{E}_p(Y)$. Finally, we show that in a certain sense, Stochastic Gradient Descent methods minimising $\mathcal{E}$ and $\mathcal{E}_p$ converge towards (Clarke) critical points of these energies.

Related papers

Nearly Optimal Robust Covariance and Scatter Matrix Estimation Beyond Gaussians [2.311583680973075]
We study the problem of computationally efficient robust estimation of the covariance/scatter matrix of elliptical distributions. We obtain the first efficiently computable, nearly optimal robust covariance estimators that extend beyond the Gaussian case.
arXiv Detail & Related papers (2025-02-10T15:31:57Z)
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation [22.652143194356864]
We address the problem of solving strongly convex and smooth problems using a descent gradient (SGD) algorithm with a constant step size. We provide an expansion of the mean-squared error of the resulting estimator with respect to the number iterations of $n$. We establish that this chain is geometrically ergodic with respect to a defined weighted Wasserstein semimetric.
arXiv Detail & Related papers (2024-10-07T15:02:48Z)
Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
We propose two variance reduced ZO estimators for complex gradient problems. We improve the state-of-the-art function complexities from $mathcalOleft(minfracdn1/2epsilon2, fracdepsilon3right)$ to $tildecalOleft(fracdepsilon2right)$.
arXiv Detail & Related papers (2024-10-03T15:04:01Z)
Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$) We show that $RW_p distances are also real distance metrics defined on the quotient set $mathcalP_p(mathbbRn)/sim$ invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z)
Wasserstein Gradient Flows of MMD Functionals with Distance Kernel and Cauchy Problems on Quantile Functions [2.3301643766310374]
We describe Wasserstein gradient flows of maximum mean discrepancy functionals $mathcal F_nu := textMMD_K2(cdot, nu)$ towards given target measures $nu$ on the real line. For certain $mathcal F_nu$-flows this implies that initial point measures instantly become absolutely continuous, and stay so over time.
arXiv Detail & Related papers (2024-08-14T12:28:21Z)
Max-sliced Wasserstein concentration and uniform ratio bounds of empirical measures on RKHS [9.783697404304025]
Optimal transport and the Wasserstein distance $mathcalW_p$ have recently seen a number of applications in the fields of statistics, machine learning, data science, and the physical sciences. These applications are however severely restricted by the curse of dimensionality, meaning that the number of data points needed to estimate these problems accurately increases exponentially in the dimension. We focus here on one of these variants, namely the max-sliced Wasserstein metric $overlinemathcalW_p$.
arXiv Detail & Related papers (2024-05-21T18:47:43Z)
Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators. Key to our solution is a novel projection technique based on ideas from harmonic analysis. Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z)
Transfer Operators from Batches of Unpaired Points via Entropic Transport Kernels [3.099885205621181]
We derive a maximum-likelihood inference functional, propose a computationally tractable approximation and analyze their properties. We prove a $Gamma$-convergence result showing that we can recover the true density from empirical approximations as the number $N$ of blocks goes to infinity.
arXiv Detail & Related papers (2024-02-13T12:52:41Z)
The Sample Complexity Of ERMs In Stochastic Convex Optimization [13.896417716930687]
We show that in fact $tildeO(fracdepsilon+frac1epsilon2)$ data points are also sufficient. We further generalize the result and show that a similar upper bound holds for all convex bodies.
arXiv Detail & Related papers (2023-11-09T14:29:25Z)
An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting. Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function. We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z)
Estimating 2-Sinkhorn Divergence between Gaussian Processes from Finite-Dimensional Marginals [4.416484585765028]
We study the convergence of estimating the 2-Sinkhorn divergence between emphGaussian processes (GPs) using their finite-dimensional marginal distributions. We show almost sure convergence of the divergence when the marginals are sampled according to some base measure.
arXiv Detail & Related papers (2021-02-05T16:17:55Z)
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP) We show that the number of samples needed to yield an entrywise $varepsilon$-accurate estimate of the Q-function is at most on the order of $frac1mu_min (1-gamma)5varepsilon2+ fract_mixmu_min (1-gamma)$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z)
A diffusion approach to Stein's method on Riemannian manifolds [65.36007959755302]
We exploit the relationship between the generator of a diffusion on $mathbf M$ with target invariant measure and its characterising Stein operator. We derive Stein factors, which bound the solution to the Stein equation and its derivatives. We imply that the bounds for $mathbb Rm$ remain valid when $mathbf M$ is a flat manifold.
arXiv Detail & Related papers (2020-03-25T17:03:58Z)
Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces [62.35667646858558]
We introduce the Anchor Energy (AE) and Anchor Wasserstein (AW) distances, which are respectively the energy and Wasserstein distances instantiated on such representations. Our main contribution is to propose a sweep line algorithm to compute AE emphexactly in log-quadratic time, where a naive implementation would be cubic. We show that AE and AW perform well in various experimental settings at a fraction of the computational cost of popular GW approximations.
arXiv Detail & Related papers (2020-02-05T03:09:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.