Estimation of entropy-regularized optimal transport maps between
non-compactly supported measures
- URL: http://arxiv.org/abs/2311.11934v1
- Date: Mon, 20 Nov 2023 17:18:21 GMT
- Title: Estimation of entropy-regularized optimal transport maps between
non-compactly supported measures
- Authors: Matthew Werenski, James M. Murphy, Shuchin Aeron
- Abstract summary: This paper addresses the problem of estimating entropy-regularized optimal transport maps with squared-Euclidean cost between source and target measures that are subGaussian.
- Score: 15.857723276537248
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the problem of estimating entropy-regularized optimal
transport (EOT) maps with squared-Euclidean cost between source and target
measures that are subGaussian. In the case that the target measure is compactly
supported or strongly log-concave, we show that for a recently proposed
in-sample estimator, the expected squared $L^2$-error decays at least as fast
as $O(n^{-1/3})$ where $n$ is the sample size. For the general subGaussian case
we show that the expected $L^1$-error decays at least as fast as $O(n^{-1/6})$,
and in both cases we have polynomial dependence on the regularization
parameter. While these results are suboptimal compared to known results in the
case of compactness of both the source and target measures (squared $L^2$-error
converging at a rate $O(n^{-1})$) and when the source is subGaussian while
the target is compactly supported (squared $L^2$-error converging at a rate
$O(n^{-1/2})$), their importance lies in eliminating the compact support
requirements. The proof technique makes use of a bias-variance decomposition
where the variance is controlled using standard concentration of measure
results and the bias is handled by $T_1$-transport inequalities along with sample
complexity results in estimation of EOT cost under subGaussian assumptions. Our
experimental results point to a looseness in controlling the variance terms and
we conclude by posing several open problems.
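To make the object of study concrete, here is a minimal numpy sketch of an in-sample entropic map estimator of this kind: run Sinkhorn iterations between the two empirical samples to obtain the entropic plan, then map each source point to the barycentric projection of its plan row. The plain-domain Sinkhorn loop, iteration count, and function names are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def sinkhorn_plan(X, Y, eps, n_iter=2000):
    """Entropic OT plan between uniform empirical measures with
    squared-Euclidean cost, via plain-domain Sinkhorn iterations
    (a log-domain implementation is safer for small eps)."""
    n, m = len(X), len(Y)
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # cost matrix
    K = np.exp(-C / eps)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)     # uniform marginals
    v = np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]                  # the plan pi

def entropic_map_estimate(X, Y, eps):
    """In-sample estimator: barycentric projection of the plan,
    T_hat(X_i) = sum_j pi_ij Y_j / sum_j pi_ij."""
    pi = sinkhorn_plan(X, Y, eps)
    return (pi @ Y) / pi.sum(axis=1, keepdims=True)
```

The rates quoted above then concern how fast such an estimator, built from $n$ samples of each measure, approaches the population entropic map in $L^2$ or $L^1$.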
Related papers
- Convergence of Unadjusted Langevin in High Dimensions: Delocalization of Bias [13.642712817536072]
We show that as the dimension $d$ of the problem increases, the number of iterations required to ensure convergence within a desired error increases.
A key technical challenge we address is the lack of a one-step contraction property in the $W_{2,\ell^\infty}$ metric used to measure convergence.
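For context, the unadjusted Langevin algorithm in question is the Euler discretization of the Langevin diffusion, $x_{k+1} = x_k - h\nabla U(x_k) + \sqrt{2h}\,\xi_k$; a minimal sketch (the Gaussian target, step size, and iteration count are illustrative assumptions):

```python
import numpy as np

def ula(grad_U, x0, step, n_steps, rng=np.random.default_rng(0)):
    """Unadjusted Langevin: x <- x - h * grad U(x) + sqrt(2h) * N(0, I)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - step * grad_U(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x

# Illustrative target: standard Gaussian in d = 100, U(x) = ||x||^2 / 2.
sample = ula(lambda x: x, x0=np.zeros(100), step=0.01, n_steps=5000)
```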
arXiv Detail & Related papers (2024-08-20T01:24:54Z)
- Score-based generative models are provably robust: an uncertainty quantification perspective [4.396860522241307]
We show that score-based generative models (SGMs) are provably robust to the multiple sources of error in practical implementation.
Our primary tool is the Wasserstein uncertainty propagation (WUP) theorem.
We show how errors due to (a) finite sample approximation, (b) early stopping, (c) score-matching objective choice, (d) score function parametrization, and (e) reference distribution choice, impact the quality of the generative model.
arXiv Detail & Related papers (2024-05-24T17:50:17Z)
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
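A hedged sketch of value estimation in a reproducing kernel Hilbert space: represent $V(s) = \sum_j \alpha_j k(s, s_j)$ over the observed states and solve a ridge-regularized Bellman-residual least squares. This is a generic kernel variant for illustration, not necessarily the paper's exact temporal difference estimator:

```python
import numpy as np

def kernel_value_estimate(S, R, S_next, gamma, lam=1e-3, sigma=1.0):
    """Fit V(s) = sum_j alpha_j k(s, s_j) by minimizing the regularized
    Bellman residual ||K a - (R + gamma K' a)||^2 + lam * a^T K a."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))            # Gaussian kernel
    K, Kn = k(S, S), k(S_next, S)                        # k(s_i,s_j), k(s'_i,s_j)
    A = K - gamma * Kn
    alpha = np.linalg.solve(A.T @ A + lam * K, A.T @ R)  # normal equations
    return lambda s: k(np.atleast_2d(s), S) @ alpha      # V(s)
```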
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
- Convergence of Gaussian-smoothed optimal transport distance with
sub-gamma distributions and dependent samples [12.77426855794452]
This paper provides convergence guarantees for estimating the Gaussian-smoothed optimal transport (GOT) distance under more general settings.
A key step in our analysis is to show that the GOT distance is dominated by a family of kernel maximum mean discrepancy distances.
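Since the argument bounds the GOT distance by kernel maximum mean discrepancies, here is a minimal numpy sketch of squared MMD with a Gaussian kernel (the bandwidth is an illustrative parameter; the biased V-statistic form is used for brevity):

```python
import numpy as np

def gaussian_mmd2(X, Y, sigma=1.0):
    """Squared MMD between samples X, Y for the Gaussian kernel
    k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```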
arXiv Detail & Related papers (2021-02-28T04:30:23Z)
- The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-\epsilon) X + \epsilon B$, where $X$ is a zero-mean Gaussian $\mathcal{N}(0, \Sigma)$ with unknown covariance $\Sigma$.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $\Omega(d^2)$ for $\epsilon$ an arbitrarily small constant and $\gamma$
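A small sketch of the contamination model: each draw comes from $\mathcal{N}(0, \Sigma)$ with probability $1-\epsilon$ and from an arbitrary distribution $B$ otherwise; the point-mass choice of $B$ below is purely illustrative:

```python
import numpy as np

def contaminated_sample(n, Sigma, eps, rng=np.random.default_rng(0)):
    """Draw n i.i.d. samples from the mixture (1 - eps) N(0, Sigma) + eps B."""
    d = Sigma.shape[0]
    clean = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
    bad = np.full((n, d), 5.0)          # illustrative adversarial B
    mask = rng.random(n) < eps          # which samples are contaminated
    return np.where(mask[:, None], bad, clean)
```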
arXiv Detail & Related papers (2020-12-31T18:24:41Z)
- Optimal Sub-Gaussian Mean Estimation in $\mathbb{R}$ [5.457150493905064]
We present a novel estimator with sub-Gaussian convergence.
Our estimator does not require prior knowledge of the variance.
Our estimator construction and analysis gives a framework generalizable to other problems.
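For reference, the classical route to sub-Gaussian confidence without variance knowledge is median-of-means; the paper's novel estimator differs, so the sketch below is only the standard baseline:

```python
import numpy as np

def median_of_means(x, k):
    """Split the sample into k blocks, average each block, and
    return the median of the block means."""
    blocks = np.array_split(np.asarray(x, dtype=float), k)
    return np.median([b.mean() for b in blocks])
```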
arXiv Detail & Related papers (2020-11-17T02:47:24Z)
- On Projection Robust Optimal Transport: Sample Complexity and Model
Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein (PRW) distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, obtained by averaging over subspaces rather than optimizing over them.
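A minimal sketch of the projected OT cost that PRW maximizes over subspaces (and IPRW averages over them): project both samples onto a fixed orthonormal basis $U$ and solve the resulting assignment problem. The equal sample sizes and the particular $U$ are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def projected_w2(X, Y, U):
    """Squared 2-Wasserstein cost between equal-size empirical measures
    after projecting onto the columns of the orthonormal matrix U."""
    Xp, Yp = X @ U, Y @ U
    C = ((Xp[:, None, :] - Yp[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(C)   # optimal matching
    return C[rows, cols].mean()
```

PRW would maximize this quantity over $k$-dimensional $U$; IPRW would average it over randomly drawn $U$.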
arXiv Detail & Related papers (2020-06-22T14:35:33Z)
- Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP).
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
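A minimal sketch of asynchronous Q-learning, which updates one entry of the Q-table per observed transition along a single Markovian trajectory (the learning rate and data format are illustrative assumptions):

```python
import numpy as np

def async_q_learning(trajectory, n_states, n_actions, gamma, lr=0.1):
    """Tabular Q-learning: for each (s, a, r, s') only the visited
    (s, a) entry is updated -- the asynchronous sampling setting."""
    Q = np.zeros((n_states, n_actions))
    for s, a, r, s_next in trajectory:
        target = r + gamma * Q[s_next].max()
        Q[s, a] += lr * (target - Q[s, a])
    return Q
```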
arXiv Detail & Related papers (2020-06-04T17:51:00Z)
- Breaking the Sample Size Barrier in Model-Based Reinforcement Learning
with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
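A minimal sketch of plain model-based planning with a generative model: estimate the transition kernel from simulator draws, then run value iteration on the empirical MDP. Here `sample_next` is a hypothetical stand-in for the simulator, and the iteration counts are illustrative:

```python
import numpy as np

def plan_from_generative_model(sample_next, R, S, A, n, gamma, iters=1000):
    """Build an empirical model from n draws per (s, a), then run
    value iteration on (P_hat, R) and return the greedy policy."""
    P_hat = np.zeros((S, A, S))
    for s in range(S):
        for a in range(A):
            for s_next in sample_next(s, a, n):   # n draws from P(. | s, a)
                P_hat[s, a, s_next] += 1.0 / n
    Q = np.zeros((S, A))
    for _ in range(iters):
        Q = R + gamma * P_hat @ Q.max(axis=1)     # Bellman optimality update
    return Q.argmax(axis=1)
```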
arXiv Detail & Related papers (2020-05-26T17:53:18Z)
- New Bounds For Distributed Mean Estimation and Variance Reduction [25.815612182815702]
We consider the problem of distributed mean estimation (DME), in which $n$ machines are each given a local $d$-dimensional vector $x_v \in \mathbb{R}^d$.
We show that our method yields practical improvements for common applications, relative to prior approaches.
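A generic DME baseline for illustration (not the paper's scheme): each machine sends an unbiased, stochastically quantized copy of its vector and the server averages the messages:

```python
import numpy as np

def stochastic_round(x, levels, lo, hi, rng):
    """Unbiased stochastic quantization of x onto a uniform grid in [lo, hi]."""
    step = (hi - lo) / (levels - 1)
    idx = np.clip((x - lo) / step, 0, levels - 1)
    low = np.floor(idx)
    up = rng.random(x.shape) < (idx - low)   # round up w.p. fractional part
    return lo + (low + up) * step

def distributed_mean(vectors, levels=16, rng=np.random.default_rng(0)):
    """Server-side average of the machines' quantized messages."""
    lo = min(v.min() for v in vectors)
    hi = max(v.max() for v in vectors)
    msgs = [stochastic_round(v, levels, lo, hi, rng) for v in vectors]
    return np.mean(msgs, axis=0)
```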
arXiv Detail & Related papers (2020-02-21T13:27:13Z)