Asymptotics of Entropy-Regularized Optimal Transport via Chaos
Decomposition
- URL: http://arxiv.org/abs/2011.08963v1
- Date: Tue, 17 Nov 2020 21:55:46 GMT
- Title: Asymptotics of Entropy-Regularized Optimal Transport via Chaos
Decomposition
- Authors: Zaid Harchaoui, Lang Liu, Soumik Pal
- Abstract summary: This paper is on the properties of a discrete Schr"odinger bridge as $N$ tends to infinity.
We derive the first two error terms of orders $N-1/2$ and $N-1$, respectively.
The kernels corresponding to the first and second order chaoses are given by Markov operators which have natural interpretations in the Sinkhorn algorithm.
- Score: 1.7188280334580195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Consider the problem of estimating the optimal coupling (i.e., matching)
between $N$ i.i.d. data points sampled from two densities $\rho_0$ and $\rho_1$
in $\mathbb{R}^d$. The cost of transport is an arbitrary continuous function
that satisfies suitable growth and integrability assumptions. For both
computational efficiency and smoothness, often a regularization term using
entropy is added to this discrete problem. We introduce a modification of the
commonly used discrete entropic regularization (Cuturi '13) such that the
optimal coupling for the regularized problem can be thought of as the static
Schr\"odinger bridge with $N$ particles. This paper is on the asymptotic
properties of this discrete Schr\"odinger bridge as $N$ tends to infinity. We
show that it converges to the continuum Schr\"odinger bridge and derive the
first two error terms of orders $N^{-1/2}$ and $N^{-1}$, respectively. This
gives us functional CLT, including for the cost of transport, and second order
Gaussian chaos limits when the limiting Gaussian variance is zero, extending
similar recent results derived for finite state spaces and the quadratic cost.
The proofs are based on a novel chaos decomposition of the discrete
Schr\"odinger bridge by polynomial functions of the pair of empirical
distributions as a first and second order Taylor approximations in the space of
measures. This is achieved by extending the Hoeffding decomposition from the
classical theory of U-statistics. The kernels corresponding to the first and
second order chaoses are given by Markov operators which have natural
interpretations in the Sinkhorn algorithm.
Related papers
- Optimal convergence rates in trace distance and relative entropy for the quantum central limit theorem [2.7855886538423182]
We show that for a centered $m$-mode quantum state with finite third-order moments, the trace distance between $rhoboxplus n$ and $rho_G$ decays at the optimal rate of $mathcalO(n-1/2)$.
For states with finite fourth-order moments, we prove that the relative entropy between $rhoboxplus n$ and $rho_G$ decays at the optimal rate of $mathcalO(n-1)$.
arXiv Detail & Related papers (2024-10-29T12:35:47Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over- parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Ito Diffusion Approximation of Universal Ito Chains for Sampling, Optimization and Boosting [64.0722630873758]
We consider rather general and broad class of Markov chains, Ito chains, that look like Euler-Maryama discretization of some Differential Equation.
We prove the bound in $W_2$-distance between the laws of our Ito chain and differential equation.
arXiv Detail & Related papers (2023-10-09T18:38:56Z) - Purity decay rate in random circuits with different configurations of
gates [0.0]
We study purity decay -- a measure for bipartite entanglement -- in a chain of $n$ under the action of various geometries.
In most circuits, purity decays to its value in two stages: the initial thermodynamically relevant decay up to extensive times is $sim lambda_mathrmeffeff$, with $lambda_mathrmeff$ not necessarily being in the spectrum of the transfer matrix.
arXiv Detail & Related papers (2022-11-24T12:32:07Z) - Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $min_mathbfxmax_mathbfyF(mathbfx) + H(mathbfx,mathbfy)$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present a emphaccelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragrad
arXiv Detail & Related papers (2022-06-17T06:10:20Z) - The Franke-Gorini-Kossakowski-Lindblad-Sudarshan (FGKLS) Equation for
Two-Dimensional Systems [62.997667081978825]
Open quantum systems can obey the Franke-Gorini-Kossakowski-Lindblad-Sudarshan (FGKLS) equation.
We exhaustively study the case of a Hilbert space dimension of $2$.
arXiv Detail & Related papers (2022-04-16T07:03:54Z) - Nearly Tight Convergence Bounds for Semi-discrete Entropic Optimal
Transport [0.483420384410068]
We derive nearly tight and non-asymptotic convergence bounds for solutions of entropic semi-discrete optimal transport.
Our results also entail a non-asymptotic and tight expansion of the difference between the entropic and the unregularized costs.
arXiv Detail & Related papers (2021-10-25T06:52:45Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - Convergence Rates of Stochastic Gradient Descent under Infinite Noise
Variance [14.06947898164194]
Heavy tails emerge in gradient descent (SGD) in various scenarios.
We provide convergence guarantees for SGD under a state-dependent and heavy-tailed noise with a potentially infinite variance.
Our results indicate that even under heavy-tailed noise with infinite variance, SGD can converge to the global optimum.
arXiv Detail & Related papers (2021-02-20T13:45:11Z) - Estimating 2-Sinkhorn Divergence between Gaussian Processes from
Finite-Dimensional Marginals [4.416484585765028]
We study the convergence of estimating the 2-Sinkhorn divergence between emphGaussian processes (GPs) using their finite-dimensional marginal distributions.
We show almost sure convergence of the divergence when the marginals are sampled according to some base measure.
arXiv Detail & Related papers (2021-02-05T16:17:55Z) - Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with an entropic regularization requires computing a $ntimes n$ kernel matrix that is repeatedly applied to a vector.
We propose to use instead ground costs of the form $c(x,y)=-logdotpvarphi(x)varphi(y)$ where $varphi$ is a map from the ground space onto the positive orthant $RRr_+$, with $rll n$.
arXiv Detail & Related papers (2020-06-12T10:21:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.