Stochastic Approximation versus Sample Average Approximation for
population Wasserstein barycenters
- URL: http://arxiv.org/abs/2001.07697v9
- Date: Mon, 25 Oct 2021 14:16:23 GMT
- Title: Stochastic Approximation versus Sample Average Approximation for
population Wasserstein barycenters
- Authors: Darina Dvinskikh
- Abstract summary: In the machine learning and optimization community, there are two main approaches for the convex risk problem, namely, the minimization Approximation (SA) and the Sample Average Approximation (SAA)
We show that for the Wasserstein barycenter problem this superiority can be inverted.
We provide a detailed comparison by stating the complexity bounds for the SA and the SAA implementations calculating barycenters defined with respect to optimal transport distances and entropy-regularized optimal transport distances.
- Score: 2.3025186469300434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the machine learning and optimization community, there are two main
approaches for the convex risk minimization problem, namely, the Stochastic
Approximation (SA) and the Sample Average Approximation (SAA). In terms of
oracle complexity (required number of stochastic gradient evaluations), both
approaches are considered equivalent on average (up to a logarithmic factor).
The total complexity depends on the specific problem, however, starting from
work \cite{nemirovski2009robust} it was generally accepted that the SA is
better than the SAA. % Nevertheless, in case of large-scale problems SA may run
out of memory as storing all data on one machine and organizing online access
to it can be impossible without communications with other machines. SAA in
contradistinction to SA allows parallel/distributed calculations. We show that
for the Wasserstein barycenter problem this superiority can be inverted. We
provide a detailed comparison by stating the complexity bounds for the SA and
the SAA implementations calculating barycenters defined with respect to optimal
transport distances and entropy-regularized optimal transport distances. As a
byproduct, we also construct confidence intervals for the barycenter defined
with respect to entropy-regularized optimal transport distances in the
$\ell_2$-norm. The preliminary results are derived for a general convex
optimization problem given by the expectation in order to have other
applications besides the Wasserstein barycenter problem.
Related papers
- Robust Barycenter Estimation using Semi-Unbalanced Neural Optimal Transport [84.51977664336056]
We propose a novel, scalable approach for estimating the textitrobust continuous barycenter.
Our method is framed as a $min$-$max$ optimization problem and is adaptable to textitgeneral cost function.
arXiv Detail & Related papers (2024-10-04T23:27:33Z) - Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
We propose two variance reduced ZO estimators for complex gradient problems.
We improve the state-of-the-art function complexities from $mathcalOleft(minfracdn1/2epsilon2, fracdepsilon3right)$ to $tildecalOleft(fracdepsilon2right)$.
arXiv Detail & Related papers (2024-10-03T15:04:01Z) - Mean field initialization of the Annealed Importance Sampling algorithm for an efficient evaluation of the Partition Function of Restricted Boltzmann Machines [0.0]
Annealed Importance Sampling (AIS) is a tool to estimate the partition function of a system.
We show that both the quality of the estimation and the cost of the computation can be significantly improved by using a properly selected mean-field starting probability distribution.
We conclude that these are good starting points to estimate the partition function with AIS with a relatively low computational cost.
arXiv Detail & Related papers (2024-04-17T10:22:03Z) - Metric Entropy-Free Sample Complexity Bounds for Sample Average Approximation in Convex Stochastic Programming [0.6906005491572401]
This paper studies sample average approximation (SAA) in solving convex or strongly convex programming (SP) problems.
We show -- perhaps for the first time -- that SAA's sample complexity can be completely free from any quantification of metric entropy.
arXiv Detail & Related papers (2024-01-01T04:35:53Z) - Improved dimension dependence of a proximal algorithm for sampling [16.147290924171692]
We propose a sampling algorithm that achieves superior complexity bounds in all the classical settings.
Our algorithm is based on the proximal sampler introduced incitetlee 2021.
For strongly log-concave distributions, our method has complexity bound $tildemathcalO(kappa d1/2)$ without warm start.
For distributions satisfying the LSI, our bound is $tilde mathcalO(hat kappa d1/2)$ where $hat kappa$ is the ratio between smoothness and
arXiv Detail & Related papers (2023-02-20T16:44:48Z) - Statistical, Robustness, and Computational Guarantees for Sliced
Wasserstein Distances [18.9717974398864]
Sliced Wasserstein distances preserve properties of classic Wasserstein distances while being more scalable for computation and estimation in high dimensions.
We quantify this scalability from three key aspects: (i) empirical convergence rates; (ii) robustness to data contamination; and (iii) efficient computational methods.
arXiv Detail & Related papers (2022-10-17T15:04:51Z) - Near-optimal estimation of smooth transport maps with kernel
sums-of-squares [81.02564078640275]
Under smoothness conditions, the squared Wasserstein distance between two distributions could be efficiently computed with appealing statistical error upper bounds.
The object of interest for applications such as generative modeling is the underlying optimal transport map.
We propose the first tractable algorithm for which the statistical $L2$ error on the maps nearly matches the existing minimax lower-bounds for smooth map estimation.
arXiv Detail & Related papers (2021-12-03T13:45:36Z) - Continuous Regularized Wasserstein Barycenters [51.620781112674024]
We introduce a new dual formulation for the regularized Wasserstein barycenter problem.
We establish strong duality and use the corresponding primal-dual relationship to parametrize the barycenter implicitly using the dual potentials of regularized transport problems.
arXiv Detail & Related papers (2020-08-28T08:28:06Z) - $\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a
Robust Divergence Estimator [95.71091446753414]
We propose to use a nearest-neighbor-based $gamma$-divergence estimator as a data discrepancy measure.
Our method achieves significantly higher robustness than existing discrepancy measures.
arXiv Detail & Related papers (2020-06-13T06:09:27Z) - Stochastic Saddle-Point Optimization for Wasserstein Barycenters [69.68068088508505]
We consider the populationimation barycenter problem for random probability measures supported on a finite set of points and generated by an online stream of data.
We employ the structure of the problem and obtain a convex-concave saddle-point reformulation of this problem.
In the setting when the distribution of random probability measures is discrete, we propose an optimization algorithm and estimate its complexity.
arXiv Detail & Related papers (2020-06-11T19:40:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.