Stochastic Optimization for Regularized Wasserstein Estimators
- URL: http://arxiv.org/abs/2002.08695v1
- Date: Thu, 20 Feb 2020 12:04:05 GMT
- Title: Stochastic Optimization for Regularized Wasserstein Estimators
- Authors: Marin Ballu, Quentin Berthet, Francis Bach
- Abstract summary: We introduce an algorithm to solve a regularized version of the problem of Wasserstein estimators gradient, with a time per step which is sublinear in the natural dimensions.
We show that this algorithm can be extended to other tasks, including estimation of Wasserstein barycenters.
- Score: 10.194798773447879
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimal transport is a foundational problem in optimization, that allows to
compare probability distributions while taking into account geometric aspects.
Its optimal objective value, the Wasserstein distance, provides an important
loss between distributions that has been used in many applications throughout
machine learning and statistics. Recent algorithmic progress on this problem
and its regularized versions have made these tools increasingly popular.
However, existing techniques require solving an optimization problem to obtain
a single gradient of the loss, thus slowing down first-order methods to
minimize the sum of losses, that require many such gradient computations. In
this work, we introduce an algorithm to solve a regularized version of this
problem of Wasserstein estimators, with a time per step which is sublinear in
the natural dimensions of the problem. We introduce a dual formulation, and
optimize it with stochastic gradient steps that can be computed directly from
samples, without solving additional optimization problems at each step. Doing
so, the estimation and computation tasks are performed jointly. We show that
this algorithm can be extended to other tasks, including estimation of
Wasserstein barycenters. We provide theoretical guarantees and illustrate the
performance of our algorithm with experiments on synthetic data.
Related papers
- Fast Screening Rules for Optimal Design via Quadratic Lasso
Reformulation [0.135975510645475]
In this work, we derive safe screening rules that can be used to discard inessential samples.
The new tests are much faster to compute, especially for problems involving a parameter space of high dimension.
We show how an existing homotopy algorithm to compute the regularization path of the lasso method can be reparametrized with respect to the squared $ell_$-penalty.
arXiv Detail & Related papers (2023-10-13T08:10:46Z) - Accelerating Cutting-Plane Algorithms via Reinforcement Learning
Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z) - A framework for bilevel optimization that enables stochastic and global
variance reduction algorithms [17.12280360174073]
Bilevel optimization is a problem of minimizing a value function which involves the arg-minimum of another function.
We introduce a novel framework, in which the solution of the inner problem, the solution of the linear system, and the main variable evolve at the same time.
We demonstrate that SABA, an adaptation of the celebrated SAGA algorithm in our framework, has $O(frac1T)$ convergence rate, and that it achieves linear convergence under Polyak-Lojasciewicz assumption.
arXiv Detail & Related papers (2022-01-31T18:17:25Z) - Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and deterministic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
By using this framework, our analysis shows these algorithms to match the computational complexity of methods that have access to an unbiased estimate of the gradient.
arXiv Detail & Related papers (2021-11-29T15:10:09Z) - Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between high-dimensional statistics and non-robust optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to station yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z) - Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning ofproblem coefficients with a optimization prob-lem that uses the predicted coefficients.
We show how to directlyexpress the loss of the optimization problem in terms of thepredicted coefficients as a piece-wise linear function.
We propose a novel divide and algorithm to tackle optimization problems without this restriction and predict itscoefficients using the optimization loss.
arXiv Detail & Related papers (2020-12-04T00:26:56Z) - Learning to solve TV regularized problems with unrolled algorithms [18.241062505073234]
Total Variation (TV) is a popular regularization strategy that promotes piece-wise constant signals.
We develop and characterize two approaches to do so, describe their benefits and limitations, and discuss the regime where they can actually improve over iterative procedures.
arXiv Detail & Related papers (2020-10-19T14:19:02Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - Provably Convergent Working Set Algorithm for Non-Convex Regularized
Regression [0.0]
This paper proposes a working set algorithm for non-regular regularizers with convergence guarantees.
Our results demonstrate high gain compared to the full problem solver for both block-coordinates or a gradient solver.
arXiv Detail & Related papers (2020-06-24T07:40:31Z) - Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.