A Riemannian Block Coordinate Descent Method for Computing the
Projection Robust Wasserstein Distance
- URL: http://arxiv.org/abs/2012.05199v3
- Date: Wed, 13 Jan 2021 22:11:48 GMT
- Title: A Riemannian Block Coordinate Descent Method for Computing the
Projection Robust Wasserstein Distance
- Authors: Minhui Huang, Shiqian Ma and Lifeng Lai
- Abstract summary: The Wasserstein distance has become increasingly important in machine learning and deep learning.
A recently proposed approach to alleviate the curse of dimensionality is to project the sampled data onto a lower-dimensional subspace, and then compute the Wasserstein distance between the projected data.
However, this approach requires solving a max-min problem over the Stiefel manifold, which is very challenging in practice.
We propose a Riemannian block coordinate descent (RBCD) method to solve this problem, based on a novel reformulation of the regularized max-min problem over the Stiefel manifold.
- Score: 36.97843660480747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Wasserstein distance has become increasingly important in machine
learning and deep learning. Despite its popularity, the Wasserstein distance is
hard to approximate because of the curse of dimensionality. A recently proposed
approach to alleviate the curse of dimensionality is to project the sampled
data from the high dimensional probability distribution onto a
lower-dimensional subspace, and then compute the Wasserstein distance between
the projected data. However, this approach requires solving a max-min problem
over the Stiefel manifold, which is very challenging in practice. The only
existing work that solves this problem directly is the RGAS (Riemannian
Gradient Ascent with Sinkhorn Iteration) algorithm, which requires solving an
entropy-regularized optimal transport problem in each iteration, and thus can
be costly for large-scale problems. In this paper, we propose a Riemannian
block coordinate descent (RBCD) method to solve this problem, which is based on
a novel reformulation of the regularized max-min problem over the Stiefel
manifold. We show that the complexity of arithmetic operations for RBCD to
obtain an $\epsilon$-stationary point is $O(\epsilon^{-3})$. This significantly
improves the corresponding complexity of RGAS, which is $O(\epsilon^{-12})$.
Moreover, our RBCD has very low per-iteration complexity, and hence is suitable
for large-scale problems. Numerical results on both synthetic and real datasets
demonstrate that our method is more efficient than existing methods, especially
when the number of sampled data is very large.
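The max-min structure described above can be made concrete with a short sketch. The Python snippet below is not the paper's RBCD algorithm; it is a minimal RGAS-style illustration under simplifying assumptions: the inner entropy-regularized optimal transport problem is solved by plain Sinkhorn iterations, the outer maximization over the Stiefel manifold uses a Riemannian gradient step with a QR retraction, and all names, step sizes, and regularization values (sinkhorn, prw_rgas, eta, step) are illustrative choices rather than the authors' settings.

```python
# Minimal sketch (RGAS-style alternation, not the paper's RBCD method).
# Projection robust Wasserstein (PRW) problem:
#   max_{U in St(d,k)}  min_{pi in Pi(mu,nu)}  sum_ij pi_ij * ||U^T (x_i - y_j)||^2
# The inner min is replaced by its entropic regularization and solved by Sinkhorn.
import numpy as np

def sinkhorn(cost, mu, nu, eta=0.1, n_iter=200):
    """Entropy-regularized OT plan for a given cost matrix (naive scaling;
    a log-domain version is needed when eta is very small)."""
    K = np.exp(-cost / eta)                      # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)
        u = mu / (K @ v)
    return u[:, None] * K * v[None, :]           # transport plan pi

def prw_rgas(X, Y, mu, nu, k, eta=0.1, step=0.05, n_outer=100, seed=0):
    """Riemannian gradient ascent over the Stiefel manifold St(d, k), with a
    Sinkhorn solve of the projected OT problem at every outer iteration."""
    d = X.shape[1]
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((d, k)))      # random orthonormal frame
    diff = X[:, None, :] - Y[None, :, :]                  # (n, m, d) pairwise differences
    for _ in range(n_outer):
        proj = np.einsum('nmd,dk->nmk', diff, U)          # U^T (x_i - y_j)
        cost = (proj ** 2).sum(axis=2)                    # projected squared distances
        pi = sinkhorn(cost, mu, nu, eta)
        # Euclidean gradient in U of sum_ij pi_ij ||U^T (x_i - y_j)||^2 is 2 * V_pi * U,
        # where V_pi = sum_ij pi_ij (x_i - y_j)(x_i - y_j)^T.
        V = np.einsum('nm,nmd,nme->de', pi, diff, diff)
        G = 2.0 * V @ U
        # Project G onto the tangent space of the Stiefel manifold, then retract via QR.
        rgrad = G - U @ ((U.T @ G + G.T @ U) / 2.0)
        U, _ = np.linalg.qr(U + step * rgrad)
    return float((pi * cost).sum()), U                    # PRW estimate and subspace
```

With uniform weights mu = np.ones(n)/n and nu = np.ones(m)/m for point clouds X (n x d) and Y (m x d), prw_rgas(X, Y, mu, nu, k=2) returns a PRW estimate and a learned 2-dimensional projection. The paper's RBCD avoids running Sinkhorn to convergence inside every outer iteration, which is the main source of its lower per-iteration cost; this sketch does not reproduce that reformulation.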
Related papers
- Fast Gradient Computation for Gromov-Wasserstein Distance [17.47684854121659]
The Gromov-Wasserstein distance is a notable extension of optimal transport.
The computation of the Gromov-Wasserstein distance and transport plan is expensive.
We propose a novel method to accelerate accurate gradient computation by dynamic programming techniques.
arXiv Detail & Related papers (2024-04-13T11:23:34Z)
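As context for the entry above: for the common square-loss Gromov-Wasserstein objective, the gradient with respect to the coupling admits a well-known factorized form (Peyre, Cuturi and Solomon) costing O(n^2 m + n m^2) instead of the naive O(n^2 m^2). The snippet below sketches only that baseline formula, not the dynamic-programming acceleration proposed in the cited paper; the function name gw_gradient is illustrative.

```python
# Baseline gradient of the square-loss Gromov-Wasserstein energy
#   E(T) = sum_{i,j,k,l} (C1[i,k] - C2[j,l])^2 * T[i,j] * T[k,l]
# with respect to the coupling T, using the standard factorization; this is the
# kind of quantity that faster schemes aim to accelerate.
import numpy as np

def gw_gradient(C1, C2, T, p, q):
    """Gradient of E at a feasible coupling T with marginals p (rows) and q (columns)."""
    n, m = T.shape
    # Terms that depend only on the marginals of T.
    constC = np.outer((C1 ** 2) @ p, np.ones(m)) + np.outer(np.ones(n), (C2 ** 2) @ q)
    # Cross term computed as two matrix products instead of a 4-way sum.
    return 2.0 * (constC - 2.0 * C1 @ T @ C2.T)
```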
- Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z)
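To make the idea behind the LOT Wassmap entry above more concrete: linearized optimal transport embeds each measure via its transport map to a fixed reference and then reduces dimension in that linear space. The sketch below illustrates that generic embedding step only, not the LOT Wassmap algorithm or its guarantees; it reuses the sinkhorn helper from the PRW sketch earlier on this page as a surrogate OT solver, and the barycentric projection, the plain PCA step, and the regularization value eta are simplifying assumptions.

```python
# Rough sketch of a linearized-OT (LOT) embedding for discrete point clouds.
# Assumes the sinkhorn(cost, mu, nu, eta) helper defined in the PRW sketch above.
import numpy as np

def lot_embeddings(point_clouds, reference, eta=0.05, n_components=2):
    """Embed each point cloud via the barycentric projection of its entropic OT
    plan to a fixed reference cloud, then reduce dimension with plain PCA."""
    r = len(reference)
    mu_ref = np.ones(r) / r
    rows = []
    for X in point_clouds:
        n = len(X)
        cost = ((reference[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        pi = sinkhorn(cost, mu_ref, np.ones(n) / n, eta)
        # Barycentric projection: surrogate Monge map T(ref_i) = E_pi[x | ref_i].
        T = (pi @ X) / pi.sum(axis=1, keepdims=True)
        rows.append((T - reference).ravel())        # tangent-space coordinates
    E = np.vstack(rows)
    E = E - E.mean(axis=0)
    _, _, Vt = np.linalg.svd(E, full_matrices=False)
    return E @ Vt[:n_components].T                  # low-dimensional embedding per measure
```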
- Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
Minimax algorithms for training neural networks have been developed to solve many problems.
In this paper, we propose two types of decentralized minimax algorithms over Riemannian manifolds.
For the stochastic setting, we propose DRSGDA and prove a gradient-complexity guarantee for this method.
arXiv Detail & Related papers (2023-02-08T01:42:45Z)
- Approximative Algorithms for Multi-Marginal Optimal Transport and Free-Support Wasserstein Barycenters [0.0]
We present two algorithms to approximate the solution of multi-marginal optimal transport (MOT) with squared Euclidean costs for $N$ discrete probability measures.
They are fast, memory-efficient and easy to implement and can be used with any sparse OT solver as a black box.
arXiv Detail & Related papers (2022-02-02T10:59:54Z)
- On the complexity of the optimal transport problem with graph-structured cost [9.24979291231758]
Multi-marginal optimal transport (MOT) is a generalization of optimal transport to multiple marginals.
The usage of MOT has been largely impeded by its computational complexity which scales exponentially in the number of marginals.
arXiv Detail & Related papers (2021-10-01T19:29:59Z)
- Partial Wasserstein Covering [10.52782170493037]
We consider a general task called partial Wasserstein covering with the goal of emulating a large dataset.
We model this problem as a discrete optimization problem with partial Wasserstein divergence as an objective function.
We show that we can efficiently make two datasets similar in terms of partial Wasserstein divergence, including driving scene datasets.
arXiv Detail & Related papers (2021-06-02T01:48:41Z)
- Projection Robust Wasserstein Barycenter [36.97843660480747]
Approximating the Wasserstein barycenter is numerically challenging because of the curse of dimensionality.
This paper proposes the projection robust Wasserstein barycenter (PRWB) that mitigates the curse of dimensionality.
arXiv Detail & Related papers (2021-02-05T19:23:35Z)
- Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample based bidirectional learning algorithm for training.
arXiv Detail & Related papers (2021-02-05T04:25:28Z)
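For context on the geodesic entry above: between two discrete measures, a Wasserstein-2 geodesic can be written down directly from an optimal coupling as McCann's displacement interpolation, which is the classical object the cited paper learns to approximate with neural networks in high dimensions. The sketch below shows only this classical construction under simplifying assumptions; it reuses the sinkhorn helper from the PRW sketch above, so the plan (and hence the interpolation) is entropic rather than exact, and the function name and parameters are illustrative.

```python
# Classical displacement interpolation between two discrete measures, using an
# entropic coupling as a surrogate for the exact optimal transport plan.
# Assumes the sinkhorn(cost, mu, nu, eta) helper from the PRW sketch above.
import numpy as np

def displacement_interpolation(X, Y, mu, nu, t, eta=0.05):
    """Support points and weights of the (approximate) W2 geodesic at time t in [0, 1]."""
    cost = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)   # squared Euclidean cost
    pi = sinkhorn(cost, mu, nu, eta)
    # Each unit of mass pi_ij travels along the segment from x_i to y_j.
    points = (1.0 - t) * X[:, None, :] + t * Y[None, :, :]      # (n, m, d) interpolants
    return points.reshape(-1, X.shape[1]), pi.ravel()           # support, weights
```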
- Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
- Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We show that HSDMPG can attain a bound of order $\mathcal{O}\big(1/\sqrt{tn}\big)$, which is at the order of the intrinsic excess error of the learning model.
A bound of the same order is also proved for more general loss functions.
arXiv Detail & Related papers (2020-09-18T02:18:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.