A Push-Relabel Based Additive Approximation for Optimal Transport
- URL: http://arxiv.org/abs/2203.03732v1
- Date: Mon, 7 Mar 2022 21:40:14 GMT
- Title: A Push-Relabel Based Additive Approximation for Optimal Transport
- Authors: Nathaniel Lahn, Sharath Raghvendra, Kaiyi Zhang
- Abstract summary: Exact algorithms for computing Optimal Transport can be slow.
We introduce a new and very simple approach to find an $varepsilon$approximation of the OT distance.
Our algorithm achieves a near-optimal execution time of $O(n2/varepsilon2)$ for computing OT distance.
- Score: 5.111364864495785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimal Transport is a popular distance metric for measuring similarity
between distributions. Exact algorithms for computing Optimal Transport can be
slow, which has motivated the development of approximate numerical solvers
(e.g. Sinkhorn method). We introduce a new and very simple combinatorial
approach to find an $\varepsilon$-approximation of the OT distance. Our
algorithm achieves a near-optimal execution time of $O(n^2/\varepsilon^2)$ for
computing OT distance and, for the special case of the assignment problem, the
execution time improves to $O(n^2/\varepsilon)$. Our algorithm is based on the
push-relabel framework for min-cost flow problems.
Unlike the other combinatorial approach (Lahn, Mulchandani and Raghvendra,
NeurIPS 2019) which does not have a fast parallel implementation, our algorithm
has a parallel execution time of $O(\log n/\varepsilon^2)$. Interestingly,
unlike the Sinkhorn algorithm, our method also readily provides a compact
transport plan as well as a solution to an approximate version of the dual
formulation of the OT problem, both of which have numerous applications in
Machine Learning. For the assignment problem, we provide both a CPU
implementation as well as an implementation that exploits GPU parallelism.
Experiments suggest that our algorithm is faster than the Sinkhorn algorithm,
both in terms of CPU and GPU implementations, especially while computing
matchings with a high accuracy.
Related papers
- An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue
Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z) - A One-Sample Decentralized Proximal Algorithm for Non-Convex Stochastic
Composite Optimization [10.762749887051546]
We propose two-time scale algorithms: ProxDAS-A and Proxcal$DASA-GT.
Unlike prior work, our algorithms achieve comparable complexity without requiring large batch sizes, more complex per-it operations, or stronger assumptions.
arXiv Detail & Related papers (2023-02-20T05:16:18Z) - Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithm that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z) - Improved Rate of First Order Algorithms for Entropic Optimal Transport [2.1485350418225244]
This paper improves the state-of-the-art rate of a first-order algorithm for solving entropy regularized optimal transport.
We propose an accelerated primal-dual mirror descent algorithm with variance reduction.
Our algorithm may inspire more research to develop accelerated primal-dual algorithms that have rate $widetildeO(n2/epsilon)$ for solving OT.
arXiv Detail & Related papers (2023-01-23T19:13:25Z) - Review of Serial and Parallel Min-Cut/Max-Flow Algorithms for Computer
Vision [6.574107319036238]
Hochbaum pseudoflow algorithm is fastest serial algorithm, Boykov-Kolmogorov algorithm is most memory efficient.
Existing parallel min-cut/max-flow algorithms can significantly outperform serial algorithms on large problems but suffers from added overhead on small to medium problems.
arXiv Detail & Related papers (2022-02-01T14:06:27Z) - RAMA: A Rapid Multicut Algorithm on GPU [23.281726932718232]
We propose a highly parallel primal-dual algorithm for the multicut (a.k.a.magnitude correlation clustering) problem.
Our algorithm produces primal solutions and dual lower bounds that estimate the distance to optimum.
We can solve very large scale benchmark problems with up to $mathcalO(108)$ variables in a few seconds with small primal-dual gaps.
arXiv Detail & Related papers (2021-09-04T10:33:59Z) - Asynchronous Stochastic Optimization Robust to Arbitrary Delays [54.61797739710608]
We consider optimization with delayed gradients where, at each time stept$, the algorithm makes an update using a stale computation - d_t$ for arbitrary delay $d_t gradient.
Our experiments demonstrate the efficacy and robustness of our algorithm in cases where the delay distribution is skewed or heavy-tailed.
arXiv Detail & Related papers (2021-06-22T15:50:45Z) - Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve the complexity of $mathcalO(epsilon-1.5)$, which outperforms all existing algorithms by the order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z) - Fast Parallel Algorithms for Euclidean Minimum Spanning Tree and
Hierarchical Spatial Clustering [6.4805900740861]
We introduce a new notion of well-separation to reduce the work and space of our algorithm for HDBSCAN$*$.
We show that our algorithms are theoretically efficient: they have work (number of operations) matching their sequential counterparts, and polylogarithmic depth (parallel time)
Our experiments on large real-world and synthetic data sets using a 48-core machine show that our fastest algorithms outperform the best serial algorithms for the problems by 11.13--55.89x, and existing parallel algorithms by at least an order of magnitude.
arXiv Detail & Related papers (2021-04-02T16:05:00Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving convex optimization algorithms both achieving the optimal bound on excess loss and using $O(minn, n2/d)$ gradient computations.
arXiv Detail & Related papers (2020-05-10T19:52:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.