On Unbalanced Optimal Transport: Gradient Methods, Sparsity and
Approximation Error
- URL: http://arxiv.org/abs/2202.03618v4
- Date: Mon, 8 Jan 2024 06:10:07 GMT
- Title: On Unbalanced Optimal Transport: Gradient Methods, Sparsity and
Approximation Error
- Authors: Quang Minh Nguyen, Hoang H. Nguyen, Yi Zhou, Lam M. Nguyen
- Abstract summary: We study the Unbalanced Optimal Transport (UOT) between two measures of possibly different masses with at most $n$ components.
We propose a novel algorithm based on the Gradient Extrapolation Method (GEM-UOT) to find an $\varepsilon$-approximate solution to the UOT problem.
- Score: 18.19398247972205
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We study the Unbalanced Optimal Transport (UOT) between two measures of
possibly different masses with at most $n$ components, where the marginal
constraints of standard Optimal Transport (OT) are relaxed via Kullback-Leibler
divergence with regularization factor $\tau$. Although only Sinkhorn-based UOT
solvers have been analyzed in the literature with the iteration complexity of
$O\big(\tfrac{\tau \log(n)}{\varepsilon}
\log\big(\tfrac{\log(n)}{\varepsilon}\big)\big)$ and per-iteration cost of
$O(n^2)$ for achieving the desired error $\varepsilon$, their positively dense
output transportation plans severely hinder their practicality. On the other
hand, although gradient methods are widely used as heuristics for computing UOT
in modern deep learning applications and have shown success on sparse OT
problems, they have not been formally studied for UOT. In this paper, we
propose a novel algorithm based on the Gradient Extrapolation Method (GEM-UOT) to
find an $\varepsilon$-approximate solution to the UOT problem in $O\big( \kappa
\log\big(\frac{\tau n}{\varepsilon}\big) \big)$ iterations with
$\widetilde{O}(n^2)$ per-iteration cost, where $\kappa$ is the condition number
depending only on the two input measures. Our proof technique is based on a
novel dual formulation of the squared $\ell_2$-norm UOT objective, which fills
a gap in the sparse UOT literature and also leads to a new characterization of
the approximation error between UOT and OT. Building on this, we further present
a novel approach for retrieving OT from UOT, based on GEM-UOT with a fine-tuned
$\tau$ and a post-processing projection step. Extensive experiments on synthetic
and real datasets validate our theories and demonstrate the favorable
performance of our methods in practice.
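GEM-UOT itself operates on a novel dual of the squared-$\ell_2$ UOT objective, which is not reproduced here. As a rough illustration of the primal problem being solved, the NumPy sketch below runs plain projected gradient descent on $\langle C, X\rangle + \frac{\eta}{2}\|X\|_2^2 + \tau\,\mathrm{KL}(X\mathbf{1}\,\|\,a) + \tau\,\mathrm{KL}(X^{\top}\mathbf{1}\,\|\,b)$; the step size, $\eta$, the warm start, and the iteration count are illustrative choices, not the paper's.

```python
import numpy as np

def uot_objective(X, C, a, b, tau, eta):
    """Squared-l2-regularized UOT objective with generalized-KL marginal penalties."""
    kl = lambda p, q: np.sum(p * np.log(p / q) - p + q)  # generalized KL divergence
    return (C * X).sum() + 0.5 * eta * (X ** 2).sum() \
        + tau * (kl(X.sum(1), a) + kl(X.sum(0), b))

def solve_uot_pgd(C, a, b, tau=1.0, eta=1e-2, lr=1e-3, iters=5000, floor=1e-12):
    """Projected gradient descent on the primal; a plain-vanilla stand-in
    for the paper's accelerated dual updates, not GEM-UOT itself."""
    X = np.outer(a, b) / a.sum()                  # warm start: scaled product coupling
    for _ in range(iters):
        r, c = X.sum(1), X.sum(0)                 # current row/column marginals
        grad = C + eta * X \
            + tau * (np.log(r / a)[:, None] + np.log(c / b)[None, :])
        X = np.maximum(X - lr * grad, floor)      # project onto the positive orthant
    return X

# Tiny usage example with unequal total masses (all values illustrative).
rng = np.random.default_rng(0)
a = rng.random(50) + 0.1
b = 2.0 * rng.random(50) + 0.1                    # roughly twice the mass of a
C = np.abs(rng.standard_normal((50, 50)))
X = solve_uot_pgd(C, a, b, tau=5.0)
print(uot_objective(X, C, a, b, tau=5.0, eta=1e-2))
```

The clamp at `floor` keeps the plan in the positive orthant while letting entries collapse toward zero, which is how sparsity can emerge under the squared-$\ell_2$ penalty, in contrast to the everywhere-positive plans of entropic Sinkhorn solvers.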
Related papers
- Accelerated Variance-Reduced Forward-Reflected Methods for Root-Finding Problems [8.0153031008486]
We propose a novel class of Nesterov's accelerated forward-reflected-based methods with variance reduction to solve root-finding problems.
Our algorithm is single-loop and leverages a new family of unbiased variance-reduced estimators specifically designed for root-finding problems.
arXiv Detail & Related papers (2024-06-04T15:23:29Z) - Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization [48.84672493756553]
We propose a query-efficient and accurate estimator for gradients that uses only $O\big(s\log\frac{d}{s}\big)$ queries per step.
Our proposed GraCe generalizes the Indyk--Price--Woodruff (IPW) algorithm in compressed sensing from linear measurements to nonlinear functions.
arXiv Detail & Related papers (2024-05-27T03:52:53Z) - On Partial Optimal Transport: Revising the Infeasibility of Sinkhorn and
Efficient Gradient Methods [17.14725907264431]
This paper studies the Partial Optimal Transport (POT) problem between two unbalanced measures with at most $n$ supports.
We propose a novel rounding algorithm for POT, and then provide a feasible Sinkhorn procedure with a revised complexity of $\widetilde{\mathcal{O}}(n^2/\varepsilon^4)$.
arXiv Detail & Related papers (2023-12-21T15:56:09Z) - A Specialized Semismooth Newton Method for Kernel-Based Optimal
Transport [92.96250725599958]
Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.
We show that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions.
arXiv Detail & Related papers (2023-10-21T18:48:45Z) - Horizon-Free and Variance-Dependent Reinforcement Learning for Latent
Markov Decision Processes [62.90204655228324]
We study regret minimization for reinforcement learning (RL) in Latent Markov Decision Processes (LMDPs) with context in hindsight.
We design a novel model-based algorithmic framework which can be instantiated with both a model-optimistic and a value-optimistic solver.
arXiv Detail & Related papers (2022-10-20T21:32:01Z) - Double Thompson Sampling in Finite stochastic Games [10.559241770674724]
We consider the trade-off between exploration and exploitation in a finite discounted Markov Decision Process.
We propose a double Thompson sampling reinforcement learning algorithm (DTS) to solve this kind of problem.
arXiv Detail & Related papers (2022-02-21T06:11:51Z) - Randomized Exploration for Reinforcement Learning with General Value
Function Approximation [122.70803181751135]
We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm.
Our algorithm drives exploration by simply perturbing the training data with judiciously chosen i.i.d. scalar noises.
We complement the theory with an empirical evaluation across known difficult exploration tasks.
arXiv Detail & Related papers (2021-06-15T02:23:07Z) - On Robust Optimal Transport: Computational Complexity, Low-rank
Approximation, and Barycenter Computation [14.80695185915604]
We consider two robust versions of optimal transport, named Robust Semi-constrained Optimal Transport (RSOT) and Robust Unconstrained Optimal Transport (ROT).
For both problems in the discrete setting, we propose Sinkhorn-based algorithms that produce $\varepsilon$-approximations of RSOT and ROT in $\widetilde{\mathcal{O}}(\frac{n^2}{\varepsilon})$ time (a generic Sinkhorn-style iteration is sketched after this list).
arXiv Detail & Related papers (2021-02-13T03:55:52Z) - Improved Sample Complexity for Incremental Autonomous Exploration in
MDPs [132.88757893161699]
We learn the set of $\epsilon$-optimal goal-conditioned policies attaining all states that are incrementally reachable within $L$ steps.
DisCo is the first algorithm that can return an $\epsilon/c_{\min}$-optimal policy for any cost-sensitive shortest-path problem.
arXiv Detail & Related papers (2020-12-29T14:06:09Z) - Hybrid Stochastic-Deterministic Minibatch Proximal Gradient:
Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We prove that HSDMPG can attain an $\mathcal{O}\big(1/\sqrt{tn}\big)$ bound, which is on the order of the excess error of the learning model.
arXiv Detail & Related papers (2020-09-18T02:18:44Z)
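For contrast, the Sinkhorn-based solvers discussed in the main abstract (and the Sinkhorn procedures in the POT and RSOT/ROT entries above) rely on multiplicative scaling iterations. The sketch below is the standard generalized Sinkhorn iteration for entropic UOT with KL-relaxed marginals, not any specific algorithm from these papers; `eps`, `tau`, and the iteration count are illustrative choices.

```python
import numpy as np

def sinkhorn_uot(C, a, b, tau=1.0, eps=0.05, iters=500):
    """Generalized Sinkhorn scaling for entropic UOT with KL-relaxed
    marginals (a generic sketch; parameters are illustrative)."""
    K = np.exp(-C / eps)                # Gibbs kernel, strictly positive
    u, v = np.ones_like(a), np.ones_like(b)
    rho = tau / (tau + eps)             # damping exponent from the KL prox step
    for _ in range(iters):
        u = (a / (K @ v)) ** rho
        v = (b / (K.T @ u)) ** rho
    return u[:, None] * K * v[None, :]  # transport plan diag(u) K diag(v)
```

Because $K = e^{-C/\varepsilon}$ is entrywise positive, the returned plan is fully dense, which is exactly the practicality issue that motivates the sparse, gradient-based approach sketched after the main abstract.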