Sparsity-Constrained Optimal Transport
- URL: http://arxiv.org/abs/2209.15466v2
- Date: Fri, 14 Apr 2023 13:24:43 GMT
- Title: Sparsity-Constrained Optimal Transport
- Authors: Tianlin Liu, Joan Puigcerver, Mathieu Blondel
- Abstract summary: Regularized optimal transport is now increasingly used as a loss or as a matching layer in neural networks.
We propose a new approach for OT with explicit cardinality constraints on the transportation plan.
Our method can be thought of as a middle ground between unregularized OT (recovered in the limit case $k=1$) and quadratically-regularized OT (recovered when $k$ is large enough).
- Score: 27.76137474217754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regularized optimal transport (OT) is now increasingly used as a loss or as a
matching layer in neural networks. Entropy-regularized OT can be computed using
the Sinkhorn algorithm but it leads to fully-dense transportation plans,
meaning that all sources are (fractionally) matched with all targets. To
address this issue, several works have investigated quadratic regularization
instead. This regularization preserves sparsity and leads to unconstrained and
smooth (semi) dual objectives, that can be solved with off-the-shelf gradient
methods. Unfortunately, quadratic regularization does not give direct control
over the cardinality (number of nonzeros) of the transportation plan. We
propose in this paper a new approach for OT with explicit cardinality
constraints on the transportation plan. Our work is motivated by an application
to sparse mixture of experts, where OT can be used to match input tokens such
as image patches with expert models such as neural networks. Cardinality
constraints ensure that at most $k$ tokens are matched with an expert, which is
crucial for computational performance reasons. Despite the nonconvexity of
cardinality constraints, we show that the corresponding (semi) dual problems
are tractable and can be solved with first-order gradient methods. Our method
can be thought of as a middle ground between unregularized OT (recovered in the
limit case $k=1$) and quadratically-regularized OT (recovered when $k$ is large
enough). The smoothness of the objectives increases as $k$ increases, giving
rise to a trade-off between convergence speed and sparsity of the optimal plan.
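To make the contrast above concrete, the following sketch (in JAX, with illustrative toy names; not the paper's solver) runs Sinkhorn on a small token-to-expert cost matrix, which yields a fully dense plan, and then applies a naive per-column top-$k$ truncation to show what the cardinality constraint "at most $k$ tokens per expert" demands of the plan's support. The paper enforces this constraint through a tractable (semi) dual problem rather than by truncation.

```python
import jax.numpy as jnp
from jax import random

def sinkhorn(C, a, b, eps=0.05, n_iters=500):
    """Entropy-regularized OT (Sinkhorn); the resulting plan is fully dense."""
    K = jnp.exp(-C / eps)                  # Gibbs kernel
    u = jnp.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                  # scale columns to match b
        u = a / (K @ v)                    # scale rows to match a
    return u[:, None] * K * v[None, :]     # transport plan

n_tokens, n_experts, k = 8, 3, 4           # toy sizes; k = capacity per expert
C = random.uniform(random.PRNGKey(0), (n_tokens, n_experts))  # toy costs
a = jnp.full(n_tokens, 1.0 / n_tokens)     # uniform mass over tokens
b = jnp.full(n_experts, 1.0 / n_experts)   # uniform mass over experts

P = sinkhorn(C, a, b)
print("nonzeros per expert (dense plan):", (P > 1e-9).sum(axis=0))

# What the cardinality constraint asks for: at most k nonzero entries per
# expert column. A naive top-k truncation only illustrates the target support;
# the paper instead obtains such plans from a tractable (semi) dual problem.
thresh = jnp.sort(P, axis=0)[-k, :]        # k-th largest entry in each column
P_k = jnp.where(P >= thresh, P, 0.0)
print("nonzeros per expert (k-sparse):", (P_k > 0).sum(axis=0))
```

With a small $k$, most entries of each column are forced to zero (approaching the maximally sparse, unregularized regime), while letting $k$ grow recovers the quadratically-regularized behaviour described in the abstract.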
Related papers
- Semi-Discrete Optimal Transport: Nearly Minimax Estimation With Stochastic Gradient Descent and Adaptive Entropic Regularization [38.67914746910537]
We prove an $\mathcal{O}(t^{-1})$ lower bound rate for the OT map, using the similarity between Laguerre cell estimation and density support estimation.
To nearly achieve the desired fast rate, we design an entropic regularization scheme decreasing with the number of samples.
arXiv Detail & Related papers (2024-05-23T11:46:03Z) - Anchor Space Optimal Transport: Accelerating Batch Processing of Multiple OT Problems [13.846014191157405]
We propose a translated OT problem designated as the anchor space optimal transport (ASOT) problem.
For the proposed ASOT problem, the distributions will be mapped into a shared anchor point space, which learns the potential common characteristics.
Based on the proposed ASOT, the Wasserstein distance error to the original OT problem is proven to be bounded by ground cost errors.
arXiv Detail & Related papers (2023-10-24T18:55:12Z) - OT-Net: A Reusable Neural Optimal Transport Solver [26.153287448650126]
A novel reusable neural OT solver, OT-Net, is presented.
OT-Net learns Brenier's height representation via a neural network to obtain its potential.
It then obtains the OT map by computing the gradient of the potential (see the sketch after this list).
arXiv Detail & Related papers (2023-06-14T04:11:38Z) - Unbalanced Low-rank Optimal Transport Solvers [38.79369155558385]
We propose algorithms to implement extensions for the linear OT problem and its Fused-Gromov-Wasserstein generalization.
The goal of this paper is to merge these two strains, to achieve the promise of both versatile/scalable unbalanced/low-rank OT solvers.
arXiv Detail & Related papers (2023-05-31T10:39:51Z) - Pseudonorm Approachability and Applications to Regret Minimization [73.54127663296906]
We convert high-dimensional $\ell_\infty$-approachability problems to low-dimensional pseudonorm approachability problems.
We develop an algorithmic theory of pseudonorm approachability, analogous to previous work on approachability for $\ell_2$ and other norms.
arXiv Detail & Related papers (2023-02-03T03:19:14Z) - Robust computation of optimal transport by $\beta$-potential regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in the machine learning field to measure the discrepancy between probability distributions.
We propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence.
We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
arXiv Detail & Related papers (2022-12-26T18:37:28Z) - Taming Hyperparameter Tuning in Continuous Normalizing Flows Using the JKO Scheme [60.79981399724534]
A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution.
We present JKO-Flow, an algorithm to solve OT-based CNFs without the need to tune $\alpha$.
arXiv Detail & Related papers (2022-11-30T05:53:21Z) - Kantorovich Strikes Back! Wasserstein GANs are not Optimal Transport? [138.1080446991979]
Wasserstein Generative Adversarial Networks (WGANs) are popular generative models built on the theory of Optimal Transport (OT) and the Kantorovich duality.
Despite the success of WGANs, it is still unclear how well the underlying OT dual solvers approximate the OT cost (Wasserstein-1 distance, $\mathbb{W}_1$) and the OT gradient needed to update the generator.
We construct 1-Lipschitz functions and use them to build ray monotone transport plans. This strategy yields pairs of continuous benchmark distributions with the analytically known OT plan, OT cost and OT gradient.
arXiv Detail & Related papers (2022-06-15T19:07:46Z) - Low-rank Optimal Transport: Approximation, Statistics and Debiasing [51.50788603386766]
The low-rank optimal transport (LOT) approach, advocated in Scetbon et al. (2021), is seen as a legitimate contender to entropic regularization when compared on properties of interest.
We target each of these areas in this paper in order to cement the impact of low-rank approaches in computational OT.
arXiv Detail & Related papers (2022-05-24T20:51:37Z) - Lagrangian Decomposition for Neural Network Verification [148.0448557991349]
A fundamental component of neural network verification is the computation of bounds on the values their outputs can take.
We propose a novel approach based on Lagrangian Decomposition.
We show that we obtain bounds comparable with off-the-shelf solvers in a fraction of their running time.
arXiv Detail & Related papers (2020-02-24T17:55:10Z) - Regularized Optimal Transport is Ground Cost Adversarial [34.81915836064636]
We show that regularization of the optimal transport problem can be interpreted as ground cost adversarial.
This gives access to a robust dissimilarity measure on the ground space, which can in turn be used in other applications.
arXiv Detail & Related papers (2020-02-10T17:28:35Z)
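Related to the OT-Net entry above: for a squared-Euclidean ground cost, Brenier's theorem identifies the OT map with the gradient of a convex potential. A minimal sketch of that gradient-of-potential relationship, with a fixed quadratic potential standing in for the learned network (names are illustrative, not OT-Net's actual architecture):

```python
import jax
import jax.numpy as jnp

# Stand-in for a learned Brenier potential; OT-Net would parameterize this
# with a neural network, here we use a fixed convex quadratic phi(x) = 0.5 x^T A x.
A = jnp.array([[2.0, 0.5],
               [0.5, 1.0]])                # symmetric positive-definite

def potential(x):
    return 0.5 * x @ A @ x

# Brenier: for squared-Euclidean cost, the OT map is the gradient of the
# convex potential, T(x) = grad phi(x). Autodiff gives the map directly.
transport_map = jax.grad(potential)

x = jnp.array([1.0, -2.0])
print(transport_map(x))                    # equals A @ x for this quadratic phi
```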