Sparsity-Constrained Optimal Transport
- URL: http://arxiv.org/abs/2209.15466v2
- Date: Fri, 14 Apr 2023 13:24:43 GMT
- Title: Sparsity-Constrained Optimal Transport
- Authors: Tianlin Liu, Joan Puigcerver, Mathieu Blondel
- Abstract summary: Regularized optimal transport is now increasingly used as a loss or as a matching layer in neural networks.
We propose a new approach for OT with explicit cardinality constraints on the transportation plan.
Our method can be thought of as a middle ground between unregularized OT (recovered in the limit case $k=1$) and quadratically-regularized OT (recovered when $k$ is large enough).
- Score: 27.76137474217754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regularized optimal transport (OT) is now increasingly used as a loss or as a
matching layer in neural networks. Entropy-regularized OT can be computed using
the Sinkhorn algorithm but it leads to fully-dense transportation plans,
meaning that all sources are (fractionally) matched with all targets. To
address this issue, several works have investigated quadratic regularization
instead. This regularization preserves sparsity and leads to unconstrained and
smooth (semi) dual objectives, that can be solved with off-the-shelf gradient
methods. Unfortunately, quadratic regularization does not give direct control
over the cardinality (number of nonzeros) of the transportation plan. We
propose in this paper a new approach for OT with explicit cardinality
constraints on the transportation plan. Our work is motivated by an application
to sparse mixture of experts, where OT can be used to match input tokens such
as image patches with expert models such as neural networks. Cardinality
constraints ensure that at most $k$ tokens are matched with an expert, which is
crucial for computational performance reasons. Despite the nonconvexity of
cardinality constraints, we show that the corresponding (semi) dual problems
are tractable and can be solved with first-order gradient methods. Our method
can be thought of as a middle ground between unregularized OT (recovered in the
limit case $k=1$) and quadratically-regularized OT (recovered when $k$ is large
enough). The smoothness of the objectives increases as $k$ increases, giving
rise to a trade-off between convergence speed and sparsity of the optimal plan.
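To make the contrast above concrete, the following sketch (in JAX, with illustrative toy names; not the paper's solver) runs Sinkhorn on a small token-to-expert cost matrix, which yields a fully dense plan, and then applies a naive per-column top-$k$ truncation to show what the cardinality constraint "at most $k$ tokens per expert" demands of the plan's support. The paper enforces this constraint through a tractable (semi) dual problem rather than by truncation.

```python
import jax.numpy as jnp
from jax import random

def sinkhorn(C, a, b, eps=0.05, n_iters=500):
    """Entropy-regularized OT (Sinkhorn); the resulting plan is fully dense."""
    K = jnp.exp(-C / eps)                  # Gibbs kernel
    u = jnp.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                  # scale columns to match b
        u = a / (K @ v)                    # scale rows to match a
    return u[:, None] * K * v[None, :]     # transport plan

n_tokens, n_experts, k = 8, 3, 4           # toy sizes; k = capacity per expert
C = random.uniform(random.PRNGKey(0), (n_tokens, n_experts))  # toy costs
a = jnp.full(n_tokens, 1.0 / n_tokens)     # uniform mass over tokens
b = jnp.full(n_experts, 1.0 / n_experts)   # uniform mass over experts

P = sinkhorn(C, a, b)
print("nonzeros per expert (dense plan):", (P > 1e-9).sum(axis=0))

# What the cardinality constraint asks for: at most k nonzero entries per
# expert column. A naive top-k truncation only illustrates the target support;
# the paper instead obtains such plans from a tractable (semi) dual problem.
thresh = jnp.sort(P, axis=0)[-k, :]        # k-th largest entry in each column
P_k = jnp.where(P >= thresh, P, 0.0)
print("nonzeros per expert (k-sparse):", (P_k > 0).sum(axis=0))
```

With a small $k$, most entries of each column are forced to zero (approaching the maximally sparse, unregularized regime), while letting $k$ grow recovers the quadratically-regularized behaviour described in the abstract.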
Related papers
- Semi-Discrete Optimal Transport: Nearly Minimax Estimation With Stochastic Gradient Descent and Adaptive Entropic Regularization [38.67914746910537]
We prove an $\mathcal{O}(t^{-1})$ lower bound rate for the OT map, using the similarity between Laguerre cell estimation and density support estimation.
To nearly achieve the desired fast rate, we design an entropic regularization scheme decreasing with the number of samples.
arXiv Detail & Related papers (2024-05-23T11:46:03Z) - Anchor Space Optimal Transport: Accelerating Batch Processing of Multiple OT Problems [13.846014191157405]
We propose a translated OT problem designated as the anchor space optimal transport (ASOT) problem.
For the proposed ASOT problem, the distributions will be mapped into a shared anchor point space, which learns the potential common characteristics.
Based on the proposed ASOT, the Wasserstein distance error to the original OT problem is proven to be bounded by ground cost errors.
arXiv Detail & Related papers (2023-10-24T18:55:12Z) - OT-Net: A Reusable Neural Optimal Transport Solver [26.153287448650126]
A novel reusable neural OT solver, OT-Net, is presented.
OT-Net learns Brenier's height representation via a neural network to obtain its potential.
It then obtains the OT map by computing the gradient of the potential (see the sketch after this list).
arXiv Detail & Related papers (2023-06-14T04:11:38Z) - Unbalanced Low-rank Optimal Transport Solvers [38.79369155558385]
We propose algorithms to implement extensions for the linear OT problem and its Fused-Gromov-Wasserstein generalization.
The goal of this paper is to merge these two strains, to achieve the promise of both versatile/scalable unbalanced/low-rank OT solvers.
arXiv Detail & Related papers (2023-05-31T10:39:51Z) - Pseudonorm Approachability and Applications to Regret Minimization [73.54127663296906]
We convert high-dimensional $\ell_\infty$-approachability problems to low-dimensional pseudonorm approachability problems.
We develop an algorithmic theory of pseudonorm approachability, analogous to previous work on approachability for $\ell_2$ and other norms.
arXiv Detail & Related papers (2023-02-03T03:19:14Z) - Robust computation of optimal transport by $\beta$-potential regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in the machine learning field to measure the discrepancy between probability distributions.
We propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence.
We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
arXiv Detail & Related papers (2022-12-26T18:37:28Z) - Taming Hyperparameter Tuning in Continuous Normalizing Flows Using the JKO Scheme [60.79981399724534]
A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution.
We present JKO-Flow, an algorithm to solve OT-based CNFs without the need to tune $\alpha$.
arXiv Detail & Related papers (2022-11-30T05:53:21Z) - Kantorovich Strikes Back! Wasserstein GANs are not Optimal Transport? [138.1080446991979]
Wasserstein Generative Adversarial Networks (WGANs) are popular generative models built on the theory of Optimal Transport (OT) and the Kantorovich duality.
Despite the success of WGANs, it is still unclear how well the underlying OT dual solvers approximate the OT cost (Wasserstein-1 distance, $\mathbb{W}_1$) and the OT gradient needed to update the generator.
We construct 1-Lipschitz functions and use them to build ray monotone transport plans. This strategy yields pairs of continuous benchmark distributions with the analytically known OT plan, OT cost and OT gradient.
arXiv Detail & Related papers (2022-06-15T19:07:46Z) - Low-rank Optimal Transport: Approximation, Statistics and Debiasing [51.50788603386766]
The low-rank optimal transport (LOT) approach, advocated in Scetbon et al. (2021), is seen as a legitimate contender to entropic regularization when compared on properties of interest.
We target each of these areas in this paper in order to cement the impact of low-rank approaches in computational OT.
arXiv Detail & Related papers (2022-05-24T20:51:37Z) - Lagrangian Decomposition for Neural Network Verification [148.0448557991349]
A fundamental component of neural network verification is the computation of bounds on the values their outputs can take.
We propose a novel approach based on Lagrangian Decomposition.
We show that we obtain bounds comparable with off-the-shelf solvers in a fraction of their running time.
arXiv Detail & Related papers (2020-02-24T17:55:10Z) - Regularized Optimal Transport is Ground Cost Adversarial [34.81915836064636]
We show that regularization of the optimal transport problem can be interpreted as ground cost adversarial.
This gives access to a robust dissimilarity measure on the ground space, which can in turn be used in other applications.
arXiv Detail & Related papers (2020-02-10T17:28:35Z)
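Related to the OT-Net entry above: for a squared-Euclidean ground cost, Brenier's theorem identifies the OT map with the gradient of a convex potential. A minimal sketch of that gradient-of-potential relationship, with a fixed quadratic potential standing in for the learned network (names are illustrative, not OT-Net's actual architecture):

```python
import jax
import jax.numpy as jnp

# Stand-in for a learned Brenier potential; OT-Net would parameterize this
# with a neural network, here we use a fixed convex quadratic phi(x) = 0.5 x^T A x.
A = jnp.array([[2.0, 0.5],
               [0.5, 1.0]])                # symmetric positive-definite

def potential(x):
    return 0.5 * x @ A @ x

# Brenier: for squared-Euclidean cost, the OT map is the gradient of the
# convex potential, T(x) = grad phi(x). Autodiff gives the map directly.
transport_map = jax.grad(potential)

x = jnp.array([1.0, -2.0])
print(transport_map(x))                    # equals A @ x for this quadratic phi
```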