Low-rank Optimal Transport: Approximation, Statistics and Debiasing
- URL: http://arxiv.org/abs/2205.12365v1
- Date: Tue, 24 May 2022 20:51:37 GMT
- Title: Low-rank Optimal Transport: Approximation, Statistics and Debiasing
- Authors: Meyer Scetbon, Marco Cuturi
- Abstract summary: The low-rank optimal transport (LOT) approach advocated in \cite{scetbon2021lowrank} holds several promises for scaling OT.
LOT can only be a legitimate contender to entropic regularization if the two compare well on properties of interest, both theoretical and practical.
We target each of these areas in this paper in order to cement the impact of low-rank approaches in computational OT.
- Score: 51.50788603386766
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The matching principles behind optimal transport (OT) play an increasingly
important role in machine learning, a trend which can be observed when OT is
used to disambiguate datasets in applications (e.g. single-cell genomics) or
used to improve more complex methods (e.g. balanced attention in transformers
or self-supervised learning). To scale to more challenging problems, there is a
growing consensus that OT requires solvers that can operate on millions, not
thousands, of points. The low-rank optimal transport (LOT) approach advocated
in \cite{scetbon2021lowrank} holds several promises in that regard, and was
shown to complement more established entropic regularization approaches, being
able to insert itself in more complex pipelines, such as quadratic OT. LOT
restricts the search for low-cost couplings to those that have a
low-nonnegative rank, yielding linear time algorithms in cases of interest.
However, these promises can only be fulfilled if the LOT approach is seen as a
legitimate contender to entropic regularization when compared on properties of
interest, where the scorecard typically includes theoretical properties
(statistical bounds, relation to other methods) or practical aspects
(debiasing, hyperparameter tuning, initialization). We target each of these
areas in this paper in order to cement the impact of low-rank approaches in
computational OT.
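To make the linear-time claim concrete, here is a minimal numpy sketch, not the authors' solver: with a squared-Euclidean ground cost, the cost matrix factorizes as C = A Bᵀ, so the cost of any low-rank coupling P = Q diag(1/g) Rᵀ can be evaluated in O((n+m)rd) without ever forming C or P. The rank-1 independent coupling below is only a feasibility toy; a real LOT solver optimizes Q, R, g under marginal constraints.
```python
import numpy as np

# Sketch: linear-time evaluation of <C, P> for a low-rank coupling
# P = Q diag(1/g) R^T, using the factorization C = A B^T of the
# squared-Euclidean cost. Not the authors' solver.
rng = np.random.default_rng(0)
n, m, d = 1000, 800, 3
x, y = rng.normal(size=(n, d)), rng.normal(size=(m, d))
a, b = np.full(n, 1 / n), np.full(m, 1 / m)      # uniform marginals

# Rank-(d+2) factorization of C_ij = ||x_i - y_j||^2 = A_i . B_j
A = np.hstack([np.sum(x**2, 1, keepdims=True), np.ones((n, 1)), -2 * x])
B = np.hstack([np.ones((m, 1)), np.sum(y**2, 1, keepdims=True), y])

# Toy rank-1 feasible coupling P = a b^T (Q = a, R = b, g = [1])
Q, R, g = a[:, None], b[:, None], np.ones(1)

# <C, P> = tr(diag(1/g) (Q^T A)(B^T R)), never materializing C or P
cost_lr = np.trace((Q.T @ A) @ (B.T @ R) / g[:, None])

# Dense O(nmd) check: same value up to float error
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
assert np.allclose(cost_lr, a @ C @ b)
print(cost_lr)
```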
Related papers
- Sample-Efficient Clustering and Conquer Procedures for Parallel Large-Scale Ranking and Selection [0.0]
In parallel computing environments, correlation-based clustering can achieve an $\mathcal{O}(p)$ sample complexity reduction rate.
In large-scale AI applications such as neural architecture search, a screening-free version of our procedure surprisingly surpasses fully-sequential benchmarks in terms of sample efficiency.
arXiv Detail & Related papers (2024-02-03T15:56:03Z)
- Energy-Guided Continuous Entropic Barycenter Estimation for General Costs [95.33926437521046]
We propose a novel algorithm for approximating the continuous Entropic OT (EOT) barycenter for arbitrary OT cost functions.
Our approach is built upon the dual reformulation of the EOT problem based on weak OT.
arXiv Detail & Related papers (2023-10-02T11:24:36Z)
- Entropic Neural Optimal Transport via Diffusion Processes [105.34822201378763]
We propose a novel neural algorithm for the fundamental problem of computing the entropic optimal transport (EOT) plan between continuous probability distributions.
Our algorithm is based on the saddle point reformulation of the dynamic version of EOT, known as the Schrödinger Bridge problem.
In contrast to the prior methods for large-scale EOT, our algorithm is end-to-end and consists of a single learning step.
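Since several entries here revolve around entropic OT, a minimal discrete-case Sinkhorn sketch may help fix ideas. It is background only, not the neural/continuous method above, and the toy sizes and regularization strength are arbitrary choices.
```python
import numpy as np

# Background sketch: classic Sinkhorn iterations for the discrete
# entropic OT plan (the paper above targets the continuous setting).
rng = np.random.default_rng(0)
n, m, eps = 50, 60, 0.1
x, y = rng.normal(size=(n, 1)), rng.normal(size=(m, 1))
a, b = np.full(n, 1 / n), np.full(m, 1 / m)   # uniform marginals

C = (x - y.T) ** 2                            # squared-distance cost
K = np.exp(-C / eps)                          # Gibbs kernel
u, v = np.ones(n), np.ones(m)
for _ in range(500):                          # alternate marginal fits
    u = a / (K @ v)
    v = b / (K.T @ u)
P = u[:, None] * K * v[None, :]               # entropic OT plan
print(np.sum(P * C), P.sum(1)[:3], a[:3])     # cost and marginal check
```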
arXiv Detail & Related papers (2022-11-02T14:35:13Z)
- STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization [74.1615979057429]
We investigate stochastic nonconvex optimization problems where the objective is an expectation over smooth loss functions.
Our work builds on the STORM algorithm, in conjunction with a novel approach to adaptively set the learning rate and momentum parameters.
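For orientation, here is a minimal sketch of the base STORM recursion that STORM+ builds on (Cutkosky & Orabona); STORM+'s adaptive learning-rate and momentum rules are omitted, and the constants a, lr below are hand-picked toy values.
```python
import numpy as np

# Sketch of the STORM variance-reduced momentum update on a toy
# stochastic quadratic f(x; xi) = 0.5 * ||x - xi||^2.
rng = np.random.default_rng(0)
x_star = np.array([2.0, -1.0])                 # minimizer of E f(x; xi)
grad = lambda x, xi: x - xi
sample = lambda: x_star + rng.normal(scale=0.5, size=2)

a, lr = 0.3, 0.1                               # fixed here; adaptive in STORM+
x = np.zeros(2)
d = grad(x, sample())                          # initial gradient estimate
for _ in range(500):
    x_prev, x = x, x - lr * d
    xi = sample()                              # ONE fresh sample ...
    # ... evaluated at both iterates: variance-reduced momentum
    d = grad(x, xi) + (1 - a) * (d - grad(x_prev, xi))
print(x)                                       # close to x_star
```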
arXiv Detail & Related papers (2021-11-01T15:43:36Z)
- Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning [92.05556163518999]
Multi-agent reinforcement learning (MARL) exacerbates matters by imposing various constraints on communication and observability.
For value-based methods, it poses challenges in accurately representing the optimal value function.
For policy gradient methods, it makes training the critic difficult and exacerbates the problem of the lagging critic.
We show that from a learning theory perspective, both problems can be addressed by accurately representing the associated action-value function.
arXiv Detail & Related papers (2021-05-31T23:08:05Z)
- Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization [85.84019017587477]
Distributionally robust supervised learning (DRSL) is emerging as a key paradigm for building reliable machine learning systems for real-world applications.
Existing algorithms for solving Wasserstein DRSL involve solving complex subproblems or fail to make use of gradients.
We revisit Wasserstein DRSL through the lens of min-max optimization and derive scalable and efficiently implementable extra-gradient algorithms.
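As a reference point for what "extra-gradient" means, here is a minimal sketch on a toy bilinear min-max problem min_x max_y xy, not the paper's Wasserstein-DRSL objective: take a lookahead gradient step, then update using gradients evaluated at the lookahead point.
```python
# Extra-gradient on min_x max_y x*y; plain simultaneous gradient
# descent-ascent spirals away from the saddle, extra-gradient converges.
x, y, lr = 1.0, 1.0, 0.1
for _ in range(200):
    # extrapolation (lookahead) step
    x_h = x - lr * y        # grad_x (x*y) = y
    y_h = y + lr * x        # grad_y (x*y) = x
    # update step, using gradients at the lookahead point
    x, y = x - lr * y_h, y + lr * x_h
print(x, y)                 # converges toward the saddle point (0, 0)
```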
arXiv Detail & Related papers (2021-04-27T16:56:09Z)
- Unbalanced minibatch Optimal Transport; applications to Domain Adaptation [8.889304968879163]
Optimal transport distances have found many applications in machine learning for their capacity to compare non-parametric probability distributions.
We argue that the minibatch strategy, coupled with unbalanced optimal transport, can yield more robust behavior.
Our experimental study shows that in challenging problems associated to domain adaptation, the use of unbalanced optimal transport leads to significantly better results, competing with or surpassing recent baselines.
arXiv Detail & Related papers (2021-03-05T11:15:47Z)
- Minibatch optimal transport distances; analysis and applications [9.574645423576932]
Optimal transport distances have become a classic tool to compare probability distributions and have found many applications in machine learning.
A common workaround is to compute these distances on minibatches to average the outcome of several smaller optimal transport problems.
We propose in this paper an extended analysis of this practice, whose effects were previously studied only in restricted cases.
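A minimal sketch of the minibatch estimator described above, using the POT library: average exact OT costs over random minibatch pairs instead of solving one large problem. The batch size and number of pairs are illustrative choices.
```python
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 2))
Y = rng.normal(loc=1.0, size=(10_000, 2))

def minibatch_ot(X, Y, batch=256, n_pairs=50):
    """Average exact OT costs over random minibatch pairs."""
    costs = []
    for _ in range(n_pairs):
        xb = X[rng.choice(len(X), batch, replace=False)]
        yb = Y[rng.choice(len(Y), batch, replace=False)]
        M = ot.dist(xb, yb)  # squared-Euclidean cost matrix
        costs.append(ot.emd2(ot.unif(batch), ot.unif(batch), M))
    return np.mean(costs)

print(minibatch_ot(X, Y))  # biased but cheap estimate of OT(X, Y)
```
Swapping `ot.emd2` for an unbalanced solver (POT exposes `ot.sinkhorn_unbalanced2` for this, if I recall its API correctly) gives the more robust variant advocated in the unbalanced-minibatch entry above.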
arXiv Detail & Related papers (2021-01-05T21:29:31Z)
- Regularized Optimal Transport is Ground Cost Adversarial [34.81915836064636]
We show that regularization of the optimal transport problem can be interpreted as ground cost adversarial.
This gives access to a robust dissimilarity measure on the ground space, which can in turn be used in other applications.
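For intuition, a short duality sketch (my reconstruction, not the paper's exact statement): writing a convex regularizer F through its Fenchel conjugate F* and swapping min and max over the compact transport polytope Π(a, b) exhibits an adversarial perturbation H of the ground cost C.
```latex
% F(P) = \sup_H \langle H, P\rangle - F^*(H); min/max swap by Sion's
% minimax theorem, since \Pi(a,b) is compact and convex:
\min_{P \in \Pi(a,b)} \langle C, P\rangle + F(P)
  = \max_{H} \Big[ \min_{P \in \Pi(a,b)} \langle C + H, P\rangle - F^*(H) \Big]
  = \max_{H} \; \mathrm{OT}_{C+H}(a,b) - F^*(H).
```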
arXiv Detail & Related papers (2020-02-10T17:28:35Z)