Unbalanced minibatch Optimal Transport; applications to Domain
Adaptation
- URL: http://arxiv.org/abs/2103.03606v1
- Date: Fri, 5 Mar 2021 11:15:47 GMT
- Title: Unbalanced minibatch Optimal Transport; applications to Domain
Adaptation
- Authors: Kilian Fatras, Thibault Séjourné, Nicolas Courty, Rémi Flamary
- Abstract summary: Optimal transport distances have found many applications in machine learning for their capacity to compare non-parametric probability distributions.
We argue that the same minibatch strategy coupled with unbalanced optimal transport can yield more robust behavior.
Our experimental study shows that in challenging problems associated with domain adaptation, the use of unbalanced optimal transport leads to significantly better results, competing with or surpassing recent baselines.
- Score: 8.889304968879163
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimal transport distances have found many applications in machine learning
for their capacity to compare non-parametric probability distributions. Yet
their algorithmic complexity generally prevents their direct use on large scale
datasets. Among the possible strategies to alleviate this issue, practitioners
can rely on computing estimates of these distances over subsets of data, i.e.,
minibatches. While computationally appealing, we highlight in this paper
some limits of this strategy, arguing it can lead to undesirable smoothing
effects. As an alternative, we suggest that the same minibatch strategy coupled
with unbalanced optimal transport can yield more robust behavior. We discuss
the associated theoretical properties, such as unbiased estimators, existence
of gradients and concentration bounds. Our experimental study shows that in
challenging problems associated with domain adaptation, the use of unbalanced
optimal transport leads to significantly better results, competing with or
surpassing recent baselines.
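
The idea in the abstract, solving an unbalanced OT problem on each minibatch and averaging the results, can be illustrated with a small sketch. This is not the authors' implementation: it is a pure-Python toy on 1-D points, the function names (`squared_cost`, `unbalanced_sinkhorn`, `minibatch_uot_cost`) and the values of `eps` and `tau` are assumptions chosen for illustration, and it averages only the transport cost term rather than a full unbalanced OT objective with its marginal penalty terms. It uses the standard scaling iterations for entropic unbalanced OT with KL marginal penalties and uniform weights.

```python
import math
import random


def squared_cost(xs, ys):
    """Pairwise squared-distance cost matrix between two lists of 1-D points."""
    return [[(x - y) ** 2 for y in ys] for x in xs]


def unbalanced_sinkhorn(cost, eps=0.1, tau=1.0, n_iter=200):
    """Entropic unbalanced OT plan with KL marginal penalties.

    Uniform weights are assumed on both sides. `eps` (entropic regularization)
    and `tau` (marginal relaxation) are illustrative defaults. The updates are
    not in the log domain, so they can underflow for small `eps` or very
    large costs; they are adequate for the toy data below.
    """
    n, m = len(cost), len(cost[0])
    a, b = [1.0 / n] * n, [1.0 / m] * m
    kernel = [[a[i] * b[j] * math.exp(-cost[i][j] / eps) for j in range(m)]
              for i in range(n)]
    power = tau / (tau + eps)  # an exponent below 1 relaxes the marginals
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        for i in range(n):
            kv = sum(kernel[i][j] * v[j] for j in range(m))
            u[i] = (a[i] / kv) ** power
        for j in range(m):
            ktu = sum(kernel[i][j] * u[i] for i in range(n))
            v[j] = (b[j] / ktu) ** power
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def minibatch_uot_cost(xs, ys, batch_size, n_batches, rng, **kwargs):
    """Average transport cost <P, C> over randomly drawn minibatch pairs."""
    total = 0.0
    for _ in range(n_batches):
        xb, yb = rng.sample(xs, batch_size), rng.sample(ys, batch_size)
        cost = squared_cost(xb, yb)
        plan = unbalanced_sinkhorn(cost, **kwargs)
        total += sum(p * c for prow, crow in zip(plan, cost)
                     for p, c in zip(prow, crow))
    return total / n_batches


# Toy check: four source points, three nearby targets and one far outlier.
source = [0.0, 0.1, 0.2, 0.3]
target = [0.05, 0.15, 0.25, 5.0]
plan = unbalanced_sinkhorn(squared_cost(source, target))
outlier_mass = sum(row[3] for row in plan)
total_mass = sum(sum(row) for row in plan)
print(f"mass sent to outlier: {outlier_mass:.2e} (balanced OT would force 0.25)")
print(f"total transported mass: {total_mass:.3f}")

# Minibatch estimate on a larger toy dataset with two outliers in the target.
xs = [i / 20 for i in range(20)]
ys = [0.5 + i / 20 for i in range(18)] + [4.0, 4.5]
loss = minibatch_uot_cost(xs, ys, batch_size=4, n_batches=10,
                          rng=random.Random(0))
print(f"minibatch UOT transport cost: {loss:.4f}")
```

Because the marginal constraints are only penalized, the plan is free to send almost no mass to the distant target point, whereas a balanced plan would have to transport that point's full weight. For real workloads a maintained library with log-domain stabilization would be a more sensible choice than this sketch.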
Related papers
- OTClean: Data Cleaning for Conditional Independence Violations using
Optimal Transport [51.6416022358349]
OTClean is a framework that harnesses optimal transport theory for data repair under Conditional Independence (CI) constraints.
We develop an iterative algorithm inspired by Sinkhorn's matrix scaling algorithm, which efficiently addresses high-dimensional and large-scale data.
arXiv Detail & Related papers (2024-03-04T18:23:55Z)
- Budget-Constrained Bounds for Mini-Batch Estimation of Optimal Transport [35.440243358517066]
We introduce novel families of upper and lower bounds for the Optimal Transport problem constructed by aggregating solutions of mini-batch OT problems.
The upper bound family contains traditional mini-batch averaging at one extreme and a tight bound found by optimal coupling of mini-batches at the other.
Through various experiments, we explore the trade-off between computational budget and bound tightness and show the usefulness of these bounds in computer vision applications.
arXiv Detail & Related papers (2022-10-24T22:12:17Z)
- InfoOT: Information Maximizing Optimal Transport [58.72713603244467]
InfoOT is an information-theoretic extension of optimal transport.
It maximizes the mutual information between domains while minimizing geometric distances.
This formulation yields a new projection method that is robust to outliers and generalizes to unseen samples.
arXiv Detail & Related papers (2022-10-06T18:55:41Z)
- Unbalanced CO-Optimal Transport [16.9451175221198]
CO-optimal transport (COOT) takes this comparison further by inferring an alignment between features as well.
We show that it is sensitive to outliers that are omnipresent in real-world data.
This prompts us to propose unbalanced COOT for which we provably show its robustness to noise.
arXiv Detail & Related papers (2022-05-30T08:43:19Z)
- Low-rank Optimal Transport: Approximation, Statistics and Debiasing [51.50788603386766]
Low-rank optimal transport (LOT) approach advocated in Scetbon et al. (2021)
LOT is seen as a legitimate contender to entropic regularization when compared on properties of interest.
We target each of these areas in this paper in order to cement the impact of low-rank approaches in computational OT.
arXiv Detail & Related papers (2022-05-24T20:51:37Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- Do Neural Optimal Transport Solvers Work? A Continuous Wasserstein-2
Benchmark [133.46066694893318]
We evaluate the performance of neural network-based solvers for optimal transport.
We find that existing solvers do not recover optimal transport maps even though they perform well in downstream tasks.
arXiv Detail & Related papers (2021-06-03T15:59:28Z)
- Minibatch optimal transport distances; analysis and applications [9.574645423576932]
Optimal transport distances have become a classic tool to compare probability distributions and have found many applications in machine learning.
A common workaround is to compute these distances on minibatches to average the outcome of several smaller optimal transport problems.
We propose in this paper an extended analysis of this practice, whose effects were previously studied in restricted cases.
arXiv Detail & Related papers (2021-01-05T21:29:31Z)
- Robust Optimal Transport with Applications in Generative Modeling and
Domain Adaptation [120.69747175899421]
Optimal Transport (OT) distances such as Wasserstein have been used in several areas such as GANs and domain adaptation.
We propose a computationally-efficient dual form of the robust OT optimization that is amenable to modern deep learning applications.
Our approach can train state-of-the-art GAN models on noisy datasets corrupted with outlier distributions.
arXiv Detail & Related papers (2020-10-12T17:13:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.