Sliced Multi-Marginal Optimal Transport
- URL: http://arxiv.org/abs/2102.07115v1
- Date: Sun, 14 Feb 2021 09:58:47 GMT
- Title: Sliced Multi-Marginal Optimal Transport
- Authors: Samuel Cohen, K S Sesh Kumar, Marc Peter Deisenroth
- Abstract summary: We study multi-marginal optimal transport, a generalization of optimal transport that allows us to define discrepancies between multiple measures.
We show that computing the sliced multi-marginal discrepancy is massively scalable for a large number of probability measures with support as large as $10^7$ samples.
- Score: 21.82052188474956
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study multi-marginal optimal transport, a generalization of optimal
transport that allows us to define discrepancies between multiple measures. It
provides a framework to solve multi-task learning problems and to perform
barycentric averaging. However, multi-marginal distances between multiple
measures are typically challenging to compute because they require estimating a
transport plan with $N^P$ variables. In this paper, we address this issue in
the following way: 1) we efficiently solve the one-dimensional multi-marginal
Monge-Wasserstein problem for a classical cost function in closed form, and 2)
we propose a higher-dimensional multi-marginal discrepancy via slicing and
study its generalized metric properties. We show that computing the sliced
multi-marginal discrepancy is massively scalable for a large number of
probability measures with support as large as $10^7$ samples. Our approach can
be applied to solving problems such as barycentric averaging, multi-task
density estimation and multi-task reinforcement learning.
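The abstract's two ingredients, a closed-form 1D multi-marginal solution and random slicing, can be illustrated with a short sketch. In one dimension, the optimal multi-marginal coupling for a quadratic barycentric cost is the comonotone (sorted) one, so the discrepancy reduces to comparing sorted projections against their pointwise mean. The function name and the Monte-Carlo averaging below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sliced_multi_marginal(measures, n_slices=50, seed=0):
    """Monte-Carlo sketch of a sliced multi-marginal discrepancy.

    `measures` is a list of (n, d) sample arrays, all with the same n.
    Assumes the 1D multi-marginal problem with a quadratic barycentric
    cost is solved in closed form by sorting, as the paper describes;
    this is a sketch, not the authors' API.
    """
    rng = np.random.default_rng(seed)
    d = measures[0].shape[1]
    total = 0.0
    for _ in range(n_slices):
        # Draw a random direction on the unit sphere (one "slice").
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)
        # Project each measure onto the slice and sort: in 1D the
        # optimal multi-marginal coupling is the comonotone one.
        S = np.stack([np.sort(X @ theta) for X in measures])  # (P, n)
        bary = S.mean(axis=0)            # quantiles of the 1D barycenter
        total += np.mean((S - bary) ** 2)  # barycentric transport cost
    return total / n_slices
```

Because each slice only requires projecting and sorting, the cost per slice is O(P n log n) for P measures of n samples, which is what makes the approach scale to very large supports.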
Related papers
- Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is a challenging task due to the complexity of the loss landscape and the expensive computational cost.
We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion.
By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
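The core idea of weight-space fusion can be sketched in a few lines: given specialised single-task models with identical architectures, a preference-weighted combination of their parameters yields one model on the trade-off surface. This is a minimal sketch of parameter ensembling; the paper's MoE routing is more involved, and `fuse_weights` is a hypothetical helper, not the authors' code.

```python
import numpy as np

def fuse_weights(task_models, alphas):
    """Preference-weighted fusion of single-task model parameters.

    task_models: list of dicts mapping parameter name -> np.ndarray,
    all with identical keys and shapes. alphas: preference weights
    summing to 1. Illustrative sketch only.
    """
    fused = {}
    for name in task_models[0]:
        # Average each parameter tensor under the preference weights.
        fused[name] = sum(a * m[name] for a, m in zip(alphas, task_models))
    return fused
```

Varying `alphas` traces out different trade-offs between the task objectives without retraining, which is what makes such fusion cheap at inference time.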
arXiv Detail & Related papers (2024-06-14T07:16:18Z) - UCB-driven Utility Function Search for Multi-objective Reinforcement Learning [75.11267478778295]
In Multi-objective Reinforcement Learning (MORL), agents are tasked with optimising their decision-making behaviour across multiple objectives.
We focus on the case of linear utility functions parameterised by weight vectors w.
We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process.
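The selection step described above can be sketched with a generic UCB1 index: each candidate weight vector is treated as an arm, and the next one to train on balances its mean observed return against an exploration bonus. This is a standard UCB1 sketch under that framing, not the paper's exact rule.

```python
import numpy as np

def ucb_select(counts, means, t, c=2.0):
    """UCB1-style choice of the next candidate weight vector.

    counts[i] / means[i] track how often candidate i was tried and its
    mean scalarised return; t is the total number of trials so far.
    Untried candidates are always picked first. Illustrative sketch.
    """
    counts = np.asarray(counts, dtype=float)
    means = np.asarray(means, dtype=float)
    untried = np.where(counts == 0)[0]
    if untried.size:
        return int(untried[0])
    # Exploration bonus shrinks as a candidate accumulates trials.
    bonus = np.sqrt(c * np.log(t) / counts)
    return int(np.argmax(means + bonus))
```

Early in learning the bonus dominates and many weight vectors are sampled; later the index concentrates on the most promising ones.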
arXiv Detail & Related papers (2024-05-01T09:34:42Z) - Estimating Barycenters of Distributions with Neural Optimal Transport [93.28746685008093]
We propose a new scalable approach for solving the Wasserstein barycenter problem.
Our methodology is based on the recent Neural OT solver.
We also establish theoretical error bounds for our proposed approach.
arXiv Detail & Related papers (2024-02-06T09:17:07Z) - Multi-Task Learning for Sparsity Pattern Heterogeneity: Statistical and Computational Perspectives [10.514866749547558]
We consider a problem in Multi-Task Learning (MTL) where multiple linear models are jointly trained on a collection of datasets.
A key novelty of our framework is that it allows the sparsity pattern of regression coefficients and the values of non-zero coefficients to differ across tasks.
Our methods encourage models to share information across tasks by separately encouraging 1) the coefficient supports, and/or 2) the nonzero coefficient values, to be similar.
This allows models to borrow strength during variable selection even when non-zero coefficient values differ across tasks.
arXiv Detail & Related papers (2022-12-16T19:52:25Z) - Multi-task Bias-Variance Trade-off Through Functional Constraints [102.64082402388192]
Multi-task learning aims to acquire a set of functions that perform well for diverse tasks.
In this paper we draw intuition from the two extreme learning scenarios -- a single function for all tasks, and a task-specific function that ignores the other tasks.
We introduce a constrained learning formulation that enforces domain specific solutions to a central function.
arXiv Detail & Related papers (2022-10-27T16:06:47Z) - Approximative Algorithms for Multi-Marginal Optimal Transport and Free-Support Wasserstein Barycenters [0.0]
We present two algorithms to approximate the solution of multi-marginal optimal transport (MOT) with squared Euclidean costs for $N$ discrete probability measures.
They are fast, memory-efficient and easy to implement and can be used with any sparse OT solver as a black box.
arXiv Detail & Related papers (2022-02-02T10:59:54Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman minimization divergence.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - On the complexity of the optimal transport problem with graph-structured cost [9.24979291231758]
Multi-marginal optimal transport (MOT) is a generalization of optimal transport to multiple marginals.
The usage of MOT has been largely impeded by its computational complexity which scales exponentially in the number of marginals.
arXiv Detail & Related papers (2021-10-01T19:29:59Z) - On Multimarginal Partial Optimal Transport: Equivalent Forms and Computational Complexity [11.280177531118206]
We study the multi-marginal partial optimal transport (POT) problem between $m$ discrete (unbalanced) measures with at most $n$ supports.
We first prove that we can obtain two equivalent forms of the multimarginal POT problem in terms of the multimarginal optimal transport problem via novel extensions of the cost tensor.
We demonstrate that the ApproxMPOT algorithm can approximate the optimal value of the multimarginal POT problem with a computational complexity upper bound of the order $\tilde{\mathcal{O}}(m^3(n+1)^m/\varepsilon^2)$.
arXiv Detail & Related papers (2021-08-18T06:46:59Z) - Semi-Discrete Optimal Transport: Hardness, Regularization and Numerical Solution [8.465228064780748]
We prove that computing the Wasserstein distance between a discrete probability measure supported on two points and the Lebesgue measure is already #P-hard.
We introduce a distributionally robust dual optimal transport problem whose objective function is smoothed with the most adverse disturbance distributions.
We show that smoothing the dual objective function is equivalent to regularizing the primal objective function.
arXiv Detail & Related papers (2021-03-10T18:53:59Z) - Small Towers Make Big Differences [59.243296878666285]
Multi-task learning aims at solving multiple machine learning tasks at the same time.
A good solution to a multi-task learning problem should be generalizable in addition to being Pareto optimal.
We propose a method of under-parameterized self-auxiliaries for multi-task models to achieve the best of both worlds.
arXiv Detail & Related papers (2020-08-13T10:45:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.