Efficient Robust Optimal Transport with Application to Multi-Label
Classification
- URL: http://arxiv.org/abs/2010.11852v2
- Date: Thu, 7 Oct 2021 07:04:47 GMT
- Title: Efficient Robust Optimal Transport with Application to Multi-Label
Classification
- Authors: Pratik Jawanpuria, N T V Satyadev, and Bamdev Mishra
- Abstract summary: We model the feature-feature relationship via a symmetric positive semi-definite Mahalanobis metric in the OT cost function.
We view the resulting optimization problem as a non-linear OT problem, which we solve using the Frank-Wolfe algorithm.
Empirical results on the discriminative learning setting, such as tag prediction and multi-class classification, illustrate the good performance of our approach.
- Score: 12.521494095948068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimal transport (OT) is a powerful geometric tool for comparing two
distributions and has been employed in various machine learning applications.
In this work, we propose a novel OT formulation that takes feature correlations
into account while learning the transport plan between two distributions. We
model the feature-feature relationship via a symmetric positive semi-definite
Mahalanobis metric in the OT cost function. For a certain class of regularizers
on the metric, we show that the optimization strategy can be considerably
simplified by exploiting the problem structure. For high-dimensional data, we
additionally propose suitable low-dimensional modeling of the Mahalanobis
metric. Overall, we view the resulting optimization problem as a non-linear OT
problem, which we solve using the Frank-Wolfe algorithm. Empirical results on
the discriminative learning setting, such as tag prediction and multi-class
classification, illustrate the good performance of our approach.
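As a rough illustration of the abstract above, the sketch below runs a Frank-Wolfe loop over the transport polytope for one concrete instance of the formulation: a squared-Frobenius regularizer on the Mahalanobis metric, under which the metric can be eliminated in closed form and the objective becomes a nonlinear (quadratic) function of the transport plan alone. The closed form for `M`, the weight `lam`, and all function names are illustrative assumptions rather than the paper's exact derivation; the linear-minimization oracle is the exact OT solver `ot.emd` from the POT package.

```python
import numpy as np
import ot  # POT: Python Optimal Transport (pip install pot)

def mahalanobis_ot_fw(X, Y, a, b, lam=1.0, iters=50):
    """Frank-Wolfe for a nonlinear OT objective in which the Mahalanobis
    metric has been eliminated in closed form (illustrative assumption:
    squared-Frobenius regularization, giving M(G) = V(G) / (2*lam) with
    V(G) = sum_ij G_ij d_ij d_ij^T and d_ij = x_i - y_j)."""
    D = X[:, None, :] - Y[None, :, :]             # (n, m, d) displacement vectors
    G = np.outer(a, b)                            # feasible start: independent coupling
    for t in range(iters):
        V = np.einsum('ij,ijp,ijq->pq', G, D, D)  # second-order moment of displacements
        M = V / (2.0 * lam)                       # closed-form optimal metric (assumed)
        grad = np.einsum('ijp,pq,ijq->ij', D, M, D)  # grad_ij = d_ij^T M d_ij
        S = ot.emd(a, b, grad)                    # linear-minimization oracle: exact OT
        step = 2.0 / (t + 2.0)                    # standard Frank-Wolfe step size
        G = (1.0 - step) * G + step * S           # convex combination stays feasible
    return G, M

# Tiny usage example on random data
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(5, 3)), rng.normal(size=(6, 3))
a, b = np.full(5, 1 / 5), np.full(6, 1 / 6)
G, M = mahalanobis_ot_fw(X, Y, a, b)
print(G.sum(axis=1), G.sum(axis=0))  # marginals match a and b
```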
Related papers
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, which gives better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
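The summary gives the goal but not the update rule; the following is one plausible reading of a "normalized gradient difference" step for trading off a forget objective against a retain objective. All names here are hypothetical illustrations, not the paper's exact method.

```python
import numpy as np

def ngdiff_direction(g_retain, g_forget, eps=1e-12):
    """Illustrative 'normalized gradient difference': normalize each
    objective's gradient before combining, so neither raw gradient
    magnitude dominates the trade-off (a hypothetical reading, not
    necessarily the paper's exact rule)."""
    g_r = g_retain / (np.linalg.norm(g_retain) + eps)
    g_f = g_forget / (np.linalg.norm(g_forget) + eps)
    return g_r - g_f  # descend on the retain loss, ascend on the forget loss
```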
arXiv Detail & Related papers (2024-10-29T14:41:44Z) - Nonconvex Federated Learning on Compact Smooth Submanifolds With Heterogeneous Data [23.661713049508375]
We propose an algorithm that learns over a compact smooth submanifold in the federated setting with heterogeneous client data.
We show that our proposed algorithm converges sublinearly to a neighborhood of a first-order optimal solution by using a novel analysis.
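For background, optimization over a smooth submanifold is typically carried out with Riemannian gradient steps: project the Euclidean gradient onto the tangent space, take a step, and retract back onto the manifold. A minimal sketch for the unit sphere, used here purely as an example manifold (the paper treats general compact smooth submanifolds in a federated setting):

```python
import numpy as np

def riemannian_step_sphere(x, egrad, lr=0.1):
    """One Riemannian gradient step on the unit sphere S^{d-1}:
    project the Euclidean gradient onto the tangent space at x,
    step, then retract by renormalizing."""
    rgrad = egrad - np.dot(egrad, x) * x  # tangent-space projection
    y = x - lr * rgrad                    # step along the negative Riemannian gradient
    return y / np.linalg.norm(y)          # retraction back onto the sphere
```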
arXiv Detail & Related papers (2024-06-12T17:53:28Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for the distributed federated learning setting.
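For context, conditional stochastic optimization, the problem class this paper brings to federated learning, is usually written with a nested conditional expectation; a standard form (background knowledge, not quoted from the abstract) is

```latex
\min_{x \in \mathbb{R}^d} \; F(x) = \mathbb{E}_{\xi}\Big[ f_{\xi}\big( \mathbb{E}_{\eta \mid \xi}\left[ g_{\eta}(x, \xi) \right] \big) \Big],
```

where the inner expectation sits inside a nonlinear outer function, which is what makes unbiased gradient estimation nontrivial.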
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Unbalanced Optimal Transport meets Sliced-Wasserstein [11.44982599214965]
We propose two new loss functions based on the idea of slicing unbalanced OT, and study their induced topology and statistical properties.
We show that the resulting methodology is modular as it encompasses and extends prior related work.
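To make "slicing" concrete: the balanced sliced-Wasserstein distance averages one-dimensional Wasserstein distances between random projections of the two point clouds, and each 1-D problem reduces to sorting. A minimal balanced sketch follows; the paper's contribution is the unbalanced extension, which additionally relaxes the marginal constraints and is not shown here.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=100, seed=0):
    """Monte-Carlo sliced 2-Wasserstein between equal-size point clouds:
    project onto random unit directions, then use the closed-form 1-D
    optimal transport (sort both projections and match in order)."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=X.shape[1])
        theta /= np.linalg.norm(theta)
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        total += np.mean((px - py) ** 2)
    return np.sqrt(total / n_proj)
```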
arXiv Detail & Related papers (2023-06-12T15:15:00Z) - Multi-Task Learning for Sparsity Pattern Heterogeneity: Statistical and Computational Perspectives [10.514866749547558]
We consider a problem in Multi-Task Learning (MTL) where multiple linear models are jointly trained on a collection of datasets.
A key novelty of our framework is that it allows the sparsity pattern of regression coefficients and the values of non-zero coefficients to differ across tasks.
Our methods encourage models to share information across tasks by separately encouraging 1) coefficient supports, and/or 2) nonzero coefficient values to be similar.
This allows models to borrow strength during variable selection even when non-zero coefficient values differ across tasks.
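One way to write down penalties with the two effects just described, as an illustrative convex sketch rather than the paper's exact (possibly combinatorial) formulation: a group term across tasks couples the supports, while a fusion term couples the nonzero values,

```latex
\min_{\{\beta^{(k)}\}} \sum_{k=1}^{K} \mathcal{L}_k\big(\beta^{(k)}\big)
  + \lambda_1 \sum_{j=1}^{p} \big\| \big(\beta^{(1)}_j, \dots, \beta^{(K)}_j\big) \big\|_2
  + \lambda_2 \sum_{k < k'} \big\| \beta^{(k)} - \beta^{(k')} \big\|_2^2 .
```

The group term pushes feature j to be zero or nonzero jointly across tasks (shared supports), the fusion term pulls nonzero values together, and using only one of the two recovers effect 1) or 2) in isolation.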
arXiv Detail & Related papers (2022-12-16T19:52:25Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for solving many machine learning problems.
We propose a novel stochastic bilevel optimization algorithm, stocBiO, built on an efficient gradient estimator.
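For reference, the bilevel problem and the implicit "hypergradient" that stochastic estimators of this kind approximate (the standard formulation, not specific to this paper):

```latex
\min_{x} \; \Phi(x) = f\big(x, y^{*}(x)\big)
  \quad \text{s.t.} \quad y^{*}(x) = \arg\min_{y} g(x, y),
\qquad
\nabla \Phi(x) = \nabla_x f - \nabla^2_{xy} g \,\big[\nabla^2_{yy} g\big]^{-1} \nabla_y f ,
```

where the hypergradient formula holds when the inner problem is smooth and strongly convex, and all derivatives are evaluated at (x, y*(x)).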
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - Robust Optimal Transport with Applications in Generative Modeling and
Domain Adaptation [120.69747175899421]
Optimal transport (OT) distances such as the Wasserstein distance have been used in several areas, including GANs and domain adaptation.
We propose a computationally-efficient dual form of the robust OT optimization that is amenable to modern deep learning applications.
Our approach can train state-of-the-art GAN models on noisy datasets corrupted with outlier distributions.
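Schematically, robust OT formulations of this kind relax the hard marginal constraints so that outlier mass can be discarded. One common template, given here as a schematic rather than this paper's exact formulation, constrains the plan's marginals to lie in f-divergence balls around the inputs:

```latex
\mathrm{ROT}(\mu, \nu) = \min_{\pi \ge 0} \; \langle C, \pi \rangle
  \quad \text{s.t.} \quad D_f(\pi_1 \,\|\, \mu) \le \rho_1, \quad D_f(\pi_2 \,\|\, \nu) \le \rho_2 ,
```

where π₁ and π₂ denote the two marginals of the plan π.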
arXiv Detail & Related papers (2020-10-12T17:13:40Z) - Feature Robust Optimal Transport for High-dimensional Data [125.04654605998618]
We propose feature-robust optimal transport (FROT) for high-dimensional data, which solves high-dimensional OT problems using feature selection to avoid the curse of dimensionality.
We show that the FROT algorithm achieves state-of-the-art performance in real-world semantic correspondence datasets.
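The feature-selection idea can be stated as a min-max over feature groups: the transport plan minimizes cost while an adversarial weight vector on the probability simplex up-weights the most discriminative groups. A schematic of the objective (the method additionally regularizes the inner maximization in practice):

```latex
\mathrm{FROT}(\mu, \nu) = \min_{\Pi \in \mathcal{U}(a, b)} \; \max_{\alpha \in \Delta_L}
  \sum_{l=1}^{L} \alpha_l \, \langle \Pi, C^{(l)} \rangle ,
```

where C^{(l)} is the ground cost computed on the l-th feature group and U(a, b) is the transport polytope.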
arXiv Detail & Related papers (2020-05-25T14:07:16Z) - Learning Cost Functions for Optimal Transport [44.64193016158591]
Inverse optimal transport (OT) refers to the problem of learning the cost function for OT from an observed transport plan or its samples.
We derive an unconstrained convex optimization formulation of the inverse OT problem, which can be further augmented by any customizable regularization.
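A useful background fact on why inverse OT is solvable only up to a known ambiguity: an entropy-regularized optimal plan factors as π*_ij = u_i exp(-C_ij/ε) v_j, so from an observed plan the cost is recoverable only up to additive row and column shifts (a standard property of entropic OT, not a quote from this paper):

```latex
C_{ij} \;\equiv\; -\varepsilon \log \hat{\pi}_{ij} + a_i + b_j
  \qquad \text{for some } a \in \mathbb{R}^n,\; b \in \mathbb{R}^m .
```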
arXiv Detail & Related papers (2020-02-22T07:27:17Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
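A toy version of the setup: for ridge regression, each worker forms a Newton direction from a row-subsampled Hessian and the master averages the directions. This is a simplified sketch with hypothetical names; the paper's contribution is the unbiased weighting of such averages, which the plain mean below deliberately omits.

```python
import numpy as np

def averaged_subsampled_newton(A, y, lam=0.1, workers=4, rows=64, seed=0):
    """One averaged Newton step for ridge regression taken at w = 0:
    each worker subsamples rows of A to build a cheap local Hessian,
    solves its local Newton system, and the master averages directions.
    (Plain averaging is biased because E[H_k^{-1}] != H^{-1}; correcting
    that bias is the point of the paper, and is omitted here.)"""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    g = -A.T @ y  # gradient of 0.5*||A w - y||^2 + 0.5*lam*||w||^2 at w = 0
    dirs = []
    for _ in range(workers):
        idx = rng.choice(n, size=rows, replace=False)
        Hk = (n / rows) * A[idx].T @ A[idx] + lam * np.eye(d)  # subsampled Hessian
        dirs.append(np.linalg.solve(Hk, g))
    return -np.mean(dirs, axis=0)  # averaged (approximate) Newton step

# Usage: the step from w = 0 approximates the exact ridge solution
A = np.random.default_rng(1).normal(size=(512, 8))
y = A @ np.ones(8)
print(averaged_subsampled_newton(A, y))
```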
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.