Optimal transport framework for efficient prototype selection
- URL: http://arxiv.org/abs/2103.10159v1
- Date: Thu, 18 Mar 2021 10:50:14 GMT
- Title: Optimal transport framework for efficient prototype selection
- Authors: Karthik S. Gurumoorthy and Pratik Jawanpuria and Bamdev Mishra
- Abstract summary: We develop an optimal transport (OT) based framework to select informative examples that best represent a given target dataset.
We show that our objective function enjoys a key property of submodularity and propose a parallelizable greedy method that is computationally fast and possesses deterministic approximation guarantees.
- Score: 21.620708125860066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Summarizing data via representative examples is an important problem in
several machine learning applications where human understanding of the learning
models and underlying data distribution is essential for decision making. In
this work, we develop an optimal transport (OT) based framework to select
informative prototypical examples that best represent a given target dataset.
We model the prototype selection problem as learning a sparse (empirical)
probability distribution having minimum OT distance from the target
distribution. The learned probability measure supported on the chosen
prototypes directly corresponds to their importance in representing and
summarizing the target data. We show that our objective function enjoys a key
property of submodularity and propose a parallelizable greedy method that is
computationally fast and possesses deterministic approximation guarantees.
Empirical results on several real world benchmarks illustrate the efficacy of
our approach.
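The greedy selection described in the abstract can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: it assumes that, with freely chosen prototype weights, the OT objective reduces to a facility-location-style set function (each target point is served by its nearest chosen prototype), which is monotone submodular and hence amenable to the greedy method with approximation guarantees. The function name and cost-matrix convention are ours.

```python
import numpy as np

def greedy_prototypes(C, k):
    """Greedily select k prototypes for a target dataset.

    C[i, j] is the ground cost between target point i and candidate
    prototype j. The marginal gain of adding a candidate is the total
    reduction in each target point's cost to its nearest chosen
    prototype -- a facility-location-style submodular gain, so the
    greedy loop admits lazy and parallel variants.
    """
    n, m = C.shape
    chosen = []
    # current cost of each target point to its nearest chosen prototype
    nearest = np.full(n, C.max() + 1.0)
    for _ in range(k):
        gains = np.maximum(nearest[:, None] - C, 0.0).sum(axis=0)
        gains[chosen] = -np.inf          # never re-pick a prototype
        j = int(np.argmax(gains))
        chosen.append(j)
        nearest = np.minimum(nearest, C[:, j])
    # learned sparse distribution: fraction of target mass per prototype
    assign = C[:, chosen].argmin(axis=1)
    weights = np.bincount(assign, minlength=k) / n
    return chosen, weights
```

The returned weights correspond to the learned probability measure on the prototypes: prototypes serving more of the target data receive proportionally more mass.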
Related papers
- Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL)
We first prove that the gradient of the synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to the randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
- Evaluating Representations with Readout Model Switching [19.907607374144167]
In this paper, we propose to use the Minimum Description Length (MDL) principle to devise an evaluation metric.
We design a hybrid discrete and continuous-valued model space for the readout models and employ a switching strategy to combine their predictions.
The proposed metric can be efficiently computed with an online method and we present results for pre-trained vision encoders of various architectures.
arXiv Detail & Related papers (2023-02-19T14:08:01Z)
- Predicting Out-of-Distribution Error with Confidence Optimal Transport [17.564313038169434]
We present a simple yet effective method to predict a model's performance on an unknown distribution without any additional annotation.
We show that our method, Confidence Optimal Transport (COT), provides robust estimates of a model's performance on a target domain.
Despite its simplicity, our method achieves state-of-the-art results on three benchmark datasets and outperforms existing methods by a large margin.
arXiv Detail & Related papers (2023-02-10T02:27:13Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold.
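The ATC idea can be sketched in a few lines. This is an illustrative simplification, assuming max-softmax confidence as the score (the paper also considers other scores such as negative entropy); the threshold is chosen on held-out source data so that the fraction of examples above it matches source accuracy.

```python
import numpy as np

def atc_predict(src_conf, src_correct, tgt_conf):
    """Average Thresholded Confidence (sketch).

    Pick a threshold t on source confidences so that the fraction of
    source examples with confidence above t equals source accuracy,
    then predict target accuracy as the fraction of unlabeled target
    examples whose confidence exceeds t.
    """
    src_acc = src_correct.mean()
    # quantile such that roughly P(conf > t) = src_acc on source data
    t = np.quantile(src_conf, 1.0 - src_acc)
    return (tgt_conf > t).mean()
```

No target labels are needed: only the model's confidences on the unlabeled target set enter the estimate.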
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Multi-Task Learning on Networks [0.0]
Multi-objective optimization problems arising in the multi-task learning context have specific features and require ad hoc methods.
In this thesis the solutions in the Input Space are represented as probability distributions encapsulating the knowledge contained in the function evaluations.
In this space of probability distributions, endowed with the metric given by the Wasserstein distance, a new algorithm, MOEA/WST, can be designed in which the model is built not directly on the objective function.
arXiv Detail & Related papers (2021-12-07T09:13:10Z)
- Deep Learning with Multiple Data Set: A Weighted Goal Programming Approach [2.7393821783237184]
Large-scale data analysis is growing at an exponential rate as data proliferates in our societies.
Deep Learning models require plenty of resources, and distributed training is needed.
This paper presents a Multicriteria approach for distributed learning.
arXiv Detail & Related papers (2021-11-27T07:10:25Z)
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
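The two-step recipe (first fit a density model, then sample from it) can be illustrated with a deliberately simple stand-in. Here a Gaussian kernel density estimate replaces the paper's PSD models, so this shows only the recipe, not the paper's method or its approximation guarantees; the function name and parameters are ours.

```python
import numpy as np

def fit_then_sample(data, n_samples, bandwidth=0.3, seed=0):
    """Two-step sampling: (1) model the density, (2) sample the model.

    A Gaussian KDE stands in for the density model. Sampling from a
    KDE amounts to picking a random kernel center from the data and
    adding Gaussian noise with the kernel bandwidth.
    """
    rng = np.random.default_rng(seed)
    centers = data[rng.integers(len(data), size=n_samples)]
    return centers + rng.normal(scale=bandwidth, size=n_samples)
```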
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Conservative Objective Models for Effective Offline Model-Based Optimization [78.19085445065845]
Computational design problems arise in a number of settings, from synthetic biology to computer architectures.
We propose a method that learns a model of the objective function that lower bounds the actual value of the ground-truth objective on out-of-distribution inputs.
COMs are simple to implement and outperform a number of existing methods on a wide range of MBO problems.
arXiv Detail & Related papers (2021-07-14T17:55:28Z)
- Attentional Prototype Inference for Few-Shot Segmentation [128.45753577331422]
We propose attentional prototype inference (API), a probabilistic latent variable framework for few-shot segmentation.
We define a global latent variable to represent the prototype of each object category, which we model as a probabilistic distribution.
We conduct extensive experiments on four benchmarks, where our proposal obtains at least competitive and often better performance than state-of-the-art prototype-based methods.
arXiv Detail & Related papers (2021-05-14T06:58:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.