Open-Set Likelihood Maximization for Few-Shot Learning
- URL: http://arxiv.org/abs/2301.08390v2
- Date: Fri, 19 May 2023 13:51:47 GMT
- Title: Open-Set Likelihood Maximization for Few-Shot Learning
- Authors: Malik Boudiaf, Etienne Bennequin, Myriam Tami, Antoine Toubhans, Pablo
Piantanida, Céline Hudelot, Ismail Ben Ayed
- Abstract summary: We tackle the Few-Shot Open-Set Recognition (FSOSR) problem, i.e. classifying instances among a set of classes for which we only have a few labeled samples.
We explore the popular transductive setting, which leverages the unlabelled query instances at inference.
Motivated by the observation that existing transductive methods perform poorly in open-set scenarios, we propose a generalization of the maximum likelihood principle.
- Score: 36.97433312193586
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We tackle the Few-Shot Open-Set Recognition (FSOSR) problem, i.e. classifying
instances among a set of classes for which we only have a few labeled samples,
while simultaneously detecting instances that do not belong to any known class.
We explore the popular transductive setting, which leverages the unlabelled
query instances at inference. Motivated by the observation that existing
transductive methods perform poorly in open-set scenarios, we propose a
generalization of the maximum likelihood principle, in which latent scores
down-weighting the influence of potential outliers are introduced alongside the
usual parametric model. Our formulation embeds supervision constraints from the
support set and additional penalties discouraging overconfident predictions on
the query set. We proceed with a block-coordinate descent, with the latent
scores and parametric model co-optimized alternately, thereby benefiting from
each other. We call our resulting formulation Open-Set Likelihood
Optimization (OSLO). OSLO is interpretable and fully modular; it can be
applied on top of any pre-trained model seamlessly. Through extensive
experiments, we show that our method surpasses existing inductive and
transductive methods on both aspects of open-set recognition, namely inlier
classification and outlier detection.
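The abstract describes the alternating scheme only in words, so here is a minimal, hypothetical sketch of such a block-coordinate procedure. It assumes a unit-variance Gaussian (prototype) model per class and a simple sigmoid rule for the latent inlierness scores; the function name `oslo_sketch`, the score update, and the weighting scheme are illustrative assumptions, not the paper's exact objective (which also includes penalties discouraging overconfident query predictions).

```python
import numpy as np

def oslo_sketch(support_x, support_y, query_x, n_classes, n_iters=10, temperature=1.0):
    """Sketch of an OSLO-style alternating optimization: class prototypes theta_k
    and latent inlier scores z_i for query samples are co-optimized by
    block-coordinate descent."""
    # Initialize prototypes from the labelled support set (few-shot supervision).
    theta = np.stack([support_x[support_y == k].mean(axis=0) for k in range(n_classes)])
    z = np.ones(len(query_x))  # latent inlierness score per query sample, in [0, 1]

    for _ in range(n_iters):
        # Soft class assignments for queries under a unit-variance Gaussian
        # (i.e. squared-distance) model.
        d2 = ((query_x[:, None, :] - theta[None, :, :]) ** 2).sum(-1)  # (Q, K)
        logits = -d2 / temperature
        p = np.exp(logits - logits.max(1, keepdims=True))
        p /= p.sum(1, keepdims=True)

        # Update latent scores: queries that fit no class well get down-weighted.
        # (Hypothetical rule: sigmoid of best class score relative to its mean.)
        best = logits.max(1)
        z = 1.0 / (1.0 + np.exp(-(best - best.mean())))

        # Re-estimate prototypes from the support set (weight 1) plus queries
        # weighted by z_i * p_ik, so likely outliers barely influence the model.
        for k in range(n_classes):
            w_q = z * p[:, k]
            num = support_x[support_y == k].sum(0) + (w_q[:, None] * query_x).sum(0)
            den = (support_y == k).sum() + w_q.sum()
            theta[k] = num / den

    preds = p.argmax(1)        # inlier classification
    outlier_score = 1.0 - z    # higher means more likely an open-set outlier
    return preds, outlier_score
```

The structural point mirrors the abstract: the latent scores and the parametric model are updated alternately, so suspected outliers contribute little to the model that in turn re-scores them.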
Related papers
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization [7.378582040635655]
Current deep learning approaches rely on generative models that yield exact sample likelihoods.
This work introduces a method that lifts this restriction and opens the possibility to employ highly expressive latent variable models.
We experimentally validate our approach in data-free Combinatorial Optimization and demonstrate that our method achieves a new state-of-the-art on a wide range of benchmark problems.
arXiv Detail & Related papers (2024-06-03T17:55:02Z)
- Hypothesis Testing for Class-Conditional Noise Using Local Maximum Likelihood [1.8798171797988192]
In supervised learning, automatically assessing the quality of the labels before any learning takes place remains an open research question.
In this paper we show how similar procedures can be followed when the underlying model is a product of Local Maximum Likelihood Estimation.
This different view allows for wider applicability of the tests by offering users access to a richer model class.
arXiv Detail & Related papers (2023-12-15T22:14:58Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Model-Agnostic Few-Shot Open-Set Recognition [36.97433312193586]
We tackle the Few-Shot Open-Set Recognition (FSOSR) problem.
We focus on developing model-agnostic inference methods that can be plugged into any existing model.
We introduce an Open-Set Transductive Information Maximization method, OSTIM.
arXiv Detail & Related papers (2022-06-18T16:27:59Z)
- Resolving label uncertainty with implicit posterior models [71.62113762278963]
We propose a method for jointly inferring labels across a collection of data samples.
By implicitly assuming the existence of a generative model for which a differentiable predictor is the posterior, we derive a training objective that allows learning under weak beliefs.
arXiv Detail & Related papers (2022-02-28T18:09:44Z)
- A Low Rank Promoting Prior for Unsupervised Contrastive Learning [108.91406719395417]
We construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning.
Our hypothesis explicitly requires that all the samples belonging to the same instance class lie on the same low-dimensional subspace.
Empirical evidence shows that the proposed algorithm clearly surpasses the state-of-the-art approaches on multiple benchmarks.
arXiv Detail & Related papers (2021-08-05T15:58:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.