Towards Practical Few-Shot Query Sets: Transductive Minimum Description
Length Inference
- URL: http://arxiv.org/abs/2210.14545v1
- Date: Wed, 26 Oct 2022 08:06:57 GMT
- Title: Towards Practical Few-Shot Query Sets: Transductive Minimum Description
Length Inference
- Authors: S\'egol\`ene Martin (OPIS, CVN), Malik Boudiaf (ETS), Emilie
Chouzenoux (OPIS, CVN), Jean-Christophe Pesquet (OPIS, CVN), Ismail Ben Ayed
(ETS)
- Abstract summary: We introduce a PrimAl Dual Minimum Description LEngth (PADDLE) formulation, which balances data-fitting accuracy and model complexity for a given few-shot task.
Our constrained MDL-like objective promotes competition among a large set of possible classes, preserving only the effective classes that best fit the data of a few-shot task.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Standard few-shot benchmarks are often built upon simplifying assumptions on
the query sets, which may not always hold in practice. In particular, for each
task at testing time, the classes effectively present in the unlabeled query
set are known a priori, and correspond exactly to the set of classes
represented in the labeled support set. We relax these assumptions and extend
current benchmarks, so that the query-set classes of a given task are unknown,
but just belong to a much larger set of possible classes. Our setting could be
viewed as an instance of the challenging yet practical problem of extremely
imbalanced K-way classification, K being much larger than the values typically
used in standard benchmarks, and with potentially irrelevant supervision from
the support set. As expected, our setting incurs performance drops for
state-of-the-art methods. Motivated by these observations, we introduce a
PrimAl Dual Minimum Description LEngth (PADDLE) formulation, which balances
data-fitting accuracy and model complexity for a given few-shot task, under
supervision constraints from the support set. Our constrained MDL-like
objective promotes competition among a large set of possible classes,
preserving only the effective classes that best fit the data of a few-shot
task. It is hyperparameter-free and can be applied on top of any base-class
training. Furthermore, we derive a fast block coordinate descent algorithm for
optimizing our objective, with a convergence guarantee and a linear
computational complexity at each iteration. Comprehensive experiments over the
standard few-shot datasets and the more realistic and challenging i-Nat dataset
show that our method is highly competitive, all the more so as the number of
possible classes in the tasks increases. Our code is publicly available at
https://github.com/SegoleneMartin/PADDLE.
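The following is a minimal, hypothetical Python sketch of the kind of procedure the abstract describes: a transductive block-coordinate-descent loop that alternates between soft query assignments, estimated class proportions, and class prototypes, where a log-proportion term plays the role of the complexity penalty that suppresses classes the query data does not support. The function name, the exact update rules, and the penalty parameterization are illustrative assumptions, not the published PADDLE algorithm; see the repository above for the actual method. In this simplified form, each iteration is linear in the number of query samples.

# Hypothetical sketch only: NOT the exact PADDLE formulation.
import numpy as np

def transductive_mdl_sketch(support_feats, support_labels, query_feats,
                            num_classes, n_iters=50, penalty=1.0):
    # support_feats: (Ns, d), support_labels: (Ns,) integer labels in [0, num_classes),
    # query_feats: (Nq, d). Returns soft query assignments of shape (Nq, num_classes).
    d = query_feats.shape[1]

    # Per-class sums and counts from the labeled support set
    # (a class unseen in the support set gets a zero prototype).
    support_sums = np.zeros((num_classes, d))
    counts = np.zeros(num_classes)
    for x, y in zip(support_feats, support_labels):
        support_sums[y] += x
        counts[y] += 1
    prototypes = support_sums / np.maximum(counts, 1)[:, None]

    # Class proportions start uniform; the log-proportion term below
    # drives the proportions of "ineffective" classes toward zero.
    proportions = np.full(num_classes, 1.0 / num_classes)
    assignments = np.full((query_feats.shape[0], num_classes), 1.0 / num_classes)

    for _ in range(n_iters):
        # (1) Assignment step: softmax over negative squared distances,
        #     biased by the current class proportions (complexity-like term).
        dists = ((query_feats[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
        logits = -0.5 * dists + penalty * np.log(proportions + 1e-12)
        logits -= logits.max(axis=1, keepdims=True)
        assignments = np.exp(logits)
        assignments /= assignments.sum(axis=1, keepdims=True)

        # (2) Proportion step: class marginals estimated from the query assignments.
        proportions = assignments.mean(axis=0)

        # (3) Prototype step: weighted mean of support and softly assigned query features.
        for k in range(num_classes):
            num = support_sums[k] + assignments[:, k] @ query_feats
            den = counts[k] + assignments[:, k].sum()
            if den > 0:
                prototypes[k] = num / den

    return assignments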
Related papers
- Mitigating Word Bias in Zero-shot Prompt-based Classifiers [55.60306377044225]
We show that matching class priors correlates strongly with the oracle upper bound performance.
We also demonstrate large consistent performance gains for prompt settings over a range of NLP tasks.
arXiv Detail & Related papers (2023-09-10T10:57:41Z) - Self-regulating Prompts: Foundational Model Adaptation without
Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
arXiv Detail & Related papers (2023-07-13T17:59:35Z) - A Simple and Effective Framework for Strict Zero-Shot Hierarchical
Classification [23.109264015761873]
Large language models (LLMs) have achieved strong performance on benchmark tasks, especially in zero or few-shot settings.
We propose a more indicative long-tail prediction task for hierarchical datasets.
Our method does not require any model updates, a resource-intensive process, and achieves strong performance across multiple datasets.
arXiv Detail & Related papers (2023-05-24T16:04:26Z) - Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery [76.63807209414789]
We challenge the status quo in class-iNCD and propose a learning paradigm where class discovery occurs continuously and in a truly unsupervised manner.
We propose simple baselines, composed of a frozen PTM backbone and a learnable linear classifier, that are not only simple to implement but also resilient under longer learning scenarios.
arXiv Detail & Related papers (2023-03-28T13:47:16Z) - Complementary Labels Learning with Augmented Classes [22.460256396941528]
Complementary Labels Learning (CLL) arises in many real-world tasks such as private questions classification and online learning.
We propose a novel problem setting called Complementary Labels Learning with Augmented Classes (CLLAC)
By using unlabeled data, we propose an unbiased estimator of the classification risk for CLLAC, which is provably consistent.
arXiv Detail & Related papers (2022-11-19T13:55:27Z) - Transductive Few-Shot Learning: Clustering is All You Need? [31.21306826132773]
We investigate a general formulation for transductive few-shot learning, which integrates prototype-based objectives.
We find that our method yields competitive performances, in terms of accuracy and optimization, while scaling up to large problems.
Surprisingly, we find that our general model already achieves competitive performance in comparison to state-of-the-art methods.
arXiv Detail & Related papers (2021-06-16T16:14:01Z) - SetConv: A New Approach for Learning from Imbalanced Data [29.366843553056594]
We propose a set convolution operation and an episodic training strategy to extract a single representative for each class.
We prove that our proposed algorithm is permutation-invariant, i.e., insensitive to the order of its inputs.
arXiv Detail & Related papers (2021-04-03T22:33:30Z) - Revisiting Deep Local Descriptor for Improved Few-Shot Classification [56.74552164206737]
We show how one can improve the quality of embeddings by leveraging Dense Classification and Attentive Pooling.
We suggest pooling feature maps with attentive pooling, instead of the widely used global average pooling (GAP), to prepare embeddings for few-shot classification (see the attentive-pooling sketch after this list).
arXiv Detail & Related papers (2021-03-30T00:48:28Z) - Multitask Learning for Class-Imbalanced Discourse Classification [74.41900374452472]
We show that a multitask approach can improve the Micro F1-score by 7% over current state-of-the-art benchmarks.
We also offer a comparative review of additional techniques proposed to address resource-poor problems in NLP.
arXiv Detail & Related papers (2021-01-02T07:13:41Z) - Laplacian Regularized Few-Shot Learning [35.381119443377195]
We propose a transductive Laplacian-regularized inference for few-shot tasks.
Our inference does not re-train the base model, and can be viewed as a graph clustering of the query set (see the Laplacian-smoothing sketch after this list).
Our LaplacianShot consistently outperforms state-of-the-art methods by significant margins across different models.
arXiv Detail & Related papers (2020-06-28T02:17:52Z) - Pre-training Is (Almost) All You Need: An Application to Commonsense
Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)