Transductive Learning for Textual Few-Shot Classification in API-based
Embedding Models
- URL: http://arxiv.org/abs/2310.13998v1
- Date: Sat, 21 Oct 2023 12:47:10 GMT
- Title: Transductive Learning for Textual Few-Shot Classification in API-based
Embedding Models
- Authors: Pierre Colombo, Victor Pellegrain, Malik Boudiaf, Victor Storchan,
Myriam Tami, Ismail Ben Ayed, Celine Hudelot, Pablo Piantanida
- Abstract summary: Few-shot classification involves training a model to perform a new classification task with a handful of labeled data.
We introduce a scenario where the embedding of a pre-trained model is served through a gated API with compute-cost and data-privacy constraints.
We propose transductive inference, a learning paradigm that has been overlooked by the NLP community.
- Score: 46.79078308022975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Proprietary and closed APIs are becoming increasingly common for
processing natural language, and they are impacting the practical applications of natural
language processing, including few-shot classification. Few-shot classification
involves training a model to perform a new classification task with a handful
of labeled data. This paper presents three contributions. First, we introduce a
scenario where the embedding of a pre-trained model is served through a gated
API with compute-cost and data-privacy constraints. Second, we propose
transductive inference, a learning paradigm that has been overlooked by the NLP
community. Unlike traditional inductive learning, transductive inference
leverages the statistics of unlabeled data. We also introduce a new
parameter-free transductive regularizer based on the Fisher-Rao loss, which can
be used on top of the gated API embeddings. This method fully utilizes
unlabeled data, shares no labels with the third-party API provider, and
could serve as a baseline for future research. Third, we propose an improved
experimental setting and compile a benchmark of eight datasets involving
multiclass classification in four different languages, with up to 151 classes.
We evaluate our methods using eight backbone models and an episodic
evaluation over 1,000 episodes; the results demonstrate the superiority of
transductive inference over the standard inductive setting.
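The abstract names the ingredients (frozen API embeddings, unlabeled queries, a parameter-free Fisher-Rao regularizer) but not the formula. The PyTorch sketch below is one plausible instantiation under stated assumptions, not the authors' code: a local linear probe is fitted on the frozen embeddings with support-set cross-entropy plus a Fisher-Rao confidence term that pushes each unlabeled query posterior toward its nearest one-hot distribution (for a categorical p, the Fisher-Rao distance to the k-th vertex is 2*arccos(sqrt(p_k))). The probe, hyperparameters, and exact regularizer form are assumptions.

```python
# A minimal sketch, NOT the paper's exact objective: linear probe on frozen
# API embeddings + a Fisher-Rao confidence term on the unlabeled queries.
import torch
import torch.nn.functional as F

def fr_to_nearest_vertex(p, eps=1e-6):
    # Fisher-Rao distance from categorical p to the k-th one-hot vertex is
    # 2*arccos(sqrt(p_k)); taking the min over k rewards confident posteriors.
    d = 2.0 * torch.arccos(p.clamp(eps, 1.0 - eps).sqrt())
    return d.min(dim=-1).values

def transductive_episode(z_s, y_s, z_q, n_classes, steps=200, lr=0.1, lam=1.0):
    # z_s/z_q: frozen support/query embeddings returned by the gated API;
    # only the local probe (W, b) is trained, and no labels leave the client.
    W = torch.zeros(n_classes, z_s.shape[1], requires_grad=True)
    b = torch.zeros(n_classes, requires_grad=True)
    opt = torch.optim.SGD([W, b], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        ce = F.cross_entropy(z_s @ W.T + b, y_s)             # inductive term
        p_q = (z_q @ W.T + b).softmax(dim=-1)                # query posteriors
        loss = ce + lam * fr_to_nearest_vertex(p_q).mean()   # transductive term
        loss.backward()
        opt.step()
    return (z_q @ W.T + b).argmax(dim=-1)

# Toy 5-way episode with random stand-ins for API embeddings.
torch.manual_seed(0)
pred = transductive_episode(torch.randn(5, 64), torch.arange(5),
                            torch.randn(75, 64), n_classes=5)
```

Dropping the `lam` term recovers a standard inductive baseline; the episodic protocol in the abstract repeats such a fit over 1,000 sampled episodes.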
Related papers
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns importance weights to the distantly supervised labels based on the training dynamics of the classifiers.
By assigning importance weights instead of filtering out examples with an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data (see the sketch below).
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
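The weighting function is not specified above; the snippet below sketches one hypothetical reading in which each silver-labeled example is weighted by the peer classifier's current agreement with its label, a crude stand-in for training-dynamics-based importance weights. The function name and weighting rule are assumptions.

```python
import torch
import torch.nn.functional as F

def weighted_distant_loss(logits_a, logits_b, silver_labels):
    # Weight each distantly supervised example by how strongly the peer
    # classifier (b) currently agrees with its silver label, instead of
    # filtering examples with a hard confidence threshold.
    with torch.no_grad():
        w = logits_b.softmax(-1).gather(1, silver_labels[:, None]).squeeze(1)
    ce = F.cross_entropy(logits_a, silver_labels, reduction="none")
    return (w * ce).mean()
```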
- Enhancing Visual Continual Learning with Language-Guided Supervision [76.38481740848434]
Continual learning aims to empower models to learn new tasks without forgetting previously acquired knowledge.
We argue that the scarce semantic information conveyed by one-hot labels hampers effective knowledge transfer across tasks.
Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals (see the sketch below).
arXiv Detail & Related papers (2024-03-24T12:41:58Z)
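As a hedged illustration of frozen semantic targets, this sketch pulls features toward fixed, PLM-generated class vectors with a cosine objective; the loss form and all names are assumptions rather than the paper's method.

```python
import torch
import torch.nn.functional as F

def language_guided_loss(features, class_targets, labels):
    # class_targets: frozen (n_classes, d) matrix of PLM-generated class
    # embeddings; features are trained to align with their class vector.
    t = F.normalize(class_targets[labels], dim=-1)
    f = F.normalize(features, dim=-1)
    return (1.0 - (f * t).sum(-1)).mean()  # mean cosine distance
```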
- Task-Specific Embeddings for Ante-Hoc Explainable Text Classification [6.671252951387647]
We propose an alternative training objective in which we learn task-specific embeddings of text.
The objective learns embeddings such that texts sharing the same target class label end up close together (see the sketch below).
We present extensive experiments which show that the benefits of ante-hoc explainability and incremental learning come at no cost in overall classification accuracy.
arXiv Detail & Related papers (2022-11-30T19:56:25Z)
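A minimal way to encode "same label implies close together" is a class-centroid pull; the sketch below assumes that formulation, which may differ from the paper's actual objective.

```python
import torch

def class_centroid_loss(emb, labels):
    # Mean squared distance of each embedding to its class centroid:
    # texts sharing a label are pulled toward a common point.
    loss = emb.new_zeros(())
    classes = labels.unique()
    for c in classes:
        e = emb[labels == c]
        loss = loss + ((e - e.mean(0, keepdim=True)) ** 2).sum(-1).mean()
    return loss / classes.numel()
```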
- Class-incremental Novel Class Discovery [76.35226130521758]
We study the new task of class-incremental Novel Class Discovery (class-iNCD).
We propose a novel approach for class-iNCD which prevents forgetting of past information about the base classes.
Our experiments, conducted on three common benchmarks, demonstrate that our method significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2022-07-18T13:49:27Z)
- Distant finetuning with discourse relations for stance classification [55.131676584455306]
We propose a new method to extract data with silver labels from raw text to finetune a model for stance classification.
We also propose a 3-stage training framework in which the noise level of the finetuning data decreases over the stages.
Our approach ranks 1st among 26 competing teams in the stance classification track of the NLPCC 2021 shared task Argumentative Text Understanding for AI Debater.
arXiv Detail & Related papers (2022-04-27T04:24:35Z)
- Dominant Set-based Active Learning for Text Classification and its Application to Online Social Media [0.0]
We present a novel pool-based active learning method for training on a large unlabeled corpus with minimum annotation cost.
Our proposed method does not have any parameters to be tuned, making it dataset-independent.
Our method achieves higher performance than state-of-the-art active learning strategies.
arXiv Detail & Related papers (2022-01-28T19:19:03Z)
- Learning to Generate Novel Classes for Deep Metric Learning [24.048915378172012]
We introduce a new data augmentation approach that synthesizes novel classes and their embedding vectors.
We implement this idea by learning and exploiting a conditional generative model which, given a class label and a noise vector, produces a random embedding vector of that class (see the sketch below).
Our proposed generator allows the loss to use richer class relations by augmenting realistic and diverse classes, resulting in better generalization to unseen samples.
arXiv Detail & Related papers (2022-01-04T06:55:19Z)
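The summary pins down the interface, (class label, noise) -> embedding, but not the architecture; the module below is a hypothetical MLP instantiation of that interface.

```python
import torch
import torch.nn as nn

class ClassConditionalGenerator(nn.Module):
    # (class label, noise) -> synthetic embedding vector for that class.
    def __init__(self, n_classes, noise_dim=32, emb_dim=128):
        super().__init__()
        self.label_emb = nn.Embedding(n_classes, emb_dim)
        self.net = nn.Sequential(
            nn.Linear(emb_dim + noise_dim, 256), nn.ReLU(),
            nn.Linear(256, emb_dim),
        )

    def forward(self, labels, noise):
        return self.net(torch.cat([self.label_emb(labels), noise], dim=-1))

# g = ClassConditionalGenerator(n_classes=100)
# fake = g(torch.randint(0, 100, (16,)), torch.randn(16, 32))
```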
- Few-Shot Incremental Learning with Continually Evolved Classifiers [46.278573301326276]
Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points.
The difficulty is that the limited data from new classes not only leads to significant overfitting but also exacerbates the notorious catastrophic forgetting problem.
We propose Continually Evolved Classifiers (CEC), which employ a graph model to propagate context information between classifiers for adaptation (see the sketch below).
arXiv Detail & Related papers (2021-04-07T10:54:51Z)
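The graph model is not specified above; the adapter below shows one generic way to propagate context between classifiers, treating per-class weight vectors as graph nodes updated by a round of self-attention. It is an assumption-laden sketch, not the CEC architecture.

```python
import torch
import torch.nn as nn

class ClassifierGraphAdapter(nn.Module):
    # Treat per-class classifier vectors as graph nodes and let one round
    # of self-attention propagate context between them.
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)

    def forward(self, class_weights):        # (n_classes, dim)
        w = class_weights.unsqueeze(0)
        out, _ = self.attn(w, w, w)
        return (w + out).squeeze(0)          # residual update
```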
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model's generalization ability in few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We achieve new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- Class-Incremental Learning for Semantic Segmentation Re-Using Neither Old Data Nor Old Labels [35.586031601299034]
We present a technique implementing class-incremental learning for semantic segmentation without using the labeled data the model was initially trained on.
We show how to overcome the resulting problems with a novel class-incremental learning technique that requires labels only for the new classes.
We evaluate our method on the Cityscapes dataset, where we exceed the mIoU performance of all baselines by 3.5% absolute.
arXiv Detail & Related papers (2020-05-12T21:03:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.