Few-Shot Learning with Siamese Networks and Label Tuning
- URL: http://arxiv.org/abs/2203.14655v1
- Date: Mon, 28 Mar 2022 11:16:46 GMT
- Title: Few-Shot Learning with Siamese Networks and Label Tuning
- Authors: Thomas Müller, Guillermo Pérez-Torró and Marc Franco-Salvador
- Abstract summary: We show that with proper pre-training, Siamese Networks that embed texts and labels offer a competitive alternative.
We introduce label tuning, a simple and computationally efficient approach that allows adapting the models in a few-shot setup by only changing the label embeddings.
- Score: 5.006086647446482
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of building text classifiers with little or no training
data, commonly known as zero and few-shot text classification. In recent years,
an approach based on neural textual entailment models has been found to give
strong results on a diverse range of tasks. In this work, we show that with
proper pre-training, Siamese Networks that embed texts and labels offer a
competitive alternative. These models allow for a large reduction in inference
cost: constant in the number of labels rather than linear. Furthermore, we
introduce label tuning, a simple and computationally efficient approach that
allows adapting the models in a few-shot setup by only changing the label
embeddings. While giving lower performance than model fine-tuning, this
approach has the architectural advantage that a single encoder can be shared by
many different tasks.
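To make the bi-encoder setup and label tuning concrete, here is a minimal sketch in PyTorch. The `encode` function is a toy stand-in (a hashed bag-of-words projection) for whatever pre-trained sentence encoder is used in practice, and the label set, few-shot examples, and hyperparameters are illustrative assumptions rather than the authors' implementation; the point is that labels are embedded once and only those embeddings are updated during tuning, while the shared encoder stays frozen.
```python
# Minimal sketch of zero-shot classification with a Siamese (bi-encoder) model,
# plus label tuning. `encode` is a toy stand-in for a pre-trained sentence
# encoder so the snippet runs on its own; it is NOT the model from the paper.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
DIM = 64

def encode(texts):
    # Placeholder encoder: hashed bag-of-words, L2-normalized.
    vecs = torch.zeros(len(texts), DIM)
    for i, t in enumerate(texts):
        for tok in t.lower().split():
            vecs[i, hash(tok) % DIM] += 1.0
    return F.normalize(vecs, dim=-1)

labels = ["sports", "politics", "technology"]   # hypothetical label set
label_emb = encode(labels)                      # encoded once and cached

def predict(texts, label_matrix):
    # Inference: one encoder call per text plus a dot product with the cached
    # label embeddings, i.e. encoder cost does not grow with the label count.
    return (encode(texts) @ label_matrix.T).argmax(dim=-1)

print(predict(["the team won the cup final"], label_emb))        # zero-shot

# Label tuning: adapt only the label embeddings on a handful of examples;
# the text encoder stays frozen and can be shared across tasks.
few_shot_texts = ["parliament passed the new bill", "the team won the cup final"]
few_shot_targets = torch.tensor([1, 0])

tuned = label_emb.clone().requires_grad_(True)  # the only trainable parameters
opt = torch.optim.Adam([tuned], lr=0.1)
for _ in range(50):
    opt.zero_grad()
    loss = F.cross_entropy(encode(few_shot_texts) @ tuned.T, few_shot_targets)
    loss.backward()
    opt.step()

print(predict(["the senate debated the budget"], tuned.detach()))  # few-shot tuned
```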
Related papers
- Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn from massive data to model unified representations of images and natural language.
In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for applying pre-trained models, and experiment on image classification tasks.
arXiv Detail & Related papers (2024-05-23T17:17:27Z) - Neural Networks Against (and For) Self-Training: Classification with
Small Labeled and Large Unlabeled Sets [11.385682758047775]
One of the weaknesses of self-training is the semantic drift problem.
We reshape the role of pseudo-labels and create a hierarchical order of information.
A crucial step in self-training is to use confidence predictions to select the best candidate pseudo-labels.
arXiv Detail & Related papers (2023-12-31T19:25:34Z) - Unlocking the Transferability of Tokens in Deep Models for Tabular Data [67.11727608815636]
Fine-tuning a pre-trained deep neural network has become a successful paradigm in various machine learning tasks.
In this paper, we propose TabToken, a method aimed at enhancing the quality of feature tokens.
We introduce a contrastive objective that regularizes the tokens, capturing the semantics within and across features.
arXiv Detail & Related papers (2023-10-23T17:53:09Z) - Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot
Text Classification Tasks [75.42002070547267]
We propose a self-evolution learning (SE) based mixup approach for data augmentation in text classification.
We introduce a novel instance-specific label smoothing approach, which linearly interpolates the model's output and the one-hot labels of the original samples to generate new soft labels for mixup.
arXiv Detail & Related papers (2023-05-22T23:43:23Z) - Learning Context-aware Classifier for Semantic Segmentation [88.88198210948426]
In this paper, contextual hints are exploited via learning a context-aware classifier.
Our method is model-agnostic and can be easily applied to generic segmentation models.
With only negligible additional parameters and about +2% inference time, a decent performance gain is achieved on both small and large models.
arXiv Detail & Related papers (2023-03-21T07:00:35Z) - An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning [58.59343434538218]
We propose a simple but quite effective approach to predict accurate negative pseudo-labels of unlabeled data from an indirect learning perspective.
Our approach can be implemented in just a few lines of code using only off-the-shelf operations.
arXiv Detail & Related papers (2022-09-28T02:11:34Z) - Improving Model Training via Self-learned Label Representations [5.969349640156469]
We show that more sophisticated label representations are better for classification than the usual one-hot encoding.
We propose the Learning with Adaptive Labels (LwAL) algorithm, which simultaneously learns the label representation while training for the classification task.
Our algorithm introduces negligible additional parameters and has a minimal computational overhead.
arXiv Detail & Related papers (2022-09-09T21:10:43Z) - A Simple Yet Effective Pretraining Strategy for Graph Few-shot Learning [38.66690010054665]
We propose a simple transductive fine-tuning based framework as a new paradigm for graph few-shot learning.
For pretraining, we propose a supervised contrastive learning framework with data augmentation strategies specific for few-shot node classification.
arXiv Detail & Related papers (2022-03-29T22:30:00Z) - Towards Good Practices for Efficiently Annotating Large-Scale Image
Classification Datasets [90.61266099147053]
We investigate efficient annotation strategies for collecting multi-class classification labels for a large collection of images.
We propose modifications and best practices aimed at minimizing human labeling effort.
Simulated experiments on a 125k-image subset of ImageNet100 show that it can be annotated to 80% top-1 accuracy with 0.35 annotations per image on average.
arXiv Detail & Related papers (2021-04-26T16:29:32Z) - Learning from Noisy Labels for Entity-Centric Information Extraction [17.50856935207308]
We propose a simple co-regularization framework for entity-centric information extraction.
These models are jointly optimized with a task-specific loss and regularized to generate similar predictions.
In the end, we can take any of the trained models for inference.
arXiv Detail & Related papers (2021-04-17T22:49:12Z) - Noisy Labels Can Induce Good Representations [53.47668632785373]
We study how architecture affects learning with noisy labels.
We show that training with noisy labels can induce useful hidden representations, even when the model generalizes poorly.
This finding leads to a simple method to improve models trained on noisy labels.
arXiv Detail & Related papers (2020-12-23T18:58:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.