Few-Shot Learning with Siamese Networks and Label Tuning
- URL: http://arxiv.org/abs/2203.14655v1
- Date: Mon, 28 Mar 2022 11:16:46 GMT
- Title: Few-Shot Learning with Siamese Networks and Label Tuning
- Authors: Thomas Müller, Guillermo Pérez-Torró and Marc Franco-Salvador
- Abstract summary: We show that with proper pre-training, Siamese Networks that embed texts and labels offer a competitive alternative.
We introduce label tuning, a simple and computationally efficient approach that allows adapting the models in a few-shot setup by only changing the label embeddings.
- Score: 5.006086647446482
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of building text classifiers with little or no training
data, commonly known as zero and few-shot text classification. In recent years,
an approach based on neural textual entailment models has been found to give
strong results on a diverse range of tasks. In this work, we show that with
proper pre-training, Siamese Networks that embed texts and labels offer a
competitive alternative. These models allow for a large reduction in inference
cost: constant in the number of labels rather than linear. Furthermore, we
introduce label tuning, a simple and computationally efficient approach that
allows adapting the models in a few-shot setup by only changing the label
embeddings. While giving lower performance than model fine-tuning, this
approach has the architectural advantage that a single encoder can be shared by
many different tasks.
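To make the bi-encoder setup and label tuning concrete, here is a minimal sketch in PyTorch. The `encode` function is a toy stand-in (a hashed bag-of-words projection) for whatever pre-trained sentence encoder is used in practice, and the label set, few-shot examples, and hyperparameters are illustrative assumptions rather than the authors' implementation; the point is that labels are embedded once and only those embeddings are updated during tuning, while the shared encoder stays frozen.
```python
# Minimal sketch of zero-shot classification with a Siamese (bi-encoder) model,
# plus label tuning. `encode` is a toy stand-in for a pre-trained sentence
# encoder so the snippet runs on its own; it is NOT the model from the paper.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
DIM = 64

def encode(texts):
    # Placeholder encoder: hashed bag-of-words, L2-normalized.
    vecs = torch.zeros(len(texts), DIM)
    for i, t in enumerate(texts):
        for tok in t.lower().split():
            vecs[i, hash(tok) % DIM] += 1.0
    return F.normalize(vecs, dim=-1)

labels = ["sports", "politics", "technology"]   # hypothetical label set
label_emb = encode(labels)                      # encoded once and cached

def predict(texts, label_matrix):
    # Inference: one encoder call per text plus a dot product with the cached
    # label embeddings, i.e. encoder cost does not grow with the label count.
    return (encode(texts) @ label_matrix.T).argmax(dim=-1)

print(predict(["the team won the cup final"], label_emb))        # zero-shot

# Label tuning: adapt only the label embeddings on a handful of examples;
# the text encoder stays frozen and can be shared across tasks.
few_shot_texts = ["parliament passed the new bill", "the team won the cup final"]
few_shot_targets = torch.tensor([1, 0])

tuned = label_emb.clone().requires_grad_(True)  # the only trainable parameters
opt = torch.optim.Adam([tuned], lr=0.1)
for _ in range(50):
    opt.zero_grad()
    loss = F.cross_entropy(encode(few_shot_texts) @ tuned.T, few_shot_targets)
    loss.backward()
    opt.step()

print(predict(["the senate debated the budget"], tuned.detach()))  # few-shot tuned
```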
Related papers
- Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn from massive data to model unified representations of images and natural language.
In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for applying pre-trained models, and experiment on image classification tasks.
arXiv Detail & Related papers (2024-05-23T17:17:27Z) - Neural Networks Against (and For) Self-Training: Classification with
Small Labeled and Large Unlabeled Sets [11.385682758047775]
One of the weaknesses of self-training is the semantic drift problem.
We reshape the role of pseudo-labels and create a hierarchical order of information.
A crucial step in self-training is to use confidence predictions to select the best candidate pseudo-labels.
arXiv Detail & Related papers (2023-12-31T19:25:34Z) - Unlocking the Transferability of Tokens in Deep Models for Tabular Data [67.11727608815636]
Fine-tuning a pre-trained deep neural network has become a successful paradigm in various machine learning tasks.
In this paper, we propose TabToken, a method aimed at enhancing the quality of feature tokens.
We introduce a contrastive objective that regularizes the tokens, capturing the semantics within and across features.
arXiv Detail & Related papers (2023-10-23T17:53:09Z) - Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot
Text Classification Tasks [75.42002070547267]
We propose a self-evolution learning (SE) based mixup approach for data augmentation in text classification.
We introduce a novel instance-specific label smoothing approach, which linearly interpolates the model's output and the one-hot labels of the original samples to generate new soft labels for mixup.
arXiv Detail & Related papers (2023-05-22T23:43:23Z) - Learning Context-aware Classifier for Semantic Segmentation [88.88198210948426]
In this paper, contextual hints are exploited via learning a context-aware classifier.
Our method is model-agnostic and can be easily applied to generic segmentation models.
With only negligible additional parameters and about +2% inference time, a decent performance gain is achieved on both small and large models.
arXiv Detail & Related papers (2023-03-21T07:00:35Z) - An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning [58.59343434538218]
We propose a simple but quite effective approach to predict accurate negative pseudo-labels of unlabeled data from an indirect learning perspective.
Our approach can be implemented in just a few lines of code using only off-the-shelf operations.
arXiv Detail & Related papers (2022-09-28T02:11:34Z) - Improving Model Training via Self-learned Label Representations [5.969349640156469]
We show that more sophisticated label representations are better for classification than the usual one-hot encoding.
We propose the Learning with Adaptive Labels (LwAL) algorithm, which simultaneously learns the label representation while training for the classification task.
Our algorithm introduces negligible additional parameters and has a minimal computational overhead.
arXiv Detail & Related papers (2022-09-09T21:10:43Z) - A Simple Yet Effective Pretraining Strategy for Graph Few-shot Learning [38.66690010054665]
We propose a simple transductive fine-tuning based framework as a new paradigm for graph few-shot learning.
For pretraining, we propose a supervised contrastive learning framework with data augmentation strategies specific for few-shot node classification.
arXiv Detail & Related papers (2022-03-29T22:30:00Z) - Towards Good Practices for Efficiently Annotating Large-Scale Image
Classification Datasets [90.61266099147053]
We investigate efficient annotation strategies for collecting multi-class classification labels for a large collection of images.
We propose modifications and best practices aimed at minimizing human labeling effort.
Simulated experiments on a 125k-image subset of ImageNet100 show that it can be annotated to 80% top-1 accuracy with 0.35 annotations per image on average.
arXiv Detail & Related papers (2021-04-26T16:29:32Z) - Learning from Noisy Labels for Entity-Centric Information Extraction [17.50856935207308]
We propose a simple co-regularization framework for entity-centric information extraction.
These models are jointly optimized with a task-specific loss and regularized to generate similar predictions.
In the end, we can take any of the trained models for inference.
arXiv Detail & Related papers (2021-04-17T22:49:12Z) - Noisy Labels Can Induce Good Representations [53.47668632785373]
We study how architecture affects learning with noisy labels.
We show that training with noisy labels can induce useful hidden representations, even when the model generalizes poorly.
This finding leads to a simple method to improve models trained on noisy labels.
arXiv Detail & Related papers (2020-12-23T18:58:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.