Zero-Shot Text Classification with Self-Training
- URL: http://arxiv.org/abs/2210.17541v1
- Date: Mon, 31 Oct 2022 17:55:00 GMT
- Title: Zero-Shot Text Classification with Self-Training
- Authors: Ariel Gera, Alon Halfon, Eyal Shnarch, Yotam Perlitz, Liat Ein-Dor,
Noam Slonim
- Abstract summary: We show that fine-tuning the zero-shot classifier on its most confident predictions leads to significant performance gains across a wide range of text classification tasks.
Self-training adapts the zero-shot model to the task at hand.
- Score: 8.68603153534916
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in large pretrained language models have increased attention
to zero-shot text classification. In particular, models finetuned on natural
language inference datasets have been widely adopted as zero-shot classifiers
due to their promising results and off-the-shelf availability. However, the
fact that such models are unfamiliar with the target task can lead to
instability and performance issues. We propose a plug-and-play method to bridge
this gap using a simple self-training approach, requiring only the class names
along with an unlabeled dataset, and without the need for domain expertise or
trial and error. We show that fine-tuning the zero-shot classifier on its most
confident predictions leads to significant performance gains across a wide
range of text classification tasks, presumably since self-training adapts the
zero-shot model to the task at hand.
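The self-training loop the abstract describes can be sketched schematically. The toy classifier below scores classes by keyword weights; in the paper the classifier is an NLI-based zero-shot model, so the seed keywords, keep ratio, and weight-bumping "fine-tuning" here are illustrative stand-ins, not the authors' actual setup.

```python
def predict(text, class_names, weights):
    """Score each class by summed keyword weights; return (label, confidence)."""
    words = text.lower().split()
    scores = [sum(weights.get((c, w), 0.0) for w in words) for c in class_names]
    total = sum(scores)
    if total == 0:
        return class_names[0], 0.0  # no evidence: degenerate prediction
    i = max(range(len(class_names)), key=scores.__getitem__)
    return class_names[i], scores[i] / total

def self_train(texts, class_names, weights, rounds=2, keep_ratio=0.5):
    """Pseudo-label unlabeled texts, keep only the most confident
    predictions, and reinforce the classifier on them."""
    for _ in range(rounds):
        preds = sorted((predict(t, class_names, weights) + (t,) for t in texts),
                       key=lambda p: p[1], reverse=True)
        keep = preds[:max(1, int(len(preds) * keep_ratio))]
        for label, conf, text in keep:
            if conf == 0.0:
                continue
            for w in text.lower().split():
                # stand-in for fine-tuning: strengthen the pseudo-label's features
                weights[(label, w)] = weights.get((label, w), 0.0) + 1.0
    return weights

# Seed weights stand in for the prior knowledge an NLI model gets from
# the class names alone (hypothetical example data).
weights = {("positive", "great"): 1.0, ("negative", "terrible"): 1.0}
texts = [
    "a great and fun movie",
    "great acting and fun plot",
    "a terrible and boring film",
    "boring plot and terrible pacing",
]
weights = self_train(texts, ["positive", "negative"], weights)
label, conf = predict("fun plot", ["positive", "negative"], weights)
# label → "positive": words never seen with a seed keyword were picked up
# from confident pseudo-labels during self-training
```

The key design choice mirrors the paper: only the top fraction of predictions by confidence feeds back into training, so early mistakes on low-confidence examples are less likely to be reinforced.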
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is their ability to understand instructions written in natural language (prompts),
which makes them suitable for text classification problems in domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z) - Zero-Shot Text Classification via Self-Supervised Tuning [46.9902502503747]
We propose a new paradigm based on self-supervised learning to solve zero-shot text classification tasks by tuning language models with unlabeled data, called self-supervised tuning.
Our model outperforms the state-of-the-art baselines on 7 out of 10 tasks.
arXiv Detail & Related papers (2023-05-19T05:47:33Z) - POUF: Prompt-oriented unsupervised fine-tuning for large pre-trained
models [62.23255433487586]
We propose an unsupervised fine-tuning framework to fine-tune the model or prompt on the unlabeled target data.
We demonstrate how to apply our method to both language-augmented vision and masked-language models by aligning the discrete distributions extracted from the prompts and target data.
arXiv Detail & Related papers (2023-04-29T22:05:22Z) - Learning Context-aware Classifier for Semantic Segmentation [88.88198210948426]
In this paper, contextual hints are exploited via learning a context-aware classifier.
Our method is model-agnostic and can be easily applied to generic segmentation models.
With only negligible additional parameters and about +2% inference time, a decent performance gain is achieved on both small and large models.
arXiv Detail & Related papers (2023-03-21T07:00:35Z) - Beyond prompting: Making Pre-trained Language Models Better Zero-shot
Learners by Clustering Representations [24.3378487252621]
We show that zero-shot text classification can be improved simply by clustering texts in the embedding spaces of pre-trained language models.
Our approach achieves an average of 20% absolute improvement over prompt-based zero-shot learning.
arXiv Detail & Related papers (2022-10-29T16:01:51Z) - Text2Model: Text-based Model Induction for Zero-shot Image Classification [38.704831945753284]
We address the challenge of building task-agnostic classifiers using only text descriptions.
We generate zero-shot classifiers using a hypernetwork that receives class descriptions and outputs a multi-class model.
We evaluate this approach in a series of zero-shot classification tasks, for image, point-cloud, and action recognition, using a range of text descriptions.
arXiv Detail & Related papers (2022-10-27T05:19:55Z) - Language Models in the Loop: Incorporating Prompting into Weak
Supervision [11.10422546502386]
We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited.
Instead of applying the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework.
arXiv Detail & Related papers (2022-05-04T20:42:40Z) - ZeroBERTo -- Leveraging Zero-Shot Text Classification by Topic Modeling [57.80052276304937]
This paper proposes a new model, ZeroBERTo, which leverages an unsupervised clustering step to obtain a compressed data representation before the classification task.
We show that ZeroBERTo achieves better performance on long inputs with shorter execution time, outperforming XLM-R by about 12% in F1 score on the FolhaUOL dataset.
arXiv Detail & Related papers (2022-01-04T20:08:17Z) - Cold-start Active Learning through Self-supervised Language Modeling [15.551710499866239]
Active learning aims to reduce annotation costs by choosing the most critical examples to label.
With BERT, we develop a simple strategy based on the masked language modeling loss.
Compared to other baselines, our approach reaches higher accuracy within fewer sampling iterations and less time.
arXiv Detail & Related papers (2020-10-19T14:09:17Z) - Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, leveraging only 20-30 labeled samples per class for training and validation on each task, can perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.