Related papers: Zero-Shot Text Classification via Self-Supervised Tuning

URL: http://arxiv.org/abs/2305.11442v2
Date: Thu, 25 May 2023 06:10:04 GMT
Title: Zero-Shot Text Classification via Self-Supervised Tuning
Authors: Chaoqun Liu, Wenxuan Zhang, Guizhen Chen, Xiaobao Wu, Anh Tuan Luu, Chip Hong Chang, Lidong Bing
Abstract summary: We propose a new paradigm based on self-supervised learning that solves zero-shot text classification tasks by tuning language models with unlabeled data, called self-supervised tuning. Our model outperforms the state-of-the-art baselines on 7 out of 10 tasks.
Score: 46.9902502503747
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Existing solutions to zero-shot text classification either conduct prompting with pre-trained language models, which is sensitive to the choices of templates, or rely on large-scale annotated data of relevant tasks for meta-tuning. In this work, we propose a new paradigm based on self-supervised learning to solve zero-shot text classification tasks by tuning the language models with unlabeled data, called self-supervised tuning. By exploring the inherent structure of free texts, we propose a new learning objective called first sentence prediction to bridge the gap between unlabeled data and text classification tasks. After tuning the model to learn to predict the first sentence in a paragraph based on the rest, the model is able to conduct zero-shot inference on unseen tasks such as topic classification and sentiment analysis. Experimental results show that our model outperforms the state-of-the-art baselines on 7 out of 10 tasks. Moreover, the analysis reveals that our model is less sensitive to the prompt design. Our code and pre-trained models are publicly available at https://github.com/DAMO-NLP-SG/SSTuning .
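
The first sentence prediction objective described in the abstract can be illustrated with a minimal sketch of how one training example might be built from unlabeled paragraphs. This is not the authors' implementation (see their repository for that). The function names, the naive sentence splitter, the number of options, and the choice to draw negatives from the first sentences of other paragraphs are all assumptions made for illustration.

```python
import random
import re


def split_sentences(paragraph):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p.strip() for p in parts if p.strip()]


def build_fsp_example(paragraphs, index, num_options=4, rng=None):
    """Build one hypothetical first-sentence-prediction example.

    The input text is the paragraph without its first sentence; the
    options are the true first sentence plus distractors taken from
    other paragraphs (an assumed negative-sampling scheme).
    """
    rng = rng or random.Random(0)
    sentences = split_sentences(paragraphs[index])
    if len(sentences) < 2:
        raise ValueError("paragraph needs at least two sentences")
    first, rest = sentences[0], " ".join(sentences[1:])

    # Candidate distractors: first sentences of all other non-empty paragraphs.
    candidates = []
    for i, p in enumerate(paragraphs):
        if i != index:
            other = split_sentences(p)
            if other:
                candidates.append(other[0])
    if len(candidates) < num_options - 1:
        raise ValueError("not enough other paragraphs for distractors")

    options = rng.sample(candidates, num_options - 1) + [first]
    rng.shuffle(options)
    # The label is the position of the true first sentence among the options.
    return {"text": rest, "options": options, "label": options.index(first)}


paragraphs = [
    "Cats are popular pets. They sleep a lot. They purr.",
    "The stock market fell today. Investors were nervous.",
    "The film was a delight. Every scene felt alive.",
    "Rain is expected tomorrow. Bring an umbrella.",
]
example = build_fsp_example(paragraphs, index=0)
```

A classifier tuned on such examples learns to pick the option that best heads the given text. For zero-shot inference on an unseen task, one would presumably place the task's label names or descriptions in the option slots; that mapping is likewise an assumption here, not a detail stated in the abstract.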

Related papers

Tuning-Free Personalized Alignment via Trial-Error-Explain In-Context Learning [74.56097953187994]
We present Trial-Error-Explain In-Context Learning (TICL), a tuning-free method that personalizes language models for text generation tasks. TICL iteratively expands an in-context learning prompt via a trial-error-explain process, adding model-generated negative samples and explanations. TICL achieves up to 91.5% against the previous state-of-the-art and outperforms competitive tuning-free baselines for personalized alignment tasks.
arXiv Detail & Related papers (2025-02-13T05:20:21Z)
Ensembling Finetuned Language Models for Text Classification [55.15643209328513]
Finetuning is a common practice across different communities to adapt pretrained models to particular tasks. Ensembles of neural networks are typically used to boost performance and provide reliable uncertainty estimates. We present a metadataset with predictions from five large finetuned models on six datasets and report results of different ensembling strategies.
arXiv Detail & Related papers (2024-10-25T09:15:54Z)
Self-Supervised Representation Learning for Online Handwriting Text Classification [0.8594140167290099]
We propose the novel Part of Stroke Masking (POSM) as a pretext task for pretraining models to extract informative representations from the online handwriting of individuals in English and Chinese languages. To evaluate the quality of the extracted representations, we use both intrinsic and extrinsic evaluation methods. The pretrained models are fine-tuned to achieve state-of-the-art results in tasks such as writer identification, gender classification, and handedness classification.
arXiv Detail & Related papers (2023-10-10T14:07:49Z)
POUF: Prompt-oriented unsupervised fine-tuning for large pre-trained models [62.23255433487586]
We propose an unsupervised fine-tuning framework to fine-tune the model or prompt on the unlabeled target data. We demonstrate how to apply our method to both language-augmented vision and masked-language models by aligning the discrete distributions extracted from the prompts and target data.
arXiv Detail & Related papers (2023-04-29T22:05:22Z)
Zero-Shot Text Classification with Self-Training [8.68603153534916]
We show that fine-tuning the zero-shot classifier on its most confident predictions leads to significant performance gains across a wide range of text classification tasks. Self-training adapts the zero-shot model to the task at hand.
arXiv Detail & Related papers (2022-10-31T17:55:00Z)
A Unified Understanding of Deep NLP Models for Text Classification [88.35418976241057]
We have developed a visual analysis tool, DeepNLPVis, to enable a unified understanding of NLP models for text classification. The key idea is a mutual information-based measure, which provides quantitative explanations on how each layer of a model maintains the information of input words in a sample. A multi-level visualization, which consists of a corpus-level, a sample-level, and a word-level visualization, supports the analysis from the overall training set to individual samples.
arXiv Detail & Related papers (2022-06-19T08:55:07Z)
Language Models in the Loop: Incorporating Prompting into Weak Supervision [11.10422546502386]
We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited. Instead of applying the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework.
arXiv Detail & Related papers (2022-05-04T20:42:40Z)
ShufText: A Simple Black Box Approach to Evaluate the Fragility of Text Classification Models [0.0]
Deep learning approaches based on CNN, LSTM, and Transformers have been the de facto approach for text classification. We show that these systems are over-reliant on the important words present in the text that are useful for classification.
arXiv Detail & Related papers (2021-01-30T15:18:35Z)
Cold-start Active Learning through Self-supervised Language Modeling [15.551710499866239]
Active learning aims to reduce annotation costs by choosing the most critical examples to label. With BERT, we develop a simple strategy based on the masked language modeling loss. Compared to other baselines, our approach reaches higher accuracy in fewer sampling iterations and less time.
arXiv Detail & Related papers (2020-10-19T14:09:17Z)
Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck. We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network. We show that our methods, using only 20-30 labeled samples per class per task for training and validation, perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words". Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.