WC-SBERT: Zero-Shot Text Classification via SBERT with Self-Training for
Wikipedia Categories
- URL: http://arxiv.org/abs/2307.15293v1
- Date: Fri, 28 Jul 2023 04:17:41 GMT
- Title: WC-SBERT: Zero-Shot Text Classification via SBERT with Self-Training for
Wikipedia Categories
- Authors: Te-Yu Chi, Yu-Meng Tang, Chia-Wen Lu, Qiu-Xia Zhang, Jyh-Shing Roger
Jang
- Abstract summary: Our research focuses on solving the zero-shot text classification problem in NLP.
We propose a novel self-training strategy that uses labels rather than text for training.
Our method achieves state-of-the-art results on both the Yahoo Topic and AG News datasets.
- Score: 5.652290685410878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our research focuses on solving the zero-shot text classification problem in
NLP, with a particular emphasis on innovative self-training strategies. To
achieve this objective, we propose a novel self-training strategy that uses
labels rather than text for training, significantly reducing the model's
training time. Specifically, we use categories from Wikipedia as our training
set and leverage the SBERT pre-trained model to establish positive correlations
between pairs of categories within the same text, facilitating associative
training. For new test datasets, we have improved the original self-training
approach, eliminating the need for prior training and testing data from each
target dataset. Instead, we adopt Wikipedia as a unified training dataset to
better approximate the zero-shot scenario. This modification allows for rapid
fine-tuning and inference across different datasets, greatly reducing the time
required for self-training. Our experimental results demonstrate that this
method can adapt the model to the target dataset within minutes. Compared to
other BERT-based transformer models, our approach significantly reduces the
amount of training data by training only on labels, not the actual text, and
greatly improves training efficiency by utilizing a unified training set.
Additionally, our method achieves state-of-the-art results on both the Yahoo
Topic and AG News datasets.
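As a rough illustration of the pipeline described in the abstract, the sketch below fine-tunes a pre-trained SBERT model on pairs of Wikipedia categories that co-occur on the same article (treated as positive pairs) and then performs zero-shot classification by cosine similarity between text and label embeddings. This is a minimal sketch using the sentence-transformers library; the base checkpoint, the example category pairs, the batch size, and the label prompts are illustrative assumptions, not the paper's exact configuration.
```python
# Hedged sketch: contrastive fine-tuning on co-occurring Wikipedia category
# pairs, then zero-shot inference by comparing text embeddings against label
# embeddings. Checkpoint, data, and hyperparameters below are assumptions.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed base SBERT checkpoint

# Positive pairs: categories that appear on the same Wikipedia article (toy examples).
category_pairs = [
    ("Machine learning", "Artificial intelligence"),
    ("Baseball", "Team sports"),
]
train_examples = [InputExample(texts=[a, b]) for a, b in category_pairs]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=64)
train_loss = losses.MultipleNegativesRankingLoss(model)  # in-batch negatives

model.fit(train_objectives=[(train_loader, train_loss)], epochs=1, warmup_steps=100)

# Zero-shot inference: pick the target-dataset label closest to the input text.
labels = ["World", "Sports", "Business", "Science and technology"]  # example AG News-style labels
label_emb = model.encode(labels, convert_to_tensor=True)

def classify(texts):
    text_emb = model.encode(texts, convert_to_tensor=True)
    scores = util.cos_sim(text_emb, label_emb)  # cosine similarity: text vs. each label
    return [labels[int(i)] for i in scores.argmax(dim=1)]

print(classify(["The team clinched the pennant with a walk-off homer."]))
```
Because only short label strings (not full documents) are embedded during fine-tuning, the training set stays small, which is the source of the efficiency gain claimed in the abstract.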
Related papers
- Efficient Grammatical Error Correction Via Multi-Task Training and
Optimized Training Schedule [55.08778142798106]
We propose auxiliary tasks that exploit the alignment between the original and corrected sentences.
We formulate each task as a sequence-to-sequence problem and perform multi-task training.
We find that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance.
arXiv Detail & Related papers (2023-11-20T14:50:12Z)
- DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection [72.25697820290502]
This work introduces a straightforward and efficient strategy to identify potential novel classes through zero-shot classification.
We refer to this approach as the self-training strategy, which enhances recall and accuracy for novel classes without requiring extra annotations, datasets, or re-training.
Empirical evaluations on three datasets, including LVIS, V3Det, and COCO, demonstrate significant improvements over the baseline performance.
arXiv Detail & Related papers (2023-10-02T17:52:24Z)
- Iterative Loop Learning Combining Self-Training and Active Learning for Domain Adaptive Semantic Segmentation [1.827510863075184]
Self-training and active learning have been proposed to reduce the reliance on labeled target-domain data.
This paper proposes an iterative loop learning method combining self-training and active learning.
arXiv Detail & Related papers (2023-01-31T01:31:43Z)
- Robustifying Sentiment Classification by Maximally Exploiting Few Counterfactuals [16.731183915325584]
We propose a novel solution that only requires annotation of a small fraction of the original training data.
We achieve noticeable accuracy improvements by adding only 1% manual counterfactuals.
arXiv Detail & Related papers (2022-10-21T08:30:09Z)
- Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation [56.98033565736974]
We propose Curriculum-Based Self-Training (CBST) to leverage unlabeled data in a rearranged order determined by the difficulty of text generation.
Our method can outperform fine-tuning and task-adaptive pre-training methods, and achieve state-of-the-art performance in the few-shot setting of data-to-text generation.
arXiv Detail & Related papers (2022-06-06T16:11:58Z)
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) copes with distribution shift by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, using only 20-30 labeled samples per class per task for training and validation, perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on the Cityscapes, CamVid and KITTI datasets; a generic sketch of this kind of teacher-student loop appears after this list.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
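The teacher-student self-training recipe summarized in the last entry above (fit a teacher on labeled data, pseudo-label an unlabeled pool, then retrain on the union) can be sketched minimally as below. The sketch uses a plain scikit-learn classifier on feature vectors purely for brevity; the classifier choice, confidence threshold, and number of rounds are illustrative assumptions, not details from the cited paper.
```python
# Minimal, generic sketch of teacher-student self-training:
# train on labeled data, pseudo-label confident unlabeled samples,
# then retrain on human-annotated and pseudo-labeled data jointly.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_labeled, y_labeled, X_unlabeled, confidence=0.9, rounds=3):
    model = LogisticRegression(max_iter=1000)  # assumed stand-in classifier
    X_train, y_train = X_labeled, y_labeled
    for _ in range(rounds):
        model.fit(X_train, y_train)                   # teacher step on current training set
        probs = model.predict_proba(X_unlabeled)      # score the unlabeled pool
        confident = probs.max(axis=1) >= confidence   # keep only high-confidence predictions
        pseudo_labels = probs.argmax(axis=1)[confident]
        # digest human-annotated and pseudo labels jointly
        X_train = np.vstack([X_labeled, X_unlabeled[confident]])
        y_train = np.concatenate([y_labeled, pseudo_labels])
    return model
```
The confidence threshold controls the usual self-training trade-off: a higher value admits fewer but cleaner pseudo labels, a lower value admits more data at the risk of reinforcing the teacher's mistakes.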
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.