Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning
- URL: http://arxiv.org/abs/2103.07552v1
- Date: Fri, 12 Mar 2021 22:07:35 GMT
- Title: Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning
- Authors: Jason Wei, Chengyu Huang, Soroush Vosoughi, Yu Cheng, Shiqi Xu
- Abstract summary: Few-shot text classification is a fundamental NLP task in which a model aims to classify text into a large number of categories, given only a few training examples per category.
This paper explores data augmentation -- a technique particularly suitable for training with limited data.
We find that common data augmentation techniques can improve the performance of triplet networks by up to 3.0% on average.
- Score: 11.66053357388062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot text classification is a fundamental NLP task in which a model aims to classify text into a large number of categories, given only a few training examples per category. This paper explores data augmentation -- a technique particularly suitable for training with limited data -- for this few-shot, highly-multiclass text classification setting. On four diverse text classification tasks, we find that common data augmentation techniques can improve the performance of triplet networks by up to 3.0% on average.
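The triplet-network setup the abstract refers to can be pictured with a short sketch. The following is a minimal illustration, not the authors' code: the LSTM encoder, vocabulary size, and margin are assumptions chosen for brevity.

```python
# Minimal sketch of triplet-network training for few-shot text
# classification. The encoder architecture and all hyperparameters
# are illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class TextEncoder(nn.Module):
    """Maps token-id sequences into a shared embedding space."""
    def __init__(self, vocab_size=30522, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)
        _, (hidden, _) = self.lstm(embedded)
        return hidden[-1]  # final hidden state as the text embedding

encoder = TextEncoder()
loss_fn = nn.TripletMarginLoss(margin=1.0)

# Anchor and positive come from the same class; the negative from another.
anchor, positive, negative = (torch.randint(0, 30522, (4, 32)) for _ in range(3))
loss = loss_fn(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()  # pulls same-class embeddings together, pushes others apart
```

At inference time in such a setup, each category's few examples are typically embedded and averaged, and a query is assigned to the nearest class centroid in the embedding space.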
To further boost performance, we present a simple training strategy called curriculum data augmentation, which leverages curriculum learning by first training on only original examples and then introducing augmented data as training progresses. We explore a two-stage and a gradual schedule, and find that, compared with standard single-stage training, curriculum data augmentation trains faster, improves performance, and remains robust to high amounts of noising from augmentation.
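The two schedules can be summarized in a small sketch. This is a hedged illustration, assuming a word-dropout noiser as the augmentation; the `augment` function, switch point, and strength values are invented stand-ins, not the released implementation.

```python
# Sketch of the two curriculum data augmentation schedules described
# above. `augment` is a hypothetical EDA-style noiser; `strength`
# controls how much noise it injects.
import random

def augment(text, strength):
    # Assumed noiser: drop each word with probability `strength`.
    words = text.split()
    kept = [w for w in words if random.random() > strength] or words
    return " ".join(kept)

def two_stage_schedule(originals, epochs, switch_epoch, strength=0.3):
    """Stage 1: original examples only. Stage 2: originals plus augmented copies."""
    for epoch in range(epochs):
        data = list(originals)
        if epoch >= switch_epoch:
            data += [augment(t, strength) for t in originals]
        yield epoch, data

def gradual_schedule(originals, epochs, max_strength=0.3):
    """Ramp augmentation strength linearly from 0 to max_strength."""
    for epoch in range(epochs):
        strength = max_strength * epoch / max(epochs - 1, 1)
        data = list(originals) + [augment(t, strength) for t in originals]
        yield epoch, data

for epoch, batch in gradual_schedule(["a few labeled examples"], epochs=3):
    pass  # a train-one-epoch step over `batch` would go here
```

In both variants the model sees clean data first, so early training fixes the class structure before noisier augmented examples are introduced, which is what the abstract credits for the robustness to heavy noising.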
Related papers
- Low-Resource Fast Text Classification Based on Intra-Class and Inter-Class Distance Calculation [1.0291559330120414]
We propose a low-resource and fast text classification model called LFTC.
Our approach begins by constructing a compressor list for each class to mine the regularities within intra-class data.
We evaluate LFTC on 9 publicly available benchmark datasets, and the results demonstrate significant improvements in performance and processing time.
arXiv Detail & Related papers (2024-12-13T07:22:13Z)
- Open-Vocabulary Temporal Action Localization using Multimodal Guidance [67.09635853019005]
OVTAL enables a model to recognize any desired action category in videos without the need to explicitly curate training data for all categories.
This flexibility poses significant challenges, as the model must recognize not only the action categories seen during training but also novel categories specified at inference.
We introduce OVFormer, a novel open-vocabulary framework extending ActionFormer with three key contributions.
arXiv Detail & Related papers (2024-06-21T18:00:05Z)
- Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness [3.2925222641796554]
"pointer-guided segment ordering" (SO) is a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations.
Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures.
arXiv Detail & Related papers (2024-06-06T15:17:51Z)
- Text generation for dataset augmentation in security classification tasks [55.70844429868403]
This study evaluates the application of natural language text generators to fill this data gap in multiple security-related text classification tasks.
We find substantial benefits for GPT-3 data augmentation strategies in situations with severe limitations on known positive-class samples.
arXiv Detail & Related papers (2023-10-22T22:25:14Z)
- WC-SBERT: Zero-Shot Text Classification via SBERT with Self-Training for Wikipedia Categories [5.652290685410878]
Our research focuses on solving the zero-shot text classification problem in NLP.
We propose a novel self-training strategy that uses labels rather than text for training.
Our method achieves state-of-the-art results on both the Yahoo Topic and AG News datasets.
arXiv Detail & Related papers (2023-07-28T04:17:41Z)
- Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning [76.43827771613127]
In this paper, we investigate task-specific preferences between pairs of input texts as a new alternative way for such auxiliary data annotation.
We propose a novel multi-task learning framework, called prefer-to-classify (P2C), which can enjoy the cooperative effect of learning both the given classification task and the auxiliary preferences.
arXiv Detail & Related papers (2023-06-08T04:04:47Z)
- Teacher Guided Training: An Efficient Framework for Knowledge Transfer [86.6784627427194]
We propose the teacher-guided training (TGT) framework for training a high-quality compact model.
TGT exploits the fact that the teacher has acquired a good representation of the underlying data domain.
We find that TGT can improve accuracy on several image classification benchmarks and a range of text classification and retrieval tasks.
arXiv Detail & Related papers (2022-08-14T10:33:58Z)
- Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation [56.98033565736974]
We propose Curriculum-Based Self-Training (CBST) to leverage unlabeled data in a rearranged order determined by the difficulty of text generation.
Our method can outperform fine-tuning and task-adaptive pre-training methods, and achieve state-of-the-art performance in the few-shot setting of data-to-text generation.
arXiv Detail & Related papers (2022-06-06T16:11:58Z)
- Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification [59.698811329287174]
We leverage GPT-2 for generating artificial training instances in order to improve classification performance.
Our results show that fine-tuning GPT-2 on a handful of labeled instances leads to consistent classification improvements; a sketch of this recipe follows the list below.
arXiv Detail & Related papers (2021-11-17T12:10:03Z)
- Adapting Deep Learning for Sentiment Classification of Code-Switched Informal Short Text [1.6752182911522517]
We present a labeled dataset called MultiSenti for sentiment classification of code-switched informal short text.
We propose a deep learning-based model for sentiment classification of code-switched informal short text.
arXiv Detail & Related papers (2020-01-04T06:31:15Z)
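As referenced in the "Guiding Generative Language Models" entry above, generative augmentation samples synthetic training instances from a language model conditioned on a class label. Below is a hedged sketch of that recipe using the Hugging Face transformers API; the prompt format and decoding settings are assumptions, not that paper's released configuration, and the fine-tuning step on the few labeled examples is omitted for brevity.

```python
# Sketch of GPT-2-based data augmentation for few-shot classification.
# The "label:" prompt format is an assumed conditioning scheme.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
# (Fine-tuning the model on the few labeled examples would precede this step.)

def generate_examples(label, n=5, max_new_tokens=40):
    """Sample n synthetic texts conditioned on a label prompt."""
    inputs = tokenizer(f"{label}:", return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,
        top_p=0.9,
        max_new_tokens=max_new_tokens,
        num_return_sequences=n,
        pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

synthetic = generate_examples("sports")  # augmented candidates for one class
```

The generated texts would then be added to the training set for the corresponding class, typically after a filtering step to discard off-label samples.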