Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning
- URL: http://arxiv.org/abs/2103.07552v1
- Date: Fri, 12 Mar 2021 22:07:35 GMT
- Title: Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning
- Authors: Jason Wei, Chengyu Huang, Soroush Vosoughi, Yu Cheng, Shiqi Xu
- Abstract summary: Few-shot text classification is a fundamental NLP task in which a model aims to classify text into a large number of categories.
This paper explores data augmentation -- a technique particularly suitable for training with limited data.
We find that common data augmentation techniques can improve the performance of triplet networks by up to 3.0% on average.
- Score: 11.66053357388062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot text classification is a fundamental NLP task in which a model aims
to classify text into a large number of categories, given only a few training
examples per category. This paper explores data augmentation -- a technique
particularly suitable for training with limited data -- for this few-shot,
highly-multiclass text classification setting. On four diverse text
classification tasks, we find that common data augmentation techniques can
improve the performance of triplet networks by up to 3.0% on average.
To further boost performance, we present a simple training strategy called
curriculum data augmentation, which leverages curriculum learning by first
training on only original examples and then introducing augmented data as
training progresses. We explore a two-stage and a gradual schedule, and find
that, compared with standard single-stage training, curriculum data
augmentation trains faster, improves performance, and remains robust to high
amounts of noising from augmentation.
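As a concrete illustration, the sketch below implements the two-stage schedule described in the abstract: a triplet network is first trained on original examples only, with simple word-level noising switched on partway through training. The encoder interface, the word-swap `augment` function, and all hyperparameters are illustrative assumptions rather than the authors' exact implementation; the `gradual_n_swaps` helper hints at how the gradual schedule could instead ramp the noising amount up over epochs.

```python
# Minimal sketch of curriculum data augmentation for a triplet network.
# Everything here (the word-swap noising, the two-stage switch, the
# hyperparameters) is an illustrative assumption, not the paper's exact setup.
import random
import torch
import torch.nn as nn

def augment(text: str, n_swaps: int = 1) -> str:
    """Cheap noising: randomly swap n_swaps pairs of words (EDA-style)."""
    words = text.split()
    for _ in range(n_swaps):
        if len(words) < 2:
            break
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

def gradual_n_swaps(epoch: int, epochs: int, max_swaps: int = 3) -> int:
    """Gradual schedule (assumed form): ramp the noising amount over training."""
    return round(max_swaps * epoch / max(epochs - 1, 1))

def train(encoder: nn.Module, triplets, epochs: int = 10, stage_one: int = 5):
    """Two-stage schedule: original triplets only, then augmented triplets too.

    `encoder` is assumed to map a raw string to an embedding tensor (e.g., a
    sentence-encoder wrapper); `triplets` yields (anchor, positive, negative)
    text triples sharing / differing in class label.
    """
    loss_fn = nn.TripletMarginLoss(margin=1.0)
    optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)
    for epoch in range(epochs):
        use_augmentation = epoch >= stage_one  # stage two begins here
        for anchor, positive, negative in triplets:
            if use_augmentation:
                anchor, positive = augment(anchor), augment(positive)
            a, p, n = encoder(anchor), encoder(positive), encoder(negative)
            loss = loss_fn(a, p, n)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

Relative to training on a fixed mix of original and augmented data from the start, this ordering exposes the network to clean examples first, which is what the abstract credits for faster training and robustness to heavy noising.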
Related papers
- Open-Vocabulary Temporal Action Localization using Multimodal Guidance [67.09635853019005]
Open-vocabulary temporal action localization (OVTAL) enables a model to recognize any desired action category in videos without the need to explicitly curate training data for all categories.
This flexibility poses significant challenges, as the model must recognize not only the action categories seen during training but also novel categories specified at inference.
We introduce OVFormer, a novel open-vocabulary framework extending ActionFormer with three key contributions.
arXiv Detail & Related papers (2024-06-21T18:00:05Z)
- Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness [3.2925222641796554]
"pointer-guided segment ordering" (SO) is a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations.
Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures.
arXiv Detail & Related papers (2024-06-06T15:17:51Z)
- Text generation for dataset augmentation in security classification tasks [55.70844429868403]
This study evaluates the application of natural language text generators to fill this data gap in multiple security-related text classification tasks.
We find substantial benefits for GPT-3 data augmentation strategies in situations with severe limitations on known positive-class samples.
arXiv Detail & Related papers (2023-10-22T22:25:14Z)
- WC-SBERT: Zero-Shot Text Classification via SBERT with Self-Training for Wikipedia Categories [5.652290685410878]
Our research focuses on solving the zero-shot text classification problem in NLP.
We propose a novel self-training strategy that uses labels rather than text for training.
Our method achieves state-of-the-art results on both the Yahoo Topic and AG News datasets.
arXiv Detail & Related papers (2023-07-28T04:17:41Z)
- Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning [76.43827771613127]
In this paper, we investigate task-specific preferences between pairs of input texts as a new alternative way for such auxiliary data annotation.
We propose a novel multi-task learning framework, called prefer-to-classify (P2C), which can enjoy the cooperative effect of learning both the given classification task and the auxiliary preferences.
arXiv Detail & Related papers (2023-06-08T04:04:47Z)
- Boosting Visual-Language Models by Exploiting Hard Samples [126.35125029639168]
HELIP is a cost-effective strategy tailored to enhance the performance of existing CLIP models.
Our method allows for effortless integration with existing models' training pipelines.
On comprehensive benchmarks, HELIP consistently boosts existing models to achieve leading performance.
arXiv Detail & Related papers (2023-05-09T07:00:17Z)
- Teacher Guided Training: An Efficient Framework for Knowledge Transfer [86.6784627427194]
We propose the teacher-guided training (TGT) framework for training a high-quality compact model.
TGT exploits the fact that the teacher has acquired a good representation of the underlying data domain.
We find that TGT can improve accuracy on several image classification benchmarks and a range of text classification and retrieval tasks.
arXiv Detail & Related papers (2022-08-14T10:33:58Z)
- Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification [59.698811329287174]
We leverage GPT-2 for generating artificial training instances in order to improve classification performance.
Our results show that fine-tuning GPT-2 on a handful of labeled instances leads to consistent classification improvements (a minimal sketch of this style of augmentation follows this list).
arXiv Detail & Related papers (2021-11-17T12:10:03Z)
- ProtoDA: Efficient Transfer Learning for Few-Shot Intent Classification [21.933876113300897]
We adopt an alternative approach by transfer learning on an ensemble of related tasks using prototypical networks under the meta-learning paradigm.
Using intent classification as a case study, we demonstrate that increasing variability in training tasks can significantly improve classification performance.
arXiv Detail & Related papers (2021-01-28T00:19:13Z)
- Adapting Deep Learning for Sentiment Classification of Code-Switched Informal Short Text [1.6752182911522517]
We present a labeled dataset called MultiSenti for sentiment classification of code-switched informal short text.
We propose a deep learning-based model for sentiment classification of code-switched informal short text.
arXiv Detail & Related papers (2020-01-04T06:31:15Z)
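For context on the GPT-2 augmentation strategy referenced in the list above, here is a minimal sketch using the Hugging Face transformers text-generation pipeline; the prompt format and sampling settings are assumptions for illustration, not that paper's recipe.

```python
# Sketch of GPT-2-based data augmentation: prompt the model with a labeled
# seed example and keep generated continuations as synthetic instances.
# Prompt format and sampling settings are illustrative assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

def synthesize(seed_text: str, n: int = 3) -> list[str]:
    """Generate n synthetic variants conditioned on a labeled seed example."""
    outputs = generator(
        seed_text,
        max_new_tokens=40,
        num_return_sequences=n,
        do_sample=True,
        top_p=0.95,
    )
    # Keep only the generated continuation, stripping the seed prompt.
    return [o["generated_text"][len(seed_text):].strip() for o in outputs]

# Example: expand a positive-sentiment seed into extra training texts.
print(synthesize("The movie was a delight from start to finish because", n=2))
```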
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.