Guiding Generative Language Models for Data Augmentation in Few-Shot
Text Classification
- URL: http://arxiv.org/abs/2111.09064v1
- Date: Wed, 17 Nov 2021 12:10:03 GMT
- Title: Guiding Generative Language Models for Data Augmentation in Few-Shot
Text Classification
- Authors: Aleksandra Edwards, Asahi Ushio, Jose Camacho-Collados, Hélène de
Ribaupierre, Alun Preece
- Abstract summary: We leverage GPT-2 for generating artificial training instances in order to improve classification performance.
Our results show that fine-tuning GPT-2 on a handful of labelled instances leads to consistent classification improvements.
- Score: 59.698811329287174
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data augmentation techniques are widely used for enhancing the performance of
machine learning models by tackling class imbalance issues and data sparsity.
State-of-the-art generative language models have been shown to provide
significant gains across different NLP tasks. However, their applicability to
data augmentation for text classification tasks in few-shot settings has not
been fully explored, especially for specialised domains. In this paper, we
leverage GPT-2 (Radford et al., 2019) for generating artificial training
instances in order to improve classification performance. Our aim is to analyse
the impact that the selection of seed training examples has on the quality of
GPT-generated samples and, consequently, on classifier performance.
We perform experiments with several seed selection strategies that, among
others, exploit class hierarchical structures and domain expert selection. Our
results show that fine-tuning GPT-2 on a handful of labelled instances leads to
consistent classification improvements and outperforms competitive baselines.
Finally, we show that guiding this process through domain expert selection can
lead to further improvements, which opens up interesting research avenues for
combining generative models and active learning.
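The abstract does not spell out how the label-conditioned fine-tuning and sampling are implemented, so the snippet below is a minimal sketch of the general recipe, assuming a simple label-prefix format; the class names, separator token, and hyper-parameters are illustrative placeholders rather than the authors' settings.

```python
# Sketch: fine-tune GPT-2 on a handful of label-prefixed seed examples,
# then sample synthetic training instances conditioned on a class label.
# (Assumed format "<label> [SEP] <text>"; not the paper's exact setup.)
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical seed (label, text) pairs picked by some selection strategy
# (random, class-hierarchy-aware, or chosen by a domain expert).
seeds = [
    ("sports", "The team clinched the title with a late goal."),
    ("finance", "Shares fell sharply after the quarterly earnings report."),
]

def encode(label, text):
    # Serialise each seed as "<label> [SEP] <text><eos>" so that generation
    # can later be conditioned on the label prefix alone.
    return tokenizer(f"{label} [SEP] {text}{tokenizer.eos_token}", return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for _ in range(3):                      # a few passes over the tiny seed set
    for label, text in seeds:
        batch = encode(label, text)
        loss = model(input_ids=batch["input_ids"], labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Generate artificial instances for one class by prompting with its label.
model.eval()
prompt = tokenizer("sports [SEP]", return_tensors="pt")
samples = model.generate(
    prompt["input_ids"],
    do_sample=True,
    top_p=0.9,
    max_length=60,
    num_return_sequences=5,
    pad_token_id=tokenizer.eos_token_id,
)
for s in samples:
    print(tokenizer.decode(s, skip_special_tokens=True))
```

Changing the seed selection strategy only changes which (label, text) pairs populate `seeds`; the generated texts would then be filtered and added to the training set of the downstream classifier.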
Related papers
- Language Models are Graph Learners [70.14063765424012]
Language Models (LMs) are challenging the dominance of domain-specific models, including Graph Neural Networks (GNNs) and Graph Transformers (GTs).
We propose a novel approach that empowers off-the-shelf LMs to achieve performance comparable to state-of-the-art GNNs on node classification tasks.
arXiv Detail & Related papers (2024-10-03T08:27:54Z) - Evaluating the performance of state-of-the-art ESG domain-specific pre-trained large language models in text classification against existing models and traditional machine learning techniques [0.0]
This research investigates the classification of Environmental, Social, and Governance (ESG) information within textual disclosures.
The aim is to develop and evaluate binary classification models capable of accurately identifying and categorizing E, S and G-related content respectively.
The motivation for this research stems from the growing importance of ESG considerations in investment decisions and corporate accountability.
arXiv Detail & Related papers (2024-09-30T20:08:32Z) - G-DIG: Towards Gradient-based Diverse and High-quality Instruction Data Selection for Machine Translation [21.506844286376275]
We propose a novel gradient-based method to automatically select high-quality and diverse instruction finetuning data for machine translation.
Our key innovation centers around analyzing how individual training examples influence the model during training.
arXiv Detail & Related papers (2024-05-21T16:38:13Z) - Enriched BERT Embeddings for Scholarly Publication Classification [0.13654846342364302]
The NSLP 2024 FoRC Task I, organized as a competition, addresses this challenge.
The goal is to develop a classifier capable of predicting one of 123 predefined classes from the Open Research Knowledge Graph (ORKG) taxonomy of research fields for a given article.
arXiv Detail & Related papers (2024-05-07T09:05:20Z) - Text generation for dataset augmentation in security classification
tasks [55.70844429868403]
This study evaluates the application of natural language text generators to fill this data gap in multiple security-related text classification tasks.
We find substantial benefits for GPT-3 data augmentation strategies in situations with severe limitations on known positive-class samples.
arXiv Detail & Related papers (2023-10-22T22:25:14Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We establish new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - DAGA: Data Augmentation with a Generation Approach for Low-resource
Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences (a toy linearization sketch follows this list).
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z) - Improving the Performance of Fine-Grain Image Classifiers via Generative
Data Augmentation [0.5161531917413706]
We develop Data Augmentation from Proficient Pre-Training of Robust Generative Adversarial Networks (DAPPER GAN).
DAPPER GAN is an ML analytics support tool that automatically generates novel views of training images.
We experimentally evaluate this technique on the Stanford Cars dataset, demonstrating improved vehicle make and model classification accuracy.
arXiv Detail & Related papers (2020-08-12T15:29:11Z) - Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUGC is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting.
G-DAUGC consistently outperforms existing data augmentation methods based on back-translation.
Our analysis demonstrates that G-DAUGC produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
arXiv Detail & Related papers (2020-04-24T06:12:10Z)
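The DAGA-style linearization mentioned in the list above is simple to illustrate; the toy function below reflects a plausible reading of the abstract (label tokens inserted inline before tagged words so a plain language model can be trained on the flat sequence), not the authors' released code.

```python
# Toy DAGA-style linearization of a BIO-tagged sentence (illustrative only).
def linearize(tokens, tags):
    """Flatten (token, BIO-tag) pairs into one sequence a language model can train on."""
    out = []
    for token, tag in zip(tokens, tags):
        if tag != "O":          # prepend the tag as its own pseudo-token
            out.append(tag)
        out.append(token)
    return out

tokens = ["Alice", "flew", "to", "Paris", "."]
tags   = ["B-PER", "O", "O", "B-LOC", "O"]
print(" ".join(linearize(tokens, tags)))
# -> B-PER Alice flew to B-LOC Paris .
```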