Guiding Generative Language Models for Data Augmentation in Few-Shot
Text Classification
- URL: http://arxiv.org/abs/2111.09064v1
- Date: Wed, 17 Nov 2021 12:10:03 GMT
- Title: Guiding Generative Language Models for Data Augmentation in Few-Shot
Text Classification
- Authors: Aleksandra Edwards, Asahi Ushio, Jose Camacho-Collados, Hélène de
Ribaupierre, Alun Preece
- Abstract summary: We leverage GPT-2 for generating artificial training instances in order to improve classification performance.
Our results show that fine-tuning GPT-2 on a handful of labelled instances leads to consistent classification improvements.
- Score: 59.698811329287174
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data augmentation techniques are widely used for enhancing the performance of
machine learning models by tackling class imbalance issues and data sparsity.
State-of-the-art generative language models have been shown to provide
significant gains across different NLP tasks. However, their applicability to
data augmentation for text classification tasks in few-shot settings has not
been fully explored, especially for specialised domains. In this paper, we
leverage GPT-2 (Radford et al., 2019) to generate artificial training
instances in order to improve classification performance. Our aim is to analyse
the impact that the selection of seed training examples has on the quality of
GPT-generated samples and, consequently, on classifier performance.
We perform experiments with several seed selection strategies that, among
others, exploit class hierarchical structures and domain expert selection. Our
results show that fine-tuning GPT-2 on a handful of labelled instances leads to
consistent classification improvements and outperforms competitive baselines.
Finally, we show that guiding this process through domain expert selection can
lead to further improvements, which opens up interesting research avenues for
combining generative models and active learning.
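As a rough illustration of the setup described in the abstract, the fine-tuning corpus for GPT-2 can be built by prefixing each seed example with its class label, so that prompting the fine-tuned model with a label yields a synthetic instance of that class. This is a minimal sketch; the separator/end tokens and helper names are assumptions, not details taken from the paper.

```python
# Hypothetical sketch of preparing a label-conditioned fine-tuning corpus
# for GPT-2. The special tokens below are assumed placeholders.
SEP, EOS = "<SEP>", "<EOS>"

def build_finetune_corpus(seed_examples):
    """seed_examples: (label, text) pairs picked by a seed-selection
    strategy (random, class-hierarchy-aware, or expert-chosen)."""
    return [f"{label} {SEP} {text} {EOS}" for label, text in seed_examples]

def generation_prompt(label):
    # After fine-tuning, prompting with "label <SEP>" makes the model's
    # continuation a synthetic training instance of that class.
    return f"{label} {SEP}"

seeds = [("sports", "The team clinched the title last night."),
         ("finance", "Shares fell sharply after the earnings call.")]
corpus = build_finetune_corpus(seeds)
```

The generated texts are then added to the original training set for the downstream classifier.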
Related papers
- Selecting Between BERT and GPT for Text Classification in Political Science Research [4.487884986288122]
We evaluate the effectiveness of BERT-based versus GPT-based models in low-data scenarios.
We conclude by comparing these approaches in terms of performance, ease of use, and cost.
arXiv Detail & Related papers (2024-11-07T07:29:39Z)
- Language Models are Graph Learners [70.14063765424012]
Language Models (LMs) are challenging the dominance of domain-specific models, including Graph Neural Networks (GNNs) and Graph Transformers (GTs)
We propose a novel approach that empowers off-the-shelf LMs to achieve performance comparable to state-of-the-art GNNs on node classification tasks.
arXiv Detail & Related papers (2024-10-03T08:27:54Z)
- Evaluating the performance of state-of-the-art ESG domain-specific pre-trained large language models in text classification against existing models and traditional machine learning techniques [0.0]
This research investigates the classification of Environmental, Social, and Governance (ESG) information within textual disclosures.
The aim is to develop and evaluate binary classification models capable of accurately identifying and categorizing E-, S-, and G-related content, respectively.
The motivation for this research stems from the growing importance of ESG considerations in investment decisions and corporate accountability.
arXiv Detail & Related papers (2024-09-30T20:08:32Z)
- G-DIG: Towards Gradient-based Diverse and High-quality Instruction Data Selection for Machine Translation [21.506844286376275]
We propose a novel gradient-based method to automatically select high-quality and diverse instruction finetuning data for machine translation.
Our key innovation centers around analyzing how individual training examples influence the model during training.
arXiv Detail & Related papers (2024-05-21T16:38:13Z)
- Enriched BERT Embeddings for Scholarly Publication Classification [0.13654846342364302]
The NSLP 2024 FoRC Task I, organized as a competition, addresses this challenge.
The goal is to develop a classifier capable of predicting one of 123 predefined classes from the Open Research Knowledge Graph (ORKG) taxonomy of research fields for a given article.
arXiv Detail & Related papers (2024-05-07T09:05:20Z)
- Text generation for dataset augmentation in security classification tasks [55.70844429868403]
This study evaluates the application of natural language text generators to fill this data gap in multiple security-related text classification tasks.
We find substantial benefits for GPT-3 data augmentation strategies in situations with severe limitations on known positive-class samples.
arXiv Detail & Related papers (2023-10-22T22:25:14Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
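A minimal sketch of the linearization idea behind this approach, assuming the common convention of inserting each non-O tag as an extra token directly before the word it labels (the paper's actual vocabulary and training details are omitted):

```python
def linearize(tokens, tags):
    """Linearize a labelled sentence: each non-O tag becomes an extra
    token placed directly before the word it labels, so a plain language
    model can be trained on (and generate) tagged text."""
    out = []
    for tok, tag in zip(tokens, tags):
        if tag != "O":
            out.append(tag)
        out.append(tok)
    return " ".join(out)

example = linearize(["John", "lives", "in", "Paris"],
                    ["B-PER", "O", "O", "B-LOC"])
```

Sentences sampled from the trained model are de-linearized back into (token, tag) pairs to obtain synthetic labelled data.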
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
- Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUGC is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting.
G-DAUGC consistently outperforms existing data augmentation methods based on back-translation.
Our analysis demonstrates that G-DAUGC produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
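The analysis above credits G-DAUGC's selection step for much of its gain. As a hedged stand-in (not G-DAUGC's actual algorithm, whose details are not given here), a simple greedy diversity filter over token overlap illustrates the kind of selection involved; the function names and Jaccard criterion are assumptions:

```python
def jaccard(a, b):
    """Token-overlap similarity between two generated examples."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def select_diverse(candidates, k):
    """Greedily keep k candidates, each time adding the one least
    similar to anything already selected."""
    selected = [candidates[0]]
    pool = list(candidates[1:])
    while pool and len(selected) < k:
        best = min(pool, key=lambda c: max(jaccard(c, s) for s in selected))
        selected.append(best)
        pool.remove(best)
    return selected

picked = select_diverse(["a b c", "a b d", "x y z"], 2)
```

Filtering generated examples this way discourages near-duplicates from dominating the augmented training set.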
arXiv Detail & Related papers (2020-04-24T06:12:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.