Related papers: Generative Adversarial Networks for Annotated Data Augmentation in Data Sparse NLU

URL: http://arxiv.org/abs/2012.05302v1
Date: Wed, 9 Dec 2020 20:38:17 GMT
Title: Generative Adversarial Networks for Annotated Data Augmentation in Data Sparse NLU
Authors: Olga Golovneva and Charith Peris
Abstract summary: Data sparsity is one of the key challenges associated with model development in Natural Language Understanding. We present our results on boosting NLU model performance through training data augmentation using a sequential generative adversarial network (GAN). Our experiments reveal that synthetic data generated using the sequential generative adversarial network provides significant performance boosts across multiple metrics.
Score: 0.76146285961466
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Data sparsity is one of the key challenges associated with model development in Natural Language Understanding (NLU) for conversational agents. The challenge is made more complex by the demand for high quality annotated utterances commonly required for supervised learning, usually resulting in weeks of manual labor and high cost. In this paper, we present our results on boosting NLU model performance through training data augmentation using a sequential generative adversarial network (GAN). We explore data generation in the context of two tasks, the bootstrapping of a new language and the handling of low resource features. For both tasks we explore three sequential GAN architectures, one with a token-level reward function, another with our own implementation of a token-level Monte Carlo rollout reward, and a third with sentence-level reward. We evaluate the performance of these feedback models across several sampling methodologies and compare our results to upsampling the original data to the same scale. We further improve the GAN model performance through the transfer learning of the pretrained embeddings. Our experiments reveal synthetic data generated using the sequential generative adversarial network provides significant performance boosts across multiple metrics and can be a major benefit to the NLU tasks.

Related papers

Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
Generative retrieval reformulates retrieval as an autoregressive generation task, where large language models generate target documents directly from a query. We systematically investigate training and inference scaling laws in generative retrieval, exploring how model size, training data scale, and inference-time compute jointly influence performance.
arXiv Detail & Related papers (2025-03-24T17:59:03Z)
Towards Robust Universal Information Extraction: Benchmark, Evaluation, and Solution [66.11004226578771]
Existing robust benchmark datasets have two key limitations. They generate only a limited range of perturbations for a single Information Extraction (IE) task. Considering the powerful generation capabilities of Large Language Models (LLMs), we introduce a new benchmark dataset for Robust UIE, called RUIE-Bench. We show that training with only 15% of the data leads to an average 7.5% relative performance improvement across three IE tasks.
arXiv Detail & Related papers (2025-03-05T05:39:29Z)
CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs [5.89889361990138]
Large language models (LLMs) have demonstrated remarkable performance in diverse tasks using zero-shot and few-shot prompting. In this work, we tackle the challenge of generating datasets with high diversity, upon which a student model is trained for downstream tasks. Taking the route of decoding-time guidance-based approaches, we propose CorrSynth, which generates data that is more diverse and faithful to the input prompt using a correlated sampling strategy.
arXiv Detail & Related papers (2024-11-13T12:09:23Z)
Unleashing LLM Reasoning Capability via Scalable Question Synthesis from Scratch [54.12139707822201]
We propose ScaleQuest, a novel, scalable, and cost-effective data synthesis method. By generating diverse questions from scratch, we produce a dataset of 1 million problem-solution pairs. Our experiments demonstrate that models trained on our data outperform existing open-source datasets.
arXiv Detail & Related papers (2024-10-24T12:42:04Z)
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data. We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
An Active Learning Framework for Inclusive Generation by Large Language Models [32.16984263644299]
Large Language Models (LLMs) generate text representative of diverse sub-populations. We propose a novel clustering-based active learning framework, enhanced with knowledge distillation. We construct two new datasets in tandem with model training, showing a performance improvement of 2%-10% over baseline models.
arXiv Detail & Related papers (2024-10-17T15:09:35Z)
Language Models are Graph Learners [70.14063765424012]
Language Models (LMs) are challenging the dominance of domain-specific models, including Graph Neural Networks (GNNs) and Graph Transformers (GTs). We propose a novel approach that empowers off-the-shelf LMs to achieve performance comparable to state-of-the-art GNNs on node classification tasks.
arXiv Detail & Related papers (2024-10-03T08:27:54Z)
Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning. Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation. Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery. Such a cross-modal retrieval task is quite challenging due to significant modality gap, fine-grained differences and insufficiency of annotated data. In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
Regularization Through Simultaneous Learning: A Case Study on Plant Classification [0.0]
This paper introduces Simultaneous Learning, a regularization approach drawing on principles of Transfer Learning and Multi-task Learning. We leverage auxiliary datasets with the target dataset, the UFOP-HVD, to facilitate simultaneous classification guided by a customized loss function. Remarkably, our approach demonstrates superior performance over models without regularization.
arXiv Detail & Related papers (2023-05-22T19:44:57Z)
Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning. Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
DQI: Measuring Data Quality in NLP [22.54066527822898]
We introduce a generic formula for Data Quality Index (DQI) to help dataset creators create datasets free of unwanted biases. We show that models trained on the renovated SNLI dataset generalize better to out of distribution tasks.
arXiv Detail & Related papers (2020-05-02T12:34:17Z)
Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUG^C is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting. G-DAUG^C consistently outperforms existing data augmentation methods based on back-translation. Our analysis demonstrates that G-DAUG^C produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
arXiv Detail & Related papers (2020-04-24T06:12:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.