Related papers: P-TA: Using Proximal Policy Optimization to Enhance Tabular Data Augmentation via Large Language Models

URL: http://arxiv.org/abs/2406.11391v1
Date: Mon, 17 Jun 2024 10:22:00 GMT
Title: P-TA: Using Proximal Policy Optimization to Enhance Tabular Data Augmentation via Large Language Models
Authors: Shuo Yang, Chenchen Yuan, Yao Rong, Felix Steinbauer, Gjergji Kasneci
Abstract summary: We propose using proximal policy optimization (PPO) to apply Generative Adversarial Networks (GANs), guiding LLMs to enhance the probability distribution of tabular features. PPO leads to an approximately 4% improvement in the accuracy of models trained on synthetically generated data over the state of the art across three real-world datasets.
Score: 15.969452637480167
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: A multitude of industries depend on accurate and reasonable tabular data augmentation for their business processes. Contemporary methodologies in generating tabular data revolve around utilizing Generative Adversarial Networks (GAN) or fine-tuning Large Language Models (LLM). However, GAN-based approaches are documented to produce samples with common-sense errors attributed to the absence of external knowledge. On the other hand, LLM-based methods exhibit a limited capacity to capture the disparities between synthesized and actual data distribution due to the absence of feedback from a discriminator during training. Furthermore, the decoding of LLM-based generation introduces gradient breakpoints, impeding the backpropagation of loss from a discriminator, thereby complicating the integration of these two approaches. To solve this challenge, we propose using proximal policy optimization (PPO) to apply GANs, guiding LLMs to enhance the probability distribution of tabular features. This approach enables the utilization of LLMs as generators for GANs in synthesizing tabular data. Our experiments demonstrate that PPO leads to an approximately 4% improvement in the accuracy of models trained on synthetically generated data over state-of-the-art across three real-world datasets.

Related papers

In-Context Bias Propagation in LLM-Based Tabular Data Generation [2.182762698614784]
We show that even mild in-context biases lead to global statistical distortions. We introduce an adversarial scenario where a malicious contributor can inject bias into the synthetic dataset. Our findings demonstrate a new vulnerability associated with LLM-based data generation pipelines.
arXiv Detail & Related papers (2025-06-11T11:39:29Z)
A Note on Statistically Accurate Tabular Data Generation Using Large Language Models [0.0]
This work introduces a probability-driven prompting approach that leverages large language models to estimate conditional distributions. Results highlight the potential of prompting probability distributions to enhance the statistical fidelity of large language model-generated data.
arXiv Detail & Related papers (2025-05-05T14:05:15Z)
Leveraging Robust Optimization for LLM Alignment under Distribution Shifts [54.654823811482665]
Large language models (LLMs) increasingly rely on preference alignment methods to steer outputs toward human values. Recent approaches have turned to synthetic data generated by LLMs as a scalable alternative. We propose a novel distribution-aware optimization framework that improves preference alignment in the presence of such shifts.
arXiv Detail & Related papers (2025-04-08T09:14:38Z)
Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data [36.277423093218275]
We study the role of data diversity in enhancing the overall abilities of large language models (LLMs). We propose a new method that gives the LLM a dual identity: an output model to cognitively probe and select data based on diversity reward, as well as an input model to be tuned with the selected data.
arXiv Detail & Related papers (2025-02-05T17:21:01Z)
SampleLLM: Optimizing Tabular Data Synthesis in Recommendations [46.689486044254544]
Tabular data synthesis is crucial in machine learning, yet existing general methods are highly data-dependent and often fall short in recommender systems. This limitation arises from their difficulty in capturing complex distributions and understanding feature relationships from sparse and limited data. We propose a novel two-stage framework named SampleLLM to improve the quality of LLM-based data synthesis for recommendation tasks.
arXiv Detail & Related papers (2025-01-27T15:12:27Z)
Large Language Models for Market Research: A Data-augmentation Approach [3.3199591445531453]
Large Language Models (LLMs) have transformed artificial intelligence by excelling in complex natural language processing tasks. Recent studies highlight a significant gap between LLM-generated and human data, with biases introduced when substituting between the two. We propose a novel statistical data augmentation approach that efficiently integrates LLM-generated data with real data in conjoint analysis.
arXiv Detail & Related papers (2024-12-26T22:06:29Z)
Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification [7.357494019212501]
We propose efficient weighted-loss approaches to align synthetic data with real-world distribution. We empirically assessed the effectiveness of our method on multiple text classification tasks.
arXiv Detail & Related papers (2024-10-28T20:53:49Z)
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data. We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
Entropy Law: The Story Behind Data Compression and LLM Performance [115.70395740286422]
We find that model performance is negatively correlated to the compression ratio of training data, which usually yields a lower training loss. Based on the findings of the entropy law, we propose a quite efficient and universal data selection method. We also present an interesting application of entropy law that can detect potential performance risks at the beginning of model training.
arXiv Detail & Related papers (2024-07-09T08:14:29Z)
Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models [89.88010750772413]
Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs). Our work delves into these specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws. Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training.
arXiv Detail & Related papers (2024-06-18T08:38:59Z)
CLAIM Your Data: Enhancing Imputation Accuracy with Contextual Large Language Models [0.18416014644193068]
This paper introduces the Contextual Language model for Accurate Imputation Method (CLAIM). Unlike traditional imputation methods, CLAIM utilizes contextually relevant natural language descriptors to fill missing values. Our evaluations across diverse datasets and missingness patterns reveal CLAIM's superior performance over existing imputation techniques.
arXiv Detail & Related papers (2024-05-28T00:08:29Z)
Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs [0.0]
We introduce refined Direct Preference Optimization (rDPO), a method for improving the behavioral alignment of Large Language Models (LLMs) without the need for human-annotated data. The method involves creating synthetic data using self-critique prompting by a teacher LLM and then utilising a generalized DPO loss function to distil to a student LLM. The loss function incorporates an additional external reward model to improve the quality of synthetic data, making rDPO robust to potential noise in the synthetic dataset.
arXiv Detail & Related papers (2024-02-12T19:10:13Z)
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes [57.62036621319563]
We introduce CLLM, which leverages the prior knowledge of Large Language Models (LLMs) for data augmentation in the low-data regime. We demonstrate the superior performance of CLLM in the low-data regime compared to conventional generators.
arXiv Detail & Related papers (2023-12-19T12:34:46Z)
Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions. We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training. As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
Mixture of Soft Prompts for Controllable Data Generation [21.84489422361048]
Mixture of Soft Prompts (MSP) is proposed as a tool for data augmentation rather than direct prediction. Our method achieves state-of-the-art results on three benchmarks when compared against strong baselines.
arXiv Detail & Related papers (2023-03-02T21:13:56Z)
DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior without access to the control signals generated by the demonstrator. Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms. This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk. We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.