ChatGPT Based Data Augmentation for Improved Parameter-Efficient
Debiasing of LLMs
- URL: http://arxiv.org/abs/2402.11764v1
- Date: Mon, 19 Feb 2024 01:28:48 GMT
- Title: ChatGPT Based Data Augmentation for Improved Parameter-Efficient
Debiasing of LLMs
- Authors: Pengrui Han, Rafal Kocielnik, Adhithya Saravanan, Roy Jiang, Or
Sharir, Anima Anandkumar
- Abstract summary: Large Language models (LLMs) exhibit harmful social biases.
This work introduces a novel approach utilizing ChatGPT to generate synthetic training data.
- Score: 69.27030571729392
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language models (LLMs), while powerful, exhibit harmful social biases.
Debiasing is often challenging due to computational costs, data constraints,
and potential degradation of multi-task language capabilities. This work
introduces a novel approach utilizing ChatGPT to generate synthetic training
data, aiming to enhance the debiasing of LLMs. We propose two strategies:
Targeted Prompting, which provides effective debiasing for known biases but
necessitates prior specification of bias in question; and General Prompting,
which, while slightly less effective, offers debiasing across various
categories. We leverage resource-efficient LLM debiasing using adapter tuning
and compare the effectiveness of our synthetic data to existing debiasing
datasets. Our results reveal that: (1) ChatGPT can efficiently produce
high-quality training data for debiasing other LLMs; (2) data produced via our
approach surpasses existing datasets in debiasing performance while also
preserving internal knowledge of a pre-trained LLM; and (3) synthetic data
exhibits generalizability across categories, effectively mitigating various
biases, including intersectional ones. These findings underscore the potential
of synthetic data in advancing the fairness of LLMs with minimal retraining
cost.
Related papers
- BiasDPO: Mitigating Bias in Language Models through Direct Preference Optimization [0.0]
Large Language Models (LLMs) have become pivotal in advancing natural language processing, yet their potential to perpetuate biases poses significant concerns.
This paper introduces a new framework employing Direct Preference Optimization (DPO) to mitigate gender, racial, and religious biases in English text.
By developing a loss function that favors less biased over biased completions, our approach cultivates a preference for respectful and non-discriminatory language.
arXiv Detail & Related papers (2024-07-18T22:32:20Z) - Entropy Law: The Story Behind Data Compression and LLM Performance [115.70395740286422]
We find that model performance is negatively correlated to the compression ratio of training data, which usually yields a lower training loss.
Based on the findings of the entropy law, we propose a quite efficient and universal data selection method.
We also present an interesting application of entropy law that can detect potential performance risks at the beginning of model training.
arXiv Detail & Related papers (2024-07-09T08:14:29Z) - Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models [89.88010750772413]
Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs)
Our work delves into these specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws.
Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training.
arXiv Detail & Related papers (2024-06-18T08:38:59Z) - CLAIM Your Data: Enhancing Imputation Accuracy with Contextual Large Language Models [0.18416014644193068]
This paper introduces the Contextual Language model for Accurate Imputation Method (CLAIM)
Unlike traditional imputation methods, CLAIM utilizes contextually relevant natural language descriptors to fill missing values.
Our evaluations across diverse datasets and missingness patterns reveal CLAIM's superior performance over existing imputation techniques.
arXiv Detail & Related papers (2024-05-28T00:08:29Z) - Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications [9.982616173090264]
We investigate the usage of large language models (LLMs) for data augmentation as a potential solution to the issue of NLP models making wrong predictions with high confidence during classification tasks.
For mitigation, humans or LLMs provide natural language characterizations of high confidence misclassifications to generate synthetic data, which are then used to extend the training set.
We conduct an extensive evaluation of our approach on three classification tasks and demonstrate its effectiveness in reducing the number of high confidence misclassifications.
arXiv Detail & Related papers (2024-03-26T16:49:25Z) - Learning Fair Ranking Policies via Differentiable Optimization of
Ordered Weighted Averages [55.04219793298687]
This paper shows how efficiently-solvable fair ranking models can be integrated into the training loop of Learning to Rank.
In particular, this paper is the first to show how to backpropagate through constrained optimizations of OWA objectives, enabling their use in integrated prediction and decision models.
arXiv Detail & Related papers (2024-02-07T20:53:53Z) - The Gaps between Pre-train and Downstream Settings in Bias Evaluation
and Debiasing [74.7319697510621]
In-Context Learning (ICL) induces smaller changes to PLMs compared to FT-based debiasing methods.
ICL-based debiasing methods show a higher correlation between intrinsic and extrinsic bias scores compared to FT-based methods.
arXiv Detail & Related papers (2024-01-16T17:15:08Z) - Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models [52.98743860365194]
We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN)
At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself.
This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents.
arXiv Detail & Related papers (2024-01-02T18:53:13Z) - Self-Supervised Position Debiasing for Large Language Models [39.261233221850155]
We propose a self-supervised position debiasing (SOD) framework to mitigate position bias for large language models (LLMs)
Experiments on eight datasets and five tasks show that SOD consistently outperforms existing methods in mitigating three types of position biases.
arXiv Detail & Related papers (2024-01-02T14:12:41Z) - Adapting LLMs for Efficient, Personalized Information Retrieval: Methods
and Implications [0.7832189413179361]
Large Language Models (LLMs) excel in comprehending and generating human-like text.
This paper explores strategies for integrating Language Models (LLMs) with Information Retrieval (IR) systems.
arXiv Detail & Related papers (2023-11-21T02:01:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.