Understanding and Mitigating the Bias Inheritance in LLM-based Data Augmentation on Downstream Tasks
- URL: http://arxiv.org/abs/2502.04419v2
- Date: Mon, 10 Feb 2025 16:34:03 GMT
- Title: Understanding and Mitigating the Bias Inheritance in LLM-based Data Augmentation on Downstream Tasks
- Authors: Miaomiao Li, Hao Chen, Yang Wang, Tingyuan Zhu, Weijia Zhang, Kaijie Zhu, Kam-Fai Wong, Jindong Wang
- Abstract summary: This work presents the first systematic investigation into understanding, analyzing, and mitigating bias inheritance.
We analyze how 6 different types of biases manifest at varying bias ratios.
We propose three mitigation strategies: token-based, mask-based, and loss-based approaches.
- Score: 24.706895491806794
- License:
- Abstract: Generating synthetic datasets via large language models (LLMs) themselves has emerged as a promising approach to improving LLM performance. However, LLMs inherently reflect biases present in their training data, leading to a critical challenge: when these models generate synthetic data for training, they may propagate and amplify their inherent biases, which can significantly affect model fairness and robustness on downstream tasks--a phenomenon we term bias inheritance. This work presents the first systematic investigation into understanding, analyzing, and mitigating bias inheritance. We study this problem by fine-tuning LLMs on a combined dataset of original and LLM-augmented data, where the bias ratio denotes the proportion of augmented data. Through systematic experiments across 10 classification and generation tasks, we analyze how 6 different types of biases manifest at varying bias ratios. Our results reveal that bias inheritance has nuanced effects on downstream tasks, affecting classification and generation tasks differently. Our analysis then identifies three key misalignment factors: misalignment of values, group data, and data distributions. Based on these insights, we propose three mitigation strategies: token-based, mask-based, and loss-based approaches. Experiments demonstrate that these strategies also behave differently across tasks and bias types, indicating the substantial challenge of fully mitigating bias inheritance. We hope this work provides valuable insights for research on LLM-based data augmentation.
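To make the experimental setup concrete, below is a minimal Python sketch (not the authors' released code): it mixes original and LLM-augmented examples at a given bias ratio and illustrates one plausible loss-based mitigation that down-weights augmented examples during fine-tuning. The names `mix_datasets`, `weighted_loss`, and `aug_weight`, as well as the use of PyTorch, are illustrative assumptions; the paper's actual token-, mask-, and loss-based strategies may be formulated differently.

```python
# Hedged sketch, not the authors' implementation.
# Illustrates (1) building a training mix where `bias_ratio` of the examples
# come from LLM augmentation, and (2) a plausible loss-based mitigation that
# down-weights augmented examples. All identifiers here are assumptions.
import random
from dataclasses import dataclass

import torch
import torch.nn.functional as F


@dataclass
class Example:
    text: str
    label: int
    is_augmented: bool  # True if produced by the LLM, False if original


def mix_datasets(original, augmented, bias_ratio, seed=0):
    """Return a fixed-size training set in which `bias_ratio` of examples are augmented."""
    assert 0.0 <= bias_ratio <= 1.0
    rng = random.Random(seed)
    n_total = len(original)                   # keep the training-set size fixed
    n_aug = int(round(bias_ratio * n_total))
    assert n_aug <= len(augmented), "not enough augmented examples for this ratio"
    mixed = rng.sample(original, n_total - n_aug) + rng.sample(augmented, n_aug)
    rng.shuffle(mixed)
    return mixed


def weighted_loss(logits, labels, is_augmented, aug_weight=0.5):
    """Loss-based mitigation sketch: down-weight the per-example loss on augmented data.

    logits: (batch, num_classes); labels: (batch,); is_augmented: bool tensor (batch,).
    """
    per_example = F.cross_entropy(logits, labels, reduction="none")
    weights = torch.where(is_augmented,
                          torch.full_like(per_example, aug_weight),
                          torch.ones_like(per_example))
    return (weights * per_example).sum() / weights.sum()


if __name__ == "__main__":
    original = [Example(f"orig {i}", i % 2, False) for i in range(100)]
    augmented = [Example(f"aug {i}", i % 2, True) for i in range(100)]
    mixed = mix_datasets(original, augmented, bias_ratio=0.3)
    print(sum(e.is_augmented for e in mixed), "of", len(mixed), "examples are augmented")

    logits = torch.randn(8, 2)
    labels = torch.randint(0, 2, (8,))
    aug_mask = torch.tensor([e.is_augmented for e in mixed[:8]])
    print("weighted loss:", weighted_loss(logits, labels, aug_mask).item())
```

Varying `bias_ratio` in this sketch mirrors the paper's experimental axis, and `aug_weight` stands in for whatever reweighting scheme a loss-based mitigation might use.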
Related papers
- Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data [36.277423093218275]
We study the role of data diversity in enhancing the overall abilities of large language models (LLMs)
We propose a new method that gives the LLM a dual identity: an output model to cognitively probe and select data based on diversity reward, as well as an input model to be tuned with the selected data.
arXiv Detail & Related papers (2025-02-05T17:21:01Z) - Preference Leakage: A Contamination Problem in LLM-as-a-judge [69.96778498636071]
Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods.
In this work, we expose preference leakage, a contamination problem in LLM-as-a-judge caused by the relatedness between the synthetic data generators and LLM-based evaluators.
arXiv Detail & Related papers (2025-02-03T17:13:03Z) - How far can bias go? -- Tracing bias from pretraining data to alignment [54.51310112013655]
This study examines the correlation between gender-occupation bias in pre-training data and its manifestation in LLMs.
Our findings reveal that biases present in pre-training data are amplified in model outputs.
arXiv Detail & Related papers (2024-11-28T16:20:25Z) - Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z) - Towards Implicit Bias Detection and Mitigation in Multi-Agent LLM Interactions [25.809599403713506]
Large Language Models (LLMs) are increasingly being employed in numerous studies to simulate societies and execute diverse social tasks.
LLMs are susceptible to societal biases due to their exposure to human-generated data.
This study investigates the presence of implicit gender biases in multi-agent LLM interactions and proposes two strategies to mitigate these biases.
arXiv Detail & Related papers (2024-10-03T15:28:05Z) - Most Influential Subset Selection: Challenges, Promises, and Beyond [9.479235005673683]
We study the Most Influential Subset Selection (MISS) problem, which aims to identify a subset of training samples with the greatest collective influence.
We conduct a comprehensive analysis of the prevailing approaches in MISS, elucidating their strengths and weaknesses.
We demonstrate that an adaptive version of these approaches, which applies them iteratively, can effectively capture the interactions among samples.
arXiv Detail & Related papers (2024-09-25T20:00:23Z) - Unboxing Occupational Bias: Grounded Debiasing of LLMs with U.S. Labor Data [9.90951705988724]
Large Language Models (LLMs) are prone to inheriting and amplifying societal biases.
LLM bias can have far-reaching consequences, leading to unfair practices and exacerbating social inequalities.
arXiv Detail & Related papers (2024-08-20T23:54:26Z) - Social Debiasing for Fair Multi-modal LLMs [55.8071045346024]
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities.
However, these models often inherit severe social biases from their training datasets, leading to unfair predictions based on attributes like race and gender.
This paper addresses the issue of social biases in MLLMs by i) introducing a comprehensive Counterfactual dataset with Multiple Social Concepts (CMSC) and ii) proposing an Anti-Stereotype Debiasing (ASD) strategy.
arXiv Detail & Related papers (2024-08-13T02:08:32Z) - CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models [58.57987316300529]
Large Language Models (LLMs) are increasingly deployed to handle various natural language processing (NLP) tasks.
To evaluate the biases exhibited by LLMs, researchers have recently proposed a variety of datasets.
We propose CEB, a Compositional Evaluation Benchmark that covers different types of bias across different social groups and tasks.
arXiv Detail & Related papers (2024-07-02T16:31:37Z) - Interpreting Bias in Large Language Models: A Feature-Based Approach [0.0]
Large Language Models (LLMs) have showcased remarkable performance across various natural language processing (NLP) tasks.
This paper investigates the propagation of biases within LLMs through a novel feature-based analytical approach.
arXiv Detail & Related papers (2024-06-18T07:28:15Z) - Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing.
Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the prior of the underlying Large Language Model (LLM) rather than by the input image.
To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)