LLM-Guided Synthetic Augmentation (LGSA) for Mitigating Bias in AI Systems
- URL: http://arxiv.org/abs/2510.13202v1
- Date: Wed, 15 Oct 2025 06:42:35 GMT
- Title: LLM-Guided Synthetic Augmentation (LGSA) for Mitigating Bias in AI Systems
- Authors: Sai Suhruth Reddy Karri, Yashwanth Sai Nallapuneni, Laxmi Narasimha Reddy Mallireddy, Gopichand G
- Abstract summary: Underrepresentation of certain groups often leads to uneven performance across demographics. To address these challenges, we propose LLM-Guided Synthetic Augmentation (LGSA). LGSA uses large language models to generate counterfactual examples for underrepresented groups while preserving label integrity.
- Score: 0.24699742392288992
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bias in AI systems, especially those relying on natural language data, raises ethical and practical concerns. Underrepresentation of certain groups often leads to uneven performance across demographics. Traditional fairness methods, such as pre-processing, in-processing, and post-processing, depend on protected-attribute labels, involve accuracy-fairness trade-offs, and may not generalize across datasets. To address these challenges, we propose LLM-Guided Synthetic Augmentation (LGSA), which uses large language models to generate counterfactual examples for underrepresented groups while preserving label integrity. We evaluated LGSA on a controlled dataset of short English sentences with gendered pronouns, professions, and binary classification labels. Structured prompts were used to produce gender-swapped paraphrases, followed by quality control including semantic similarity checks, attribute verification, toxicity screening, and human spot checks. The augmented dataset expanded training coverage and was used to train a classifier under consistent conditions. Results show that LGSA reduces performance disparities without compromising accuracy. The baseline model achieved 96.7 percent accuracy with a 7.2 percent gender bias gap. Simple swap augmentation reduced the gap to 0.7 percent but lowered accuracy to 95.6 percent. LGSA achieved 99.1 percent accuracy with a 1.9 percent bias gap, improving performance on female-labeled examples. These findings demonstrate that LGSA is an effective strategy for bias mitigation, enhancing subgroup balance while maintaining high task accuracy and label fidelity.
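The augmentation loop described in the abstract (structured gender-swap prompting followed by semantic-similarity, attribute-verification, and toxicity checks before a counterfactual is added with its original label) can be sketched roughly as follows. The LLM provider, model name, embedding model, prompt wording, threshold, and helper functions are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of an LGSA-style augmentation loop, under assumed tooling.
from openai import OpenAI
from sentence_transformers import SentenceTransformer, util

client = OpenAI()                                   # assumed LLM provider
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
SIM_THRESHOLD = 0.85                                # assumed similarity cutoff

SWAP_PROMPT = (
    "Rewrite the sentence with every person's gender swapped "
    "(he<->she, him<->her, his<->her). Keep the profession and the meaning "
    "unchanged. Return only the rewritten sentence.\nSentence: {sentence}"
)

MALE, FEMALE = {"he", "him", "his"}, {"she", "her", "hers"}


def generate_counterfactual(sentence: str) -> str:
    """Ask the LLM for a gender-swapped paraphrase via a structured prompt."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[{"role": "user", "content": SWAP_PROMPT.format(sentence=sentence)}],
    )
    return resp.choices[0].message.content.strip()


def attribute_swapped(original: str, candidate: str) -> bool:
    """Attribute verification: the gendered pronouns must actually flip."""
    o, c = set(original.lower().split()), set(candidate.lower().split())
    return bool((o & MALE and c & FEMALE) or (o & FEMALE and c & MALE))


def non_toxic(text: str) -> bool:
    """Stub for the toxicity screen; a real pipeline would call a classifier here."""
    return True


def augment(dataset):
    """dataset: iterable of (sentence, label) pairs.
    Returns the original examples plus counterfactuals that pass all checks."""
    augmented = list(dataset)
    for sentence, label in dataset:
        candidate = generate_counterfactual(sentence)
        # Semantic-similarity check: keep only close paraphrases to protect label integrity.
        sim = util.cos_sim(
            embedder.encode(sentence, convert_to_tensor=True),
            embedder.encode(candidate, convert_to_tensor=True),
        ).item()
        if sim >= SIM_THRESHOLD and attribute_swapped(sentence, candidate) and non_toxic(candidate):
            augmented.append((candidate, label))  # label preserved by construction
    return augmented
```

A classifier trained on the output of `augment` could then be compared against the baseline by reporting overall accuracy and the per-group accuracy difference (the "gender bias gap" quoted in the abstract).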
Related papers
- Fairness Without Labels: Pseudo-Balancing for Bias Mitigation in Face Gender Classification [10.66892435479991]
Face gender classification models often reflect and amplify demographic biases present in their training data. We introduce pseudo-balancing, a simple and effective strategy for mitigating such biases in semi-supervised learning. Our method enforces demographic balance during pseudo-label selection, using only unlabeled images from a race-balanced dataset.
arXiv Detail & Related papers (2025-10-11T12:08:40Z) - Judging with Confidence: Calibrating Autoraters to Preference Distributions [56.17041629492863]
We argue that a reliable autorater must learn to model the full distribution of preferences defined by a target population. We present two learning methods tailored to different data conditions. Our results show that finetuning autoraters with a distribution-matching objective leads to verbalized probability predictions that are better aligned with the target preference distribution.
arXiv Detail & Related papers (2025-09-30T20:36:41Z) - Obscured but Not Erased: Evaluating Nationality Bias in LLMs via Name-Based Bias Benchmarks [0.0]
Large Language Models (LLMs) can exhibit latent biases towards specific nationalities even when explicit demographic markers are not present. We introduce a novel name-based benchmarking approach to investigate the impact of substituting explicit nationality labels with culturally indicative names. Our experiments show that small models are less accurate and exhibit more bias compared to their larger counterparts.
arXiv Detail & Related papers (2025-07-22T19:54:49Z) - Mitigating Subgroup Disparities in Multi-Label Speech Emotion Recognition: A Pseudo-Labeling and Unsupervised Learning Approach [53.824673312331626]
The Implicit Demography Inference (IDI) module uses k-means clustering to mitigate bias in Speech Emotion Recognition (SER). Experiments show that pseudo-labeling IDI reduces subgroup disparities, improving fairness metrics by over 28%. Unsupervised IDI yields more than a 4.6% improvement in fairness metrics with a drop of less than 3.6% in SER performance.
arXiv Detail & Related papers (2025-05-20T14:50:44Z) - The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models [91.86718720024825]
We center transgender, nonbinary, and other gender-diverse identities to investigate how alignment procedures interact with pre-existing gender-diverse bias. Our findings reveal that DPO-aligned models are particularly sensitive to supervised finetuning. We conclude with recommendations tailored to DPO and broader alignment practices.
arXiv Detail & Related papers (2024-11-06T06:50:50Z) - GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases. GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z) - Individual Fairness Through Reweighting and Tuning [0.23395944472515745]
Inherent bias within society can be amplified and perpetuated by artificial intelligence (AI) systems.
Recently, Graph Laplacian Regularizer (GLR) has been used as a substitute for the common Lipschitz condition to enhance individual fairness.
In this work, we investigated whether defining a GLR independently on the train and target data could maintain similar accuracy.
arXiv Detail & Related papers (2024-05-02T20:15:25Z) - FAIRLABEL: Correcting Bias in Labels [2.810160553339817]
We propose FAIRLABEL, an algorithm which detects and corrects biases in labels.
The goal of FAIRLABEL is to reduce the Disparate Impact (DI) across groups while maintaining high accuracy in predictions.
arXiv Detail & Related papers (2023-11-01T16:38:27Z) - Fair-CDA: Continuous and Directional Augmentation for Group Fairness [48.84385689186208]
We propose a fine-grained data augmentation strategy for imposing fairness constraints.
We show that group fairness can be achieved by regularizing the models on transition paths of sensitive features between groups.
Our proposed method does not assume any data generative model and ensures good generalization for both accuracy and fairness.
arXiv Detail & Related papers (2023-04-01T11:23:00Z) - Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and impostor sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z) - Post-Comparison Mitigation of Demographic Bias in Face Recognition Using Fair Score Normalization [15.431761867166]
We propose a novel unsupervised fair score normalization approach to reduce the effect of bias in face recognition.
Our solution reduces demographic bias by up to 82.7% when gender is considered.
In contrast to previous works, our fair normalization approach also enhances overall performance by up to 53.2% at a false match rate of 0.001 and up to 82.9% at a false match rate of 0.00001.
arXiv Detail & Related papers (2020-02-10T08:17:26Z)