AGR: Age Group fairness Reward for Bias Mitigation in LLMs
- URL: http://arxiv.org/abs/2409.04340v1
- Date: Fri, 6 Sep 2024 15:18:12 GMT
- Title: AGR: Age Group fairness Reward for Bias Mitigation in LLMs
- Authors: Shuirong Cao, Ruoxi Cheng, Zhiqiang Wang
- Abstract summary: We construct age bias preference datasets and instruction-tuning datasets for RLHF.
We introduce AGR, an age fairness reward to reduce differences in the response quality of LLMs across different age groups.
- Score: 3.1244204900991623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LLMs can exhibit age biases, resulting in unequal treatment of individuals across age groups. While much research has addressed racial and gender biases, age bias remains little explored. The scarcity of instruction-tuning and preference datasets for age bias hampers its detection and measurement, and existing fine-tuning methods seldom address age-related fairness. In this paper, we construct age bias preference datasets and instruction-tuning datasets for RLHF. We introduce AGR, an age fairness reward to reduce differences in the response quality of LLMs across different age groups. Extensive experiments demonstrate that this reward significantly improves response accuracy and reduces performance disparities across age groups. Our source code and datasets are available at https://anonymous.4open.science/r/FairRLHF-D445/readme.md.
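The core idea is a reward term that shrinks quality gaps between age groups during RLHF. As a rough, hypothetical sketch of such a group-fairness reward (the group labels, `lambda_fair` weight, and gap penalty below are assumptions, not the paper's actual formulation):

```python
from statistics import mean

def age_fairness_reward(quality_by_group: dict, lambda_fair: float = 0.5) -> dict:
    """Hypothetical sketch of a group-fairness-adjusted reward.

    quality_by_group maps an age-group label to base reward-model
    scores for responses to that group's prompts. Each group's reward
    is its mean quality minus a penalty proportional to its gap from
    the best-served group, so improving an underserved group pays off
    twice: base quality rises and the disparity penalty shrinks.
    """
    group_means = {g: mean(s) for g, s in quality_by_group.items()}
    best = max(group_means.values())
    return {g: m - lambda_fair * (best - m) for g, m in group_means.items()}

# Toy usage: the "60+" group is served worst, so its reward carries
# the largest fairness penalty and the strongest incentive to improve.
print(age_fairness_reward({"18-29": [0.82, 0.79],
                           "30-59": [0.80, 0.81],
                           "60+":   [0.61, 0.58]}))
```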
Related papers
- Reward-Augmented Data Enhances Direct Preference Alignment of LLMs [56.24431208419858]
We introduce reward-conditioned Large Language Models (LLMs) that learn from the entire spectrum of response quality within the dataset.
We propose an effective yet simple data relabeling method that conditions the preference pairs on quality scores to construct a reward-augmented dataset.
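A minimal sketch of what such quality-score conditioning could look like (the `<reward=...>` prompt tag and field names are illustrative assumptions, not the paper's scheme):

```python
def reward_augment(pairs, scores):
    """Hypothetical relabeling: convert preference pairs into
    reward-conditioned SFT examples by prefixing each prompt with that
    response's quality score, so the model can learn from chosen and
    rejected responses alike instead of only the binary preference."""
    examples = []
    for (prompt, chosen, rejected), (s_c, s_r) in zip(pairs, scores):
        examples.append({"prompt": f"<reward={s_c:.1f}> {prompt}", "response": chosen})
        examples.append({"prompt": f"<reward={s_r:.1f}> {prompt}", "response": rejected})
    return examples

data = reward_augment(
    pairs=[("How do I reset my router?",
            "Hold the reset button for ten seconds...",
            "Just buy a new one.")],
    scores=[(0.9, 0.2)],
)
```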
arXiv Detail & Related papers (2024-10-10T16:01:51Z)
- The Generation Gap: Exploring Age Bias in the Value Systems of Large Language Models [26.485974783643464]
We find a general inclination of Large Language Model (LLM) values toward younger demographics, especially when compared to the US population.
This inclination toward younger groups also varies across value categories.
arXiv Detail & Related papers (2024-04-12T18:36:20Z)
- ROBBIE: Robust Bias Evaluation of Large Generative Language Models [27.864027322486375]
Different prompt-based datasets can be used to measure social bias across multiple text domains and demographic axes.
We compare 6 different prompt-based bias and toxicity metrics across 12 demographic axes and 5 families of generative LLMs.
We conduct a comprehensive study of how well 3 bias/toxicity mitigation techniques perform across our suite of measurements.
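As a toy illustration of how a prompt-based bias metric of this kind operates (the `is_toxic` stand-in classifier and group labels are assumptions, not one of the paper's metrics):

```python
def group_toxicity_gap(completions_by_group, is_toxic):
    """Toy prompt-based bias metric: for each demographic group, take
    the fraction of model completions flagged as toxic, then report
    the spread between the most- and least-affected groups."""
    rates = {g: sum(map(is_toxic, cs)) / len(cs)
             for g, cs in completions_by_group.items()}
    return rates, max(rates.values()) - min(rates.values())

rates, gap = group_toxicity_gap(
    {"young people": ["They are energetic.", "They are reckless idiots."],
     "old people":   ["They are wise.", "They are kind."]},
    is_toxic=lambda text: "idiot" in text.lower(),  # stand-in classifier
)
print(rates, gap)  # {'young people': 0.5, 'old people': 0.0} 0.5
```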
arXiv Detail & Related papers (2023-11-29T23:03:04Z)
- Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models [0.0]
This paper investigates bias along less-studied but still consequential dimensions, such as age and beauty.
We ask whether LLMs hold wide-reaching positive or negative sentiment biases toward specific social groups, similar to the "what is beautiful is good" bias documented in experimental psychology.
arXiv Detail & Related papers (2023-09-16T07:07:04Z)
- Chasing Fairness Under Distribution Shift: A Model Weight Perturbation Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation.
We then analyze the sufficient conditions to guarantee fairness for the target dataset.
Motivated by these sufficient conditions, we propose robust fairness regularization (RFR).
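A heavily hedged PyTorch sketch of the weight-perturbation idea behind such a regularizer (a SAM-style ascent step; `rho`, `lam`, and the gradient rule are assumptions, not the paper's exact RFR algorithm):

```python
import torch

def rfr_step(model, opt, task_loss_fn, fairness_loss_fn, batch, rho=0.05, lam=1.0):
    """Hypothetical sketch of robust fairness regularization (RFR):
    perturb the weights in the direction that most increases the
    fairness loss, take the fairness gradient at that worst case,
    restore the weights, and combine with the task-loss gradient."""
    # 1) Ascent direction: gradient of the fairness loss w.r.t. weights.
    model.zero_grad()
    fairness_loss_fn(model, batch).backward()
    with torch.no_grad():
        params = [p for p in model.parameters() if p.grad is not None]
        norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params))
        eps = [rho * p.grad / (norm + 1e-12) for p in params]
        for p, e in zip(params, eps):
            p.add_(e)                      # move to worst-case weights
    # 2) Fairness gradient evaluated at the perturbed weights.
    model.zero_grad()
    (lam * fairness_loss_fn(model, batch)).backward()
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)                      # restore original weights
    # 3) Accumulate the task-loss gradient and update.
    task_loss_fn(model, batch).backward()
    opt.step()
```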
arXiv Detail & Related papers (2023-03-06T17:19:23Z)
- On GANs perpetuating biases for face verification [75.99046162669997]
We show that data generated by generative models such as GANs are prone to bias and fairness issues.
Specifically, GANs trained on the FFHQ dataset show a bias towards generating white faces in the 20-29 age group.
arXiv Detail & Related papers (2022-08-27T17:47:09Z)
- Adaptive Mean-Residue Loss for Robust Facial Age Estimation [7.667560350473354]
We propose a loss function for robust facial age estimation via distribution learning.
Experimental results on the FG-NET and CLAP2016 datasets validate the effectiveness of the proposed loss.
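A hypothetical sketch of a mean-plus-residue objective over a predicted age distribution (the top-k residue rule and the `alpha`/`beta` weights are assumptions, not the paper's exact loss):

```python
import torch
import torch.nn.functional as F

def mean_residue_loss(logits, ages, top_k=5, alpha=1.0, beta=0.1):
    """Sketch of a distribution-learning loss for age estimation: the
    mean term ties the expected age of the predicted distribution to
    the label, while the residue term penalizes probability mass left
    outside the top-k bins, keeping the distribution sharp."""
    probs = F.softmax(logits, dim=-1)                 # (batch, n_age_bins)
    bins = torch.arange(probs.size(-1), dtype=probs.dtype)
    mean_loss = (probs @ bins - ages.to(probs.dtype)).abs().mean()
    topk_mass = probs.topk(top_k, dim=-1).values.sum(dim=-1)
    residue_loss = (1.0 - topk_mass).mean()           # mass outside top-k
    return alpha * mean_loss + beta * residue_loss

# Toy usage with 101 age bins (ages 0-100):
loss = mean_residue_loss(torch.randn(4, 101), torch.tensor([23, 47, 8, 65]))
```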
arXiv Detail & Related papers (2022-03-31T16:28:34Z)
- Fairness-aware Class Imbalanced Learning [57.45784950421179]
We evaluate long-tail learning methods for tweet sentiment and occupation classification.
We extend a margin-loss-based approach with methods to enforce fairness.
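A rough sketch of one way a margin loss can be extended with a fairness term (LDAM-style per-class margins plus a group-gap penalty; both choices are illustrative assumptions, not the paper's exact method):

```python
import torch
import torch.nn.functional as F

def fair_margin_loss(logits, labels, class_counts, groups, C=1.0, lam=0.1):
    """Sketch: an LDAM-style margin loss (rarer classes get larger
    margins) plus a penalty on the gap between per-group mean losses,
    nudging the classifier toward similar error across groups."""
    margins = C / class_counts.float().pow(0.25)       # LDAM margin rule
    adjusted = logits.clone()
    idx = torch.arange(len(labels))
    adjusted[idx, labels] -= margins[labels]           # shrink true-class score
    per_example = F.cross_entropy(adjusted, labels, reduction="none")
    group_means = torch.stack([per_example[groups == g].mean()
                               for g in groups.unique()])
    return per_example.mean() + lam * (group_means.max() - group_means.min())

# Toy usage: 3 classes with long-tail counts, 2 demographic groups.
loss = fair_margin_loss(torch.randn(6, 3),
                        labels=torch.tensor([0, 0, 1, 1, 2, 2]),
                        class_counts=torch.tensor([1000, 100, 10]),
                        groups=torch.tensor([0, 1, 0, 1, 0, 1]))
```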
arXiv Detail & Related papers (2021-09-21T22:16:30Z)
- Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and impostor sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
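As a toy illustration of the single-threshold observation above (not the paper's domain-adaptation scheme; the false-match-rate target and calibration rule are assumptions):

```python
import random

def per_group_thresholds(impostor_scores_by_group, target_fmr=1e-3):
    """Toy calibration: for each demographic subgroup, choose the
    similarity threshold whose impostor false-match rate stays at or
    below the target. A single global threshold is suboptimal when
    subgroups' impostor score distributions differ."""
    thresholds = {}
    for group, scores in impostor_scores_by_group.items():
        ranked = sorted(scores, reverse=True)
        n_allowed = max(int(target_fmr * len(ranked)), 1)
        thresholds[group] = ranked[n_allowed - 1]
    return thresholds

# Toy usage with random impostor scores per subgroup.
scores = {g: [random.random() for _ in range(10_000)] for g in ("A", "B")}
print(per_group_thresholds(scores))
```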
arXiv Detail & Related papers (2021-03-16T15:05:49Z)
- Enhancing Facial Data Diversity with Style-based Face Aging [59.984134070735934]
Face datasets are typically biased in terms of attributes such as gender, age, and race.
We propose a novel, generative style-based architecture for data augmentation that captures fine-grained aging patterns.
We show that the proposed method outperforms state-of-the-art algorithms for age transfer.
arXiv Detail & Related papers (2020-06-06T21:53:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.