Robust Bias Detection in MLMs and its Application to Human Trait Ratings
- URL: http://arxiv.org/abs/2502.15600v1
- Date: Fri, 21 Feb 2025 17:18:02 GMT
- Title: Robust Bias Detection in MLMs and its Application to Human Trait Ratings
- Authors: Ingroj Shrestha, Louis Tay, Padmini Srinivasan
- Abstract summary: We propose a systematic statistical approach to quantify bias using mixed models. We explore the novel problem of gender bias in the context of $\textit{personality}$ and $\textit{character}$ traits. We find that ALBERT is unbiased for binary gender but the most biased for non-binary $\textit{neo}$, while RoBERTa-large is the most biased for binary gender but shows small to no bias for $\textit{neo}$.
- Score: 10.067718208225203
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been significant prior work using templates to study bias against demographic attributes in MLMs. However, these approaches have limitations: they overlook the random variability of the templates and target concepts analyzed, assume equality amongst templates, and overlook bias quantification. Addressing these, we propose a systematic statistical approach to assess bias in MLMs, using mixed models to account for random effects, pseudo-perplexity weights for sentences derived from templates, and statistical effect sizes to quantify bias. Replicating prior studies, we match their bias scores in magnitude and direction, with small to medium effect sizes. Next, we explore the novel problem of gender bias in the context of $\textit{personality}$ and $\textit{character}$ traits, across seven MLMs (base and large). We find that MLMs vary; ALBERT is unbiased for binary gender but the most biased for non-binary $\textit{neo}$, while RoBERTa-large is the most biased for binary gender but shows small to no bias for $\textit{neo}$. There is some alignment of MLM bias and findings in psychology (the human perspective): in $\textit{agreeableness}$ with RoBERTa-large and $\textit{emotional stability}$ with BERT-large. There is general agreement for the remaining three personality dimensions: both sides observe at most small differences across gender. For character traits, human studies on gender bias are limited, so comparisons are not feasible.
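For concreteness, below is a minimal, hypothetical sketch (not the authors' released code) of the ingredients the abstract names: pseudo-perplexity scoring of template-derived sentences with an MLM, a mixed model that treats templates as a random effect, and a standardized effect size for the gender contrast. The model name, toy templates, and the use of pseudo-perplexity directly as the response (the paper uses it to weight sentences) are simplifying assumptions.

```python
# A minimal sketch of the pipeline described above; not the authors' code.
# Assumptions: bert-base-uncased as the MLM, toy trait templates, and
# pseudo-perplexity used directly as the response; a real study would use
# many templates per trait and the paper's weighting scheme.
import math
import torch
import pandas as pd
import statsmodels.formula.api as smf
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pseudo_perplexity(sentence: str) -> float:
    """Mask each token in turn and score it with the MLM (pseudo-perplexity)."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total_nll = 0.0
    for i in range(1, len(ids) - 1):                      # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = mlm(masked.unsqueeze(0)).logits[0, i]
        total_nll -= torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return math.exp(total_nll / (len(ids) - 2))

# Toy trait templates instantiated for two gender targets.
templates = ["{} is a very kind person.", "{} is a very organized person.",
             "{} is a very anxious person.", "{} is a very outgoing person.",
             "{} is a very honest person.", "{} is a very stubborn person."]
targets = {"female": "She", "male": "He"}
rows = [{"template": t, "gender": g, "score": pseudo_perplexity(tpl.format(w))}
        for t, tpl in enumerate(templates) for g, w in targets.items()]
df = pd.DataFrame(rows)

# Mixed model: gender as a fixed effect, template as a random intercept.
fit = smf.mixedlm("score ~ gender", df, groups=df["template"]).fit()
print(fit.summary())

# A standardized effect size (Cohen's d) for the gender contrast.
f, m = df[df.gender == "female"].score, df[df.gender == "male"].score
pooled_sd = math.sqrt((f.var(ddof=1) + m.var(ddof=1)) / 2)
print("Cohen's d:", (f.mean() - m.mean()) / pooled_sd)
```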
Related papers
- Surface Fairness, Deep Bias: A Comparative Study of Bias in Language Models [49.41113560646115]
We investigate various proxy measures of bias in large language models (LLMs). We find that evaluating models with pre-prompted personae on a multi-subject benchmark (MMLU) leads to negligible and mostly random differences in scores. With the recent trend for LLM assistant memory and personalization, these problems open up from a different angle.
arXiv Detail & Related papers (2025-06-12T08:47:40Z) - On Fairness of Unified Multimodal Large Language Model for Image Generation [19.122441856516215]
We benchmark the latest U-MLLMs and find that most exhibit significant demographic biases, such as gender and race bias.
Our analysis shows that bias originates primarily from the language model.
We propose a novel balanced preference model to balance the demographic distribution with synthetic data.
arXiv Detail & Related papers (2025-02-05T18:21:03Z) - How far can bias go? -- Tracing bias from pretraining data to alignment [54.51310112013655]
This study examines the correlation between gender-occupation bias in pre-training data and its manifestation in LLMs. Our findings reveal that biases present in pre-training data are amplified in model outputs.
arXiv Detail & Related papers (2024-11-28T16:20:25Z) - Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models [15.53216696218776]
We explore the internal mechanisms of how bias emerges in large language models (LLMs) when provided with ambiguous comparative prompts.
We propose $\texttt{ATLAS}$, a technique to localize bias to specific layers of the LLM by analyzing attention scores and then reduce bias by scaling attention in these biased layers.
arXiv Detail & Related papers (2024-10-29T20:15:56Z) - With a Grain of SALT: Are LLMs Fair Across Social Dimensions? [3.5001789247699535]
This paper presents a systematic analysis of biases in open-source Large Language Models (LLMs) across gender, religion, and race. We use the SALT dataset, which incorporates five distinct bias triggers: General Debate, Positioned Debate, Career Advice, Problem Solving, and CV Generation. Our findings reveal consistent polarization across models, with certain demographic groups receiving systematically favorable or unfavorable treatment.
arXiv Detail & Related papers (2024-10-16T12:22:47Z) - GenderBias-\emph{VL}: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-\emph{VL} benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z) - VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model [72.13121434085116]
We introduce VLBiasBench, a benchmark to evaluate biases in Large Vision-Language Models (LVLMs). VLBiasBench features a dataset that covers nine distinct categories of social biases, including age, disability status, gender, nationality, physical appearance, race, religion, profession, and socioeconomic status, as well as two intersectional bias categories: race x gender and race x socioeconomic status. We conduct extensive evaluations on 15 open-source models as well as two advanced closed-source models, yielding new insights into the biases present in these models.
arXiv Detail & Related papers (2024-06-20T10:56:59Z) - JobFair: A Framework for Benchmarking Gender Hiring Bias in Large Language Models [12.12628747941818]
This paper presents a novel framework for benchmarking hierarchical gender hiring bias in Large Language Models (LLMs) for resume scoring.
We introduce a new construct grounded in labour economics, legal principles, and critiques of current bias benchmarks.
We analyze gender hiring biases in ten state-of-the-art LLMs.
arXiv Detail & Related papers (2024-06-17T09:15:57Z) - Disclosure and Mitigation of Gender Bias in LLMs [64.79319733514266]
Large Language Models (LLMs) can generate biased responses.
We propose an indirect probing framework based on conditional generation.
We explore three distinct strategies to disclose explicit and implicit gender bias in LLMs.
arXiv Detail & Related papers (2024-02-17T04:48:55Z) - Quantifying Bias in Text-to-Image Generative Models [49.60774626839712]
Bias in text-to-image (T2I) models can propagate unfair social representations and may be used to aggressively market ideas or push controversial agendas.
Existing T2I model bias evaluation methods only focus on social biases.
We propose an evaluation methodology to quantify general biases in T2I generative models, without any preconceived notions.
arXiv Detail & Related papers (2023-12-20T14:26:54Z) - Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z) - The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z) - VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z) - Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias [33.99768156365231]
We introduce a causal formulation for bias measurement in generative language models.
We propose a benchmark called OccuGender, with a bias-measuring procedure to investigate occupational gender bias.
The results show that these models exhibit substantial occupational gender bias.
arXiv Detail & Related papers (2022-12-20T22:41:24Z) - Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and imposter sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z) - Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large-scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.