Related papers: Testing Occupational Gender Bias in Language Models: Towards Robust Measurement and Zero-Shot Debiasing

Testing Occupational Gender Bias in Language Models: Towards Robust Measurement and Zero-Shot Debiasing

URL: http://arxiv.org/abs/2212.10678v2
Date: Mon, 15 Jul 2024 15:10:45 GMT
Title: Testing Occupational Gender Bias in Language Models: Towards Robust Measurement and Zero-Shot Debiasing
Authors: Yuen Chen, Vethavikashini Chithrra Raghuram, Justus Mattern, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf, Zhijing Jin,
Abstract summary: Large language models (LLMs) have been shown to exhibit a variety of harmful, human-like biases against various demographics. We introduce a list of desiderata for robustly measuring biases in generative language models. We then use this benchmark to test several state-of-the-art open-source LLMs, including Llama, Mistral, and their instruction-tuned versions.
Score: 98.07536837448293
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Generated texts from large language models (LLMs) have been shown to exhibit a variety of harmful, human-like biases against various demographics. These findings motivate research efforts aiming to understand and measure such effects. Prior works have proposed benchmarks for identifying and techniques for mitigating these stereotypical associations. However, as recent research pointed out, existing benchmarks lack a robust experimental setup, hindering the inference of meaningful conclusions from their evaluation metrics. In this paper, we introduce a list of desiderata for robustly measuring biases in generative language models. Building upon these design principles, we propose a benchmark called OCCUGENDER, with a bias-measuring procedure to investigate occupational gender bias. We then use this benchmark to test several state-of-the-art open-source LLMs, including Llama, Mistral, and their instruction-tuned versions. The results show that these models exhibit substantial occupational gender bias. We further propose prompting techniques to mitigate these biases without requiring fine-tuning. Finally, we validate the effectiveness of our methods through experiments on the same set of models.

Related papers

Surface Fairness, Deep Bias: A Comparative Study of Bias in Language Models [49.41113560646115]
We investigate various proxy measures of bias in large language models (LLMs)<n>We find that evaluating models with pre-prompted personae on a multi-subject benchmark (MMLU) leads to negligible and mostly random differences in scores.<n>With the recent trend for LLM assistant memory and personalization, these problems open up from a different angle.
arXiv Detail & Related papers (2025-06-12T08:47:40Z)
How far can bias go? -- Tracing bias from pretraining data to alignment [54.51310112013655]
This study examines the correlation between gender-occupation bias in pre-training data and their manifestation in LLMs. Our findings reveal that biases present in pre-training data are amplified in model outputs.
arXiv Detail & Related papers (2024-11-28T16:20:25Z)
GenderBias-\emph{VL}: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-emphVL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models. Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs. Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z)
VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model [72.13121434085116]
VLBiasBench is a benchmark aimed at evaluating biases in Large Vision-Language Models (LVLMs) We construct a dataset encompassing nine distinct categories of social biases, including age, disability status, gender, nationality, physical appearance, race, religion, profession, social economic status and two intersectional bias categories (race x gender, and race x social economic status) We conduct extensive evaluations on 15 open-source models as well as one advanced closed-source model, providing some new insights into the biases revealing from these models.
arXiv Detail & Related papers (2024-06-20T10:56:59Z)
JobFair: A Framework for Benchmarking Gender Hiring Bias in Large Language Models [12.12628747941818]
This paper presents a novel framework for benchmarking hierarchical gender hiring bias in Large Language Models (LLMs) for resume scoring. We introduce a new construct grounded in labour economics, legal principles, and critiques of current bias benchmarks. We analyze gender hiring biases in ten state-of-the-art LLMs.
arXiv Detail & Related papers (2024-06-17T09:15:57Z)
Locating and Mitigating Gender Bias in Large Language Models [40.78150878350479]
Large language models (LLM) are pre-trained on extensive corpora to learn facts and human cognition which contain human preferences. This process can inadvertently lead to these models acquiring biases and prevalent stereotypes in society. We propose the LSDM (Least Square Debias Method), a knowledge-editing based method for mitigating gender bias in occupational pronouns.
arXiv Detail & Related papers (2024-03-21T13:57:43Z)
Disclosure and Mitigation of Gender Bias in LLMs [64.79319733514266]
Large Language Models (LLMs) can generate biased responses. We propose an indirect probing framework based on conditional generation. We explore three distinct strategies to disclose explicit and implicit gender bias in LLMs.
arXiv Detail & Related papers (2024-02-17T04:48:55Z)
Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses. We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z)
DiFair: A Benchmark for Disentangled Assessment of Gender Knowledge and Bias [13.928591341824248]
Debiasing techniques have been proposed to mitigate the gender bias that is prevalent in pretrained language models. These are often evaluated on datasets that check the extent to which the model is gender-neutral in its predictions. This evaluation protocol overlooks the possible adverse impact of bias mitigation on useful gender knowledge.
arXiv Detail & Related papers (2023-10-22T15:27:16Z)
"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters [97.11173801187816]
Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content. This paper critically examines gender biases in LLM-generated reference letters.
arXiv Detail & Related papers (2023-10-13T16:12:57Z)
A Survey on Gender Bias in Natural Language Processing [22.91475787277623]
We present a survey of 304 papers on gender bias in natural language processing. We compare and contrast approaches to detecting and mitigating gender bias. We find that research on gender bias suffers from four core limitations.
arXiv Detail & Related papers (2021-12-28T14:54:18Z)
Collecting a Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation [10.542861450223128]
We find grammatical patterns indicating stereotypical and non-stereotypical gender-role assignments in corpora from three domains. We manually verify the quality of our corpus and use it to evaluate gender bias in various coreference resolution and machine translation models.
arXiv Detail & Related papers (2021-09-08T18:14:11Z)
Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text. We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions. Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.