In-Contextual Gender Bias Suppression for Large Language Models
- URL: http://arxiv.org/abs/2309.07251v2
- Date: Tue, 20 Feb 2024 15:11:17 GMT
- Title: In-Contextual Gender Bias Suppression for Large Language Models
- Authors: Daisuke Oba, Masahiro Kaneko, Danushka Bollegala
- Abstract summary: Large Language Models (LLMs) have been reported to encode worrying levels of gender bias.
We propose bias suppression, which prevents biased generations from LLMs by providing preambles constructed from manually designed templates.
We find that bias suppression has an acceptable adverse effect on downstream task performance on HellaSwag and COPA.
- Score: 47.246504807946884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite their impressive performance in a wide range of NLP tasks, Large
Language Models (LLMs) have been reported to encode worrying levels of gender
bias. Prior work has proposed debiasing methods that require human-labelled
examples, data augmentation, and fine-tuning of LLMs, all of which are
computationally costly. Moreover, one might not even have access to the model
parameters needed for debiasing, as in the case of closed LLMs such as GPT-4.
To address this challenge, we propose bias suppression, which prevents biased
generations from LLMs simply by providing textual preambles constructed from
manually designed templates and real-world statistics, without accessing model
parameters. Using the CrowS-Pairs dataset, we show that textual preambles
containing counterfactual statements can suppress gender biases in English LLMs
such as LLaMA2. We also find that gender-neutral descriptions of gender-biased
objects can suppress their gender biases. Finally, we show that bias suppression
has an acceptable adverse effect on downstream task performance on HellaSwag
and COPA.
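As a rough illustration of the bias-suppression idea described in the abstract, the sketch below prepends a counterfactual preamble to a CrowS-Pairs-style sentence pair and compares the log-probability gap between the stereotypical and anti-stereotypical sentences with and without the preamble. This is a minimal sketch, not the authors' setup: the preamble wording, the sentence pair, and the use of gpt2 (rather than LLaMA2) are illustrative assumptions.

# Minimal sketch of preamble-based bias suppression (illustrative only; the
# preamble, sentence pair, and model are assumptions, not the paper's templates
# or evaluation setup).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # small placeholder; the paper studies larger LLMs such as LLaMA2
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

# Hypothetical counterfactual preamble (the paper also builds preambles from
# real-world statistics, omitted here).
PREAMBLE = ("Men and women are equally likely to work as nurses, engineers, "
            "doctors, and homemakers.")

# CrowS-Pairs-style minimal pair (made up for illustration).
STEREO = "The nurse said that she would be back soon."
ANTI_STEREO = "The nurse said that he would be back soon."

def sentence_logprob(preamble: str, sentence: str) -> float:
    # Sum of token log-probabilities of `sentence`, optionally conditioned on
    # a preamble prepended to the input.
    text = f"{preamble} {sentence}" if preamble else sentence
    ctx_len = tokenizer(preamble, return_tensors="pt").input_ids.shape[1] if preamble else 0
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)  # position t predicts token t+1
    token_lp = log_probs.gather(2, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    start = max(ctx_len - 1, 0)  # score only the sentence tokens, not the preamble
    return token_lp[:, start:].sum().item()

for label, pre in [("no preamble", ""), ("with preamble", PREAMBLE)]:
    gap = sentence_logprob(pre, STEREO) - sentence_logprob(pre, ANTI_STEREO)
    print(f"{label}: stereo-vs-anti-stereo log-prob gap = {gap:.3f}")

A gap closer to zero under the preamble condition would indicate suppressed gender bias for this pair; the paper aggregates such comparisons over the full dataset.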
Related papers
- GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases.
GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z)
- BiasDPO: Mitigating Bias in Language Models through Direct Preference Optimization [0.0]
Large Language Models (LLMs) have become pivotal in advancing natural language processing, yet their potential to perpetuate biases poses significant concerns.
This paper introduces a new framework employing Direct Preference Optimization (DPO) to mitigate gender, racial, and religious biases in English text.
By developing a loss function that favors less biased over biased completions, our approach cultivates a preference for respectful and non-discriminatory language (a generic sketch of such a preference loss is given after this list).
arXiv Detail & Related papers (2024-07-18T22:32:20Z)
- Disclosure and Mitigation of Gender Bias in LLMs [64.79319733514266]
Large Language Models (LLMs) can generate biased responses.
We propose an indirect probing framework based on conditional generation.
We explore three distinct strategies to disclose explicit and implicit gender bias in LLMs.
arXiv Detail & Related papers (2024-02-17T04:48:55Z)
- Self-Debiasing Large Language Models: Zero-Shot Recognition and Reduction of Stereotypes [73.12947922129261]
We leverage the zero-shot capabilities of large language models to reduce stereotyping.
We show that self-debiasing can significantly reduce the degree of stereotyping across nine different social groups.
We hope this work opens inquiry into other zero-shot techniques for bias mitigation.
arXiv Detail & Related papers (2024-02-03T01:40:11Z)
- Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses.
We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z)
- "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters [97.11173801187816]
Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content.
This paper critically examines gender biases in LLM-generated reference letters.
arXiv Detail & Related papers (2023-10-13T16:12:57Z)
- Gender-tuning: Empowering Fine-tuning for Debiasing Pre-trained Language Models [9.534831387705312]
Existing debiasing solutions require additional training processes and dedicated datasets.
Gender-tuning integrates the Masked Language Modeling (MLM) training objective into the fine-tuning process.
Comprehensive experiments show that Gender-tuning outperforms the state-of-the-art baselines in terms of average gender bias scores in PLMs.
arXiv Detail & Related papers (2023-07-20T01:48:51Z)
- Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias [33.99768156365231]
We introduce a causal formulation for bias measurement in generative language models.
We propose a benchmark called OccuGender, with a bias-measuring procedure to investigate occupational gender bias.
The results show that these models exhibit substantial occupational gender bias.
arXiv Detail & Related papers (2022-12-20T22:41:24Z)
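The BiasDPO entry above describes a loss that prefers less biased over biased completions. The sketch below shows a generic Direct Preference Optimization objective applied to such pairs; it is not the BiasDPO implementation, and the beta value and toy inputs are assumptions.

# Generic DPO-style preference loss for bias mitigation (a standard DPO
# objective sketch, not BiasDPO's actual code; beta and inputs are illustrative).
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_preferred: torch.Tensor,
             policy_logp_rejected: torch.Tensor,
             ref_logp_preferred: torch.Tensor,
             ref_logp_rejected: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Push the policy to prefer the less biased completion (preferred) over
    # the biased one (rejected), relative to a frozen reference model.
    # Inputs are summed log-probabilities of each completion given the prompt,
    # shape (batch,).
    preferred_margin = policy_logp_preferred - ref_logp_preferred
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    return -F.logsigmoid(beta * (preferred_margin - rejected_margin)).mean()

# Toy usage with made-up log-probabilities for a batch of two preference pairs.
loss = dpo_loss(torch.tensor([-12.3, -9.8]), torch.tensor([-10.1, -9.5]),
                torch.tensor([-11.9, -10.0]), torch.tensor([-10.0, -9.4]))
print(float(loss))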