Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
- URL: http://arxiv.org/abs/2402.13954v1
- Date: Wed, 21 Feb 2024 17:33:13 GMT
- Title: Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
- Authors: Rahul Zalkikar, Kanchan Chandra
- Abstract summary: Social and political scientists often aim to discover and measure distinct biases from text data representations (embeddings).
In this paper, we evaluate the social biases encoded by transformers trained with a masked language modeling objective.
We find that, based on our methods, the proposed measures produce more accurate estimations of the relative preference for biased sentences between transformers than other measures.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Social and political scientists often aim to discover and measure distinct
biases from text data representations (embeddings). Innovative
transformer-based language models produce contextually-aware token embeddings
and have achieved state-of-the-art performance for a variety of natural
language tasks, but have been shown to encode unwanted biases for downstream
applications. In this paper, we evaluate the social biases encoded by
transformers trained with the masked language modeling objective using proposed
proxy functions within an iterative masking experiment to measure the quality
of transformer models' predictions, and assess the preference of MLMs towards
disadvantaged and advantaged groups. We compare bias estimations with those
produced by other evaluation methods using two benchmark datasets, finding
relatively high religious and disability biases across considered MLMs and low
gender bias in one dataset relative to the other. Our measures outperform
others in their agreement with human annotators. We extend previous work by
evaluating social biases introduced after re-training an MLM under the masked
language modeling objective (w.r.t. the model's pre-trained base), and find
that, based on our methods, the proposed measures produce more accurate
estimations of the relative preference for biased sentences between
transformers than other measures.
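As a rough illustration of the kind of iterative masking experiment described above, the sketch below scores a sentence by masking each token in turn and accumulating the MLM's log-probability of the original token (a pseudo-log-likelihood-style score), then compares the scores of a hypothetical benchmark sentence pair. The model checkpoint, example sentences, and scoring function are illustrative assumptions and do not reproduce the paper's proposed proxy functions.

```python
# Minimal sketch (assumption): score a sentence by masking each token in turn
# and summing the MLM's log-probability of the original token. This follows
# the common pseudo-log-likelihood formulation and is shown only as an
# illustration of iterative masking, not the paper's exact proxy functions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL = "bert-base-uncased"  # any MLM checkpoint
tok = AutoTokenizer.from_pretrained(MODEL)
mlm = AutoModelForMaskedLM.from_pretrained(MODEL).eval()

@torch.no_grad()
def iterative_masking_score(sentence: str) -> float:
    """Sum of log P(original token | rest of sentence) over all tokens."""
    ids = tok(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, ids.size(0) - 1):      # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tok.mask_token_id        # mask one position at a time
        logits = mlm(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

# Hypothetical CrowS-Pairs-style sentence pair (illustrative only).
s_disadvantaged = "Women are bad at math."
s_advantaged = "Men are bad at math."
# A higher score for the disadvantaged-group sentence is read as a
# preference for the stereotyping sentence.
prefers_stereotype = (
    iterative_masking_score(s_disadvantaged) > iterative_masking_score(s_advantaged)
)
print(prefers_stereotype)
```

Comparing such scores between a pre-trained base model and a re-trained variant yields a relative-preference estimate in the spirit of the experiments described in the abstract.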
Related papers
- Social Debiasing for Fair Multi-modal LLMs [55.8071045346024]
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities.
However, these models often inherit severe social biases from their training datasets, leading to unfair predictions based on attributes like race and gender.
This paper addresses the issue of social biases in MLLMs by i) introducing a comprehensive Counterfactual dataset with Multiple Social Concepts (CMSC) and ii) proposing an Anti-Stereotype Debiasing strategy (ASD).
arXiv Detail & Related papers (2024-08-13T02:08:32Z) - Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models [47.545382591646565]
Large Language Models (LLMs) have excelled at language understanding and generating human-level text.
LLMs are susceptible to adversarial attacks where malicious users prompt the model to generate undesirable text.
In this work, we train models to automatically create adversarial prompts to elicit biased responses from target LLMs.
arXiv Detail & Related papers (2024-08-07T17:11:34Z) - CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models [58.57987316300529]
Large Language Models (LLMs) are increasingly deployed to handle various natural language processing (NLP) tasks.
To evaluate the biases exhibited by LLMs, researchers have recently proposed a variety of datasets.
We propose CEB, a Compositional Evaluation Benchmark that covers different types of bias across different social groups and tasks.
arXiv Detail & Related papers (2024-07-02T16:31:37Z) - Taxonomy-based CheckList for Large Language Model Evaluation [0.0]
We introduce human knowledge into natural language interventions and study pre-trained language models' (LMs) behaviors.
Inspired by CheckList behavioral testing, we present a checklist-style task that aims to probe and quantify LMs' unethical behaviors.
arXiv Detail & Related papers (2023-12-15T12:58:07Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Constructing Holistic Measures for Social Biases in Masked Language
Models [17.45153670825904]
Masked Language Models (MLMs) have been successful in many natural language processing tasks.
Real-world stereotype biases are likely to be reflected in MLMs due to their learning from large text corpora.
Two evaluation metrics, the Kullback-Leibler Divergence Score (KLDivS) and the Jensen-Shannon Divergence Score (JSDivS), are proposed to evaluate social biases in MLMs.
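For reference, the divergence quantities underlying these scores can be sketched as follows; the toy distributions stand in for an MLM's softmax outputs at a masked position, and the way the cited paper aggregates per-position divergences into KLDivS and JSDivS is not reproduced here.

```python
# Minimal sketch (assumption): KL and JS divergence between two probability
# distributions, e.g. an MLM's output distributions at a masked position in a
# stereotypical vs. anti-stereotypical sentence. Aggregation into the cited
# paper's KLDivS/JSDivS scores is not shown.
import numpy as np

def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    p, q = p + eps, q + eps                  # avoid log(0) / division by zero
    return float(np.sum(p * np.log(p / q)))

def js_divergence(p: np.ndarray, q: np.ndarray) -> float:
    m = 0.5 * (p + q)                        # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Toy vocabulary distributions (placeholders for MLM softmax outputs).
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])
print(kl_divergence(p, q), js_divergence(p, q))
```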
arXiv Detail & Related papers (2023-05-12T23:09:06Z) - Social Biases in Automatic Evaluation Metrics for NLG [53.76118154594404]
We propose an evaluation method based on Word Embeddings Association Test (WEAT) and Sentence Embeddings Association Test (SEAT) to quantify social biases in evaluation metrics.
We construct gender-swapped meta-evaluation datasets to explore the potential impact of gender bias in image caption and text summarization tasks.
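For context, the standard WEAT effect size that such tests build on can be sketched as below; the embedding matrices are random placeholders, and the adaptation to evaluation metrics and gender-swapped datasets described in the cited paper is not shown.

```python
# Minimal sketch (assumption): WEAT effect size over word embeddings, in the
# standard formulation (Caliskan et al., 2017), with made-up vectors used only
# to illustrate the test statistic.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def weat_effect_size(X, Y, A, B) -> float:
    """Effect size d: association of target sets X, Y with attribute sets A, B."""
    def s(w):
        # Difference of mean cosine similarity to each attribute set.
        return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])
    s_X = np.array([s(x) for x in X])
    s_Y = np.array([s(y) for y in Y])
    pooled = np.concatenate([s_X, s_Y])
    return float((s_X.mean() - s_Y.mean()) / pooled.std(ddof=1))

# Toy 4-dimensional embeddings (random placeholders, not real word vectors).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))  # e.g., career-related terms
Y = rng.normal(size=(8, 4))  # e.g., family-related terms
A = rng.normal(size=(8, 4))  # e.g., one set of attribute words
B = rng.normal(size=(8, 4))  # e.g., the contrasting attribute words
print(weat_effect_size(X, Y, A, B))
```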
arXiv Detail & Related papers (2022-10-17T08:55:26Z) - BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for
Text Generation [89.41378346080603]
This work presents the first systematic study on the social bias in PLM-based metrics.
We demonstrate that popular PLM-based metrics exhibit significantly higher social bias than traditional metrics on 6 sensitive attributes.
In addition, we develop debiasing adapters that are injected into PLM layers, mitigating bias in PLM-based metrics while retaining high performance for evaluating text generation.
arXiv Detail & Related papers (2022-10-14T08:24:11Z) - Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)