Measuring Fairness with Biased Rulers: A Survey on Quantifying Biases in Pretrained Language Models
- URL: http://arxiv.org/abs/2112.07447v1
- Date: Tue, 14 Dec 2021 15:04:56 GMT
- Title: Measuring Fairness with Biased Rulers: A Survey on Quantifying Biases in Pretrained Language Models
- Authors: Pieter Delobelle, Ewoenam Kwaku Tokpo, Toon Calders, Bettina Berendt
- Abstract summary: An increasing awareness of biased patterns in natural language processing resources has motivated many metrics to quantify `bias' and `fairness'.
We survey the existing literature on fairness metrics for pretrained language models and experimentally evaluate their compatibility.
We find that many metrics are not compatible and highly depend on (i) templates, (ii) attribute and target seeds and (iii) the choice of embeddings.
- Score: 2.567384209291337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An increasing awareness of biased patterns in natural language processing
resources, like BERT, has motivated many metrics to quantify `bias' and
`fairness'. But comparing the results of different metrics and the works that
evaluate with such metrics remains difficult, if not outright impossible. We
survey the existing literature on fairness metrics for pretrained language
models and experimentally evaluate their compatibility, covering biases both in
language models and in their downstream tasks. We do this through a combination
of a traditional literature survey, correlation analysis, and empirical
evaluations. We find that many metrics are not compatible and highly
depend on (i) templates, (ii) attribute and target seeds and (iii) the choice
of embeddings. These results indicate that fairness or bias evaluation for
contextualized language models remains challenging, if not highly subjective.
To improve future comparisons and fairness evaluations, we
recommend avoiding embedding-based metrics and focusing on fairness evaluations
in downstream tasks.
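To make the reported sensitivity concrete, the sketch below computes a simple SEAT/WEAT-style association score over a handful of templates and attribute/target seed words. The encoder, templates, and word lists are illustrative placeholders rather than the survey's actual setup; with a real pretrained encoder, changing any of them can shift the resulting scores.

```python
# Minimal sketch of a SEAT/WEAT-style association score, illustrating how the
# result depends on (i) the templates, (ii) the attribute/target seed words and
# (iii) the embeddings. The encoder below is a deterministic stand-in; swap in
# sentence embeddings from a real pretrained language model for actual use.
import hashlib
import numpy as np

def embed(sentence: str) -> np.ndarray:
    """Placeholder encoder: deterministic pseudo-random vector per sentence."""
    seed = int(hashlib.md5(sentence.encode()).hexdigest(), 16) % (2**32)
    return np.random.default_rng(seed).normal(size=64)

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(word, attr_a, attr_b, templates):
    """Mean cosine similarity of `word` to attribute set A minus attribute set B,
    averaged over the chosen templates."""
    scores = []
    for t in templates:
        w = embed(t.format(word))
        sim_a = np.mean([cos(w, embed(t.format(a))) for a in attr_a])
        sim_b = np.mean([cos(w, embed(t.format(b))) for b in attr_b])
        scores.append(sim_a - sim_b)
    return float(np.mean(scores))

# Illustrative seeds and templates (assumptions, not the survey's exact lists).
templates = ["This is {}.", "{} is here.", "The person is {}."]
attr_a, attr_b = ["he", "man", "male"], ["she", "woman", "female"]
targets = ["doctor", "nurse", "engineer"]

for target in targets:
    print(target, round(association(target, attr_a, attr_b, templates), 3))
# Re-running with different templates or seed words can change the scores
# (and even their sign), which is the instability the survey reports.
```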
Related papers
- COBIAS: Contextual Reliability in Bias Assessment [14.594920595573038]
Large Language Models (LLMs) are trained on extensive web corpora, which enable them to understand and generate human-like text.
However, these corpora are diverse and often uncurated, and the stereotypes and prejudices they contain introduce biases into the models.
We propose understanding the context of inputs by considering the diverse situations in which they may arise.
arXiv Detail & Related papers (2024-02-22T10:46:11Z)
- Semantic Properties of cosine based bias scores for word embeddings [52.13994416317707]
We propose requirements for bias scores to be considered meaningful for quantifying biases.
We analyze cosine based scores from the literature with regard to these requirements.
We underline these findings with experiments to show that the bias scores' limitations have an impact in the application case.
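As a deliberately toy illustration of the kind of cosine-based score such requirements apply to, the sketch below computes a direct-bias-style score along a seed-defined gender direction. The embedding table, seed pair, and word lists are placeholders, not the paper's setup.

```python
# Minimal sketch of one cosine-based bias score for static word embeddings:
# the mean |cos| between supposedly neutral words and a "bias direction" built
# from a definitional seed pair (in the spirit of Bolukbasi et al.'s direct bias).
# The vectors below are toy stand-ins for real embeddings such as GloVe.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["he", "she", "doctor", "nurse", "engineer"]
emb = {w: rng.normal(size=50) for w in vocab}   # toy embedding table

def unit(v):
    return v / np.linalg.norm(v)

# Bias direction from one seed pair (in practice several pairs are averaged).
direction = unit(emb["he"] - emb["she"])

def direct_bias(neutral_words, c=1.0):
    """Average |cos(word, bias direction)|^c over supposedly neutral words."""
    return float(np.mean([abs(unit(emb[w]) @ direction) ** c for w in neutral_words]))

print(direct_bias(["doctor", "nurse", "engineer"]))
# Whether this number is meaningful depends on the choice of seed pairs and on
# the score's behaviour under simple transformations of the embedding space;
# these are exactly the kinds of requirements the entry above analyses.
```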
arXiv Detail & Related papers (2024-01-27T20:31:10Z)
- Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z)
- This Prompt is Measuring <MASK>: Evaluating Bias Evaluation in Language Models [12.214260053244871]
We analyse the body of work that uses prompts and templates to assess bias in language models.
We draw on a measurement modelling framework to create a taxonomy of attributes that capture what a bias test aims to measure.
Our analysis illuminates the scope of possible bias types the field is able to measure, and reveals types that are as yet under-researched.
arXiv Detail & Related papers (2023-05-22T06:28:48Z)
- Testing Occupational Gender Bias in Language Models: Towards Robust Measurement and Zero-Shot Debiasing [98.07536837448293]
Large language models (LLMs) have been shown to exhibit a variety of harmful, human-like biases against various demographics.
We introduce a list of desiderata for robustly measuring biases in generative language models.
We then use this benchmark to test several state-of-the-art open-source LLMs, including Llama, Mistral, and their instruction-tuned versions.
arXiv Detail & Related papers (2022-12-20T22:41:24Z)
- Mind Your Bias: A Critical Review of Bias Detection Methods for Contextual Language Models [2.170169149901781]
We conduct a rigorous analysis and comparison of bias detection methods for contextual language models.
Our results show that minor design and implementation decisions (or errors) have a substantial and often significant impact on the derived bias scores.
arXiv Detail & Related papers (2022-11-15T19:27:54Z)
- On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations [74.70957445600936]
Multiple metrics have been introduced to measure fairness in various natural language processing tasks.
These metrics can be roughly grouped into two categories: 1) extrinsic metrics for evaluating fairness in downstream applications and 2) intrinsic metrics for estimating fairness in upstream language representation models.
arXiv Detail & Related papers (2022-03-25T22:17:43Z)
- Evaluating Metrics for Bias in Word Embeddings [64.55554083622258]
We formalize a bias definition based on the ideas from previous works and derive conditions for bias metrics.
We propose a new metric, SAME, to address the shortcomings of existing metrics and mathematically prove that SAME behaves appropriately.
arXiv Detail & Related papers (2021-11-15T16:07:15Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
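A minimal sketch of what such instance reweighting can look like in practice is given below; the weighting scheme, the `reweight` helper, and the toy data are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of instance reweighting for bias mitigation: examples are
# weighted inversely to the frequency of their (protected attribute, label)
# combination, so under-represented combinations count more during training.
# Generic scheme in the spirit of the entry above, not its exact method.
from collections import Counter

def reweight(protected, labels):
    """Return one weight per example, inversely proportional to the observed
    frequency of its (protected attribute, label) pair."""
    n = len(labels)
    counts = Counter(zip(protected, labels))
    return [n / (len(counts) * counts[(p, y)]) for p, y in zip(protected, labels)]

# Toy data: author gender as the protected attribute, sentiment as the label.
protected = ["f", "f", "f", "m", "m", "f", "m", "m"]
labels    = [ 1,   1,   0,   0,   0,   0,   1,   1 ]
weights = reweight(protected, labels)
print([round(w, 2) for w in weights])
# Most classifiers accept these weights directly, e.g.
# sklearn's LogisticRegression().fit(X, labels, sample_weight=weights).
```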