Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and
their Social Biases in Downstream Tasks
- URL: http://arxiv.org/abs/2210.02938v1
- Date: Thu, 6 Oct 2022 14:08:57 GMT
- Title: Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and
their Social Biases in Downstream Tasks
- Authors: Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki
- Abstract summary: We study the relationship between task-agnostic intrinsic and task-specific extrinsic social bias evaluation measures for Masked Language Models (MLMs).
We find that there exists only a weak correlation between these two types of evaluation measures.
- Score: 33.044775876807826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the relationship between task-agnostic intrinsic and task-specific
extrinsic social bias evaluation measures for Masked Language Models (MLMs),
and find that there exists only a weak correlation between these two types of
evaluation measures. Moreover, we find that MLMs debiased using different
methods still re-learn social biases during fine-tuning on downstream tasks. We
identify the social biases in both training instances as well as their assigned
labels as reasons for the discrepancy between intrinsic and extrinsic bias
evaluation measurements. Overall, our findings highlight the limitations of
existing MLM bias evaluation measures and raise concerns on the deployment of
MLMs in downstream applications using those measures.
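The weak correlation reported in the abstract can be quantified with a standard correlation coefficient over paired bias scores. A minimal sketch, assuming hypothetical per-model intrinsic and extrinsic scores (the measure names and values are illustrative, not taken from the paper):

```python
import math

def pearson_r(xs, ys):
    # Pearson correlation between paired intrinsic/extrinsic bias scores
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical scores for five debiased MLMs (illustrative values only)
intrinsic = [0.62, 0.55, 0.58, 0.60, 0.57]  # e.g. a CrowS-Pairs-style score
extrinsic = [0.12, 0.30, 0.08, 0.25, 0.21]  # e.g. a downstream fairness gap
print(round(pearson_r(intrinsic, extrinsic), 3))
```

A value near zero for |r| would indicate that an MLM's intrinsic bias score says little about its behavior after fine-tuning, which is the discrepancy the paper investigates.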
Related papers
- CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models [58.57987316300529]
Large Language Models (LLMs) are increasingly deployed to handle various natural language processing (NLP) tasks.
To evaluate the biases exhibited by LLMs, researchers have recently proposed a variety of datasets.
We propose CEB, a Compositional Evaluation Benchmark that covers different types of bias across different social groups and tasks.
arXiv Detail & Related papers (2024-07-02T16:31:37Z)
- The African Woman is Rhythmic and Soulful: Evaluation of Open-ended Generation for Implicit Biases [0.0]
This study investigates the subtle and often concealed biases present in Large Language Models (LLMs).
The challenge of measuring such biases is exacerbated as LLMs become increasingly proprietary.
This study introduces innovative measures of bias inspired by psychological methodologies.
arXiv Detail & Related papers (2024-07-01T13:21:33Z) - Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators [48.54465599914978]
Large Language Models (LLMs) have demonstrated promising capabilities in assessing the quality of generated natural language.
LLMs still exhibit biases in evaluation and often struggle to generate coherent evaluations that align with human assessments.
We introduce Pairwise-preference Search (PairS), an uncertainty-guided search method that employs LLMs to conduct pairwise comparisons and efficiently ranks candidate texts.
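PairS itself uses uncertainty-guided search over LLM comparisons; as a rough structural sketch only, ranking candidates purely from pairwise preferences (with the LLM judge replaced by a trivial stand-in comparator, an assumption for illustration) could look like:

```python
from functools import cmp_to_key

def preference(a: str, b: str) -> int:
    # Stand-in for an LLM pairwise-preference call; in PairS this would
    # query the model (assumption). Here we simply prefer shorter text.
    return (len(a) > len(b)) - (len(a) < len(b))

def rank_candidates(texts):
    # Rank candidate texts using only pairwise comparisons,
    # never an absolute quality score.
    return sorted(texts, key=cmp_to_key(preference))

print(rank_candidates(["a longer candidate", "short", "mid length"]))
```

The point of the pairwise framing is that comparison-based rankings tend to align better with human judgement than asking a model for absolute scores.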
arXiv Detail & Related papers (2024-03-25T17:11:28Z) - Measuring Social Biases in Masked Language Models by Proxy of Prediction
Quality [0.0]
Social and political scientists often aim to discover and measure distinct biases from text data representations (embeddings).
In this paper, we evaluate the social biases encoded by transformers trained with a masked language modeling objective.
We find that the proposed measures produce more accurate estimates of transformers' relative preference for biased sentences than other existing measures.
arXiv Detail & Related papers (2024-02-21T17:33:13Z) - Measuring Implicit Bias in Explicitly Unbiased Large Language Models [14.279977138893846]
Large language models (LLMs) can pass explicit social bias tests but still harbor implicit biases.
We introduce two new measures of bias: LLM Implicit Bias, a prompt-based method for revealing implicit bias; and LLM Decision Bias, a strategy to detect subtle discrimination in decision-making tasks.
Using these measures, we found pervasive stereotype biases mirroring those in society in 8 value-aligned models across 4 social categories.
arXiv Detail & Related papers (2024-02-06T15:59:23Z) - Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- Constructing Holistic Measures for Social Biases in Masked Language Models [17.45153670825904]
Masked Language Models (MLMs) have been successful in many natural language processing tasks.
Real-world stereotype biases are likely to be reflected in MLMs due to their learning from large text corpora.
Two evaluation metrics, the Kullback-Leibler Divergence Score (KLDivS) and the Jensen-Shannon Divergence Score (JSDivS), are proposed to evaluate social biases in MLMs.
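The exact KLDivS and JSDivS formulations are defined in the cited paper; as a generic sketch, KL and Jensen-Shannon divergence between two discrete distributions (e.g. a model's probabilities over a stereotypical/anti-stereotypical sentence pair; the values below are illustrative) can be computed as:

```python
import math

def kl_divergence(p, q):
    # KL(P || Q) = sum_i p_i * log(p_i / q_i); assumes q_i > 0 wherever p_i > 0
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    # JS(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M the mixture
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Illustrative distributions over a stereotypical/anti-stereotypical pair
p = [0.7, 0.3]  # hypothetical model preference
q = [0.5, 0.5]  # unbiased reference
print(round(js_divergence(p, q), 4))
```

Unlike KL, JS divergence is symmetric and bounded, which makes it convenient for comparing bias across models.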
arXiv Detail & Related papers (2023-05-12T23:09:06Z)
- BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation [89.41378346080603]
This work presents the first systematic study on the social bias in PLM-based metrics.
We demonstrate that popular PLM-based metrics exhibit significantly higher social bias than traditional metrics on 6 sensitive attributes.
In addition, we develop debiasing adapters that are injected into PLM layers, mitigating bias in PLM-based metrics while retaining high performance for evaluating text generation.
arXiv Detail & Related papers (2022-10-14T08:24:11Z)
- What do Bias Measures Measure? [41.36968251743058]
Natural Language Processing models propagate social biases about protected attributes such as gender, race, and nationality.
To create interventions and mitigate these biases and associated harms, it is vital to be able to detect and measure such biases.
This work presents a comprehensive survey of existing bias measures in NLP as a function of the associated NLP tasks, metrics, datasets, and social biases and corresponding harms.
arXiv Detail & Related papers (2021-08-07T04:08:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.