Trustworthy Social Bias Measurement
- URL: http://arxiv.org/abs/2212.11672v1
- Date: Tue, 20 Dec 2022 18:45:12 GMT
- Title: Trustworthy Social Bias Measurement
- Authors: Rishi Bommasani, Percy Liang
- Abstract summary: In this work, we design bias measures that warrant trust based on the cross-disciplinary theory of measurement modeling.
We operationalize our definition by proposing a general bias measurement framework DivDist, which we use to instantiate 5 concrete bias measures.
We demonstrate considerable evidence to trust our measures, showing they overcome conceptual, technical, and empirical deficiencies present in prior measures.
- Score: 92.87080873893618
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How do we design measures of social bias that we trust? While prior work has
introduced several measures, no measure has gained widespread trust: instead,
mounting evidence argues we should distrust these measures. In this work, we
design bias measures that warrant trust based on the cross-disciplinary theory
of measurement modeling. To combat the frequently fuzzy treatment of social
bias in NLP, we explicitly define social bias, grounded in principles drawn
from social science research. We operationalize our definition by proposing a
general bias measurement framework DivDist, which we use to instantiate 5
concrete bias measures. To validate our measures, we propose a rigorous testing
protocol with 8 testing criteria (e.g. predictive validity: do measures predict
biases in US employment?). Through our testing, we demonstrate considerable
evidence to trust our measures, showing they overcome conceptual, technical,
and empirical deficiencies present in prior measures.
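The predictive-validity criterion mentioned above (do measured biases predict biases in US employment?) can be sketched as a simple correlation check. Everything below is illustrative: the bias scores, employment shares, and the Pearson-correlation approach are assumptions for demonstration, not the paper's actual DivDist measures or data.

```python
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient, no external dependencies."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-occupation gender-bias scores from a language model
# (positive = skewed female, negative = skewed male).
bias_scores = {"nurse": 0.81, "engineer": -0.62, "teacher": 0.44, "pilot": -0.55}
# Hypothetical share of women in each occupation (stand-in for real
# labor statistics; invented numbers).
employment_share = {"nurse": 0.88, "engineer": 0.16, "teacher": 0.74, "pilot": 0.09}

occupations = sorted(bias_scores)
r = pearson([bias_scores[o] for o in occupations],
            [employment_share[o] for o in occupations])
print(f"predictive-validity correlation: r = {r:.2f}")
```

A high correlation under this kind of test would be one piece of evidence that a measure tracks a real-world quantity rather than an artifact of the model or dataset.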
Related papers
- Reconciling Predictive and Statistical Parity: A Causal Approach [68.59381759875734]
We propose a new causal decomposition formula for the fairness measures associated with predictive parity.
We show that the notions of statistical and predictive parity are not really mutually exclusive, but complementary and spanning a spectrum of fairness notions.
arXiv Detail & Related papers (2023-06-08T09:23:22Z)
- Testing Occupational Gender Bias in Language Models: Towards Robust Measurement and Zero-Shot Debiasing [98.07536837448293]
Large language models (LLMs) have been shown to exhibit a variety of harmful, human-like biases against various demographics.
We introduce a list of desiderata for robustly measuring biases in generative language models.
We then use this benchmark to test several state-of-the-art open-source LLMs, including Llama, Mistral, and their instruction-tuned versions.
arXiv Detail & Related papers (2022-12-20T22:41:24Z)
- Undesirable Biases in NLP: Addressing Challenges of Measurement [1.7126708168238125]
We provide an interdisciplinary approach to discussing the issue of NLP model bias by adopting the lens of psychometrics.
We explore two central notions from psychometrics: the construct validity and the reliability of measurement tools.
Our goal is to provide NLP practitioners with methodological tools for designing better bias measures.
arXiv Detail & Related papers (2022-11-24T16:53:18Z)
- The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks [75.58692290694452]
We compare social biases with non-social biases stemming from choices made during dataset construction that might not even be discernible to the human eye.
We observe that these shallow modifications have a surprising effect on the resulting degree of bias across various models.
arXiv Detail & Related papers (2022-10-18T17:58:39Z)
- Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks [33.044775876807826]
We study the relationship between task-agnostic intrinsic and task-specific extrinsic social bias evaluation measures for Masked Language Models (MLMs).
We find that there exists only a weak correlation between these two types of evaluation measures.
arXiv Detail & Related papers (2022-10-06T14:08:57Z)
- Information-Theoretic Bias Reduction via Causal View of Spurious Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation.
We present a novel debiasing framework against the algorithmic bias, which incorporates a bias regularization loss.
The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
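One standard information-theoretic way to quantify this kind of dependence is the mutual information between a model's predictions and a protected attribute. The sketch below is a generic illustration under that assumption, not the paper's specific causal technique, and the sample data are invented.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Estimate I(Y; A) in bits from (prediction, attribute) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    p_y = Counter(y for y, _ in pairs)
    p_a = Counter(a for _, a in pairs)
    mi = 0.0
    for (y, a), c in joint.items():
        p_ya = c / n
        mi += p_ya * math.log2(p_ya / ((p_y[y] / n) * (p_a[a] / n)))
    return mi

# Hypothetical (model prediction, protected attribute) samples;
# a nonzero value indicates predictions carry information about the attribute.
samples = [("hire", "m")] * 40 + [("reject", "m")] * 10 + \
          [("hire", "f")] * 20 + [("reject", "f")] * 30
print(f"I(prediction; attribute) = {mutual_information(samples):.3f} bits")
```

Under this view, a bias regularization loss can be read as pushing this quantity toward zero during training.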
arXiv Detail & Related papers (2022-01-10T01:19:31Z)
- Evaluating Metrics for Bias in Word Embeddings [64.55554083622258]
We formalize a bias definition based on the ideas from previous works and derive conditions for bias metrics.
We propose a new metric, SAME, to address the shortcomings of existing metrics and mathematically prove that SAME behaves appropriately.
arXiv Detail & Related papers (2021-11-15T16:07:15Z)
- Assessing the Reliability of Word Embedding Gender Bias Measures [4.258396452892244]
We assess three types of reliability of word embedding gender bias measures, namely test-retest reliability, inter-rater consistency and internal consistency.
Our findings inform better design of word embedding gender bias measures.
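Internal consistency, one of the three reliability types listed above, is commonly estimated with Cronbach's alpha. The following is a minimal stand-alone sketch with invented scores, not the paper's exact procedure; "items" here play the role of alternative bias-measurement prompts scored over the same word pairs.

```python
def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.
    `items` is a list of per-item score lists, all over the same units
    (here: hypothetical word pairs)."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(item[i] for item in items) for i in range(n)]
    item_var_sum = sum(var(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / var(totals))

# Hypothetical bias scores: 3 measurement items over 5 word pairs.
items = [
    [0.2, 0.5, 0.1, 0.8, 0.4],
    [0.3, 0.6, 0.2, 0.7, 0.5],
    [0.1, 0.5, 0.2, 0.9, 0.3],
]
print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")
```

Values near 1 indicate the items measure the same underlying construct; low values suggest the bias measure's components disagree with each other.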
arXiv Detail & Related papers (2021-09-10T08:23:50Z)
- What do Bias Measures Measure? [41.36968251743058]
Natural Language Processing models propagate social biases about protected attributes such as gender, race, and nationality.
To create interventions and mitigate these biases and associated harms, it is vital to be able to detect and measure such biases.
This work presents a comprehensive survey of existing bias measures in NLP as a function of the associated NLP tasks, metrics, datasets, and social biases and corresponding harms.
arXiv Detail & Related papers (2021-08-07T04:08:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.