Trustworthy Social Bias Measurement
- URL: http://arxiv.org/abs/2212.11672v1
- Date: Tue, 20 Dec 2022 18:45:12 GMT
- Title: Trustworthy Social Bias Measurement
- Authors: Rishi Bommasani, Percy Liang
- Abstract summary: In this work, we design bias measures that warrant trust based on the cross-disciplinary theory of measurement modeling.
We operationalize our definition by proposing a general bias measurement framework DivDist, which we use to instantiate 5 concrete bias measures.
We demonstrate considerable evidence to trust our measures, showing they overcome conceptual, technical, and empirical deficiencies present in prior measures.
- Score: 92.87080873893618
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How do we design measures of social bias that we trust? While prior work has
introduced several measures, no measure has gained widespread trust: instead,
mounting evidence argues we should distrust these measures. In this work, we
design bias measures that warrant trust based on the cross-disciplinary theory
of measurement modeling. To combat the frequently fuzzy treatment of social
bias in NLP, we explicitly define social bias, grounded in principles drawn
from social science research. We operationalize our definition by proposing a
general bias measurement framework DivDist, which we use to instantiate 5
concrete bias measures. To validate our measures, we propose a rigorous testing
protocol with 8 testing criteria (e.g. predictive validity: do measures predict
biases in US employment?). Through our testing, we demonstrate considerable
evidence to trust our measures, showing they overcome conceptual, technical,
and empirical deficiencies present in prior measures.
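A hedged illustration of the framework's core idea as described in the abstract: measure bias as the divergence between an observed distribution over social groups and a reference distribution. The co-occurrence windowing, the uniform reference, the choice of total variation distance, and all names below are assumptions for this sketch, not the paper's specification of DivDist.
```python
from collections import Counter

def group_distribution(tokens, target, group_terms, window=10):
    """Normalized co-occurrence distribution over social groups for a target
    concept (e.g., a profession), counted within a fixed token window."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        context = tokens[max(0, i - window): i + window + 1]
        for group, terms in group_terms.items():
            counts[group] += sum(1 for t in context if t in terms)
    total = sum(counts.values())
    if total == 0:  # target never co-occurs with any group term
        return {g: 1 / len(group_terms) for g in group_terms}
    return {g: counts[g] / total for g in group_terms}

def total_variation(p, q):
    """One possible divergence choice; 0 = matches the reference exactly."""
    return 0.5 * sum(abs(p[g] - q[g]) for g in p)

# Toy usage: bias of "nurse" over binary gender groups vs. a uniform reference.
corpus = "the nurse said she would help him before she left".split()
groups = {"female": {"she", "her", "woman"}, "male": {"he", "him", "man"}}
p = group_distribution(corpus, "nurse", groups)
uniform = {g: 1 / len(groups) for g in groups}
print(total_variation(p, uniform))  # ~0.17: mildly skewed toward "female"
```
Under this framing, a predictive-validity check like the one named in the abstract could correlate such scores with external statistics, e.g. gender ratios by occupation in US employment data.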
Related papers
- Performative Prediction on Games and Mechanism Design [69.7933059664256]
We study a collective risk dilemma where agents decide whether to trust predictions based on past accuracy.
As predictions shape collective outcomes, social welfare arises naturally as a metric of concern.
We show how to achieve better trade-offs and use them for mechanism design.
arXiv Detail & Related papers (2024-08-09T16:03:44Z)
- A Principled Approach for a New Bias Measure [7.352247786388098]
We propose Uniform Bias (UB), the first bias measure with a clear and simple interpretation over the full range of bias values.
Our results are validated experimentally on nine publicly available datasets and analyzed theoretically, yielding novel insights into the problem.
Based on our approach, we also design a bias mitigation model that may be useful to policymakers.
arXiv Detail & Related papers (2024-05-20T18:14:33Z)
- The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks [75.58692290694452]
We compare social biases with non-social biases stemming from choices made during dataset construction that might not even be discernible to the human eye.
We observe that these shallow modifications have a surprising effect on the resulting degree of bias across various models.
arXiv Detail & Related papers (2022-10-18T17:58:39Z)
- The SAME score: Improved cosine based bias score for word embeddings [49.75878234192369]
We introduce SAME, a novel bias score for semantic bias in embeddings.
We show that SAME is capable of measuring semantic bias and identify potential causes of social bias in downstream tasks; a toy sketch of the cosine-based score family appears below.
arXiv Detail & Related papers (2022-03-28T09:28:13Z)
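SAME's exact definition is given in the paper above; purely as a hedged sketch of the cosine-based score family it refines, the following computes a word's differential mean cosine similarity to two attribute sets (the toy vectors and all names are assumptions):
```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mean_association(word_vec, attr_vecs):
    """Mean cosine similarity of a word vector to a set of attribute vectors."""
    return float(np.mean([cosine(word_vec, a) for a in attr_vecs]))

def cosine_bias(word_vec, attrs_a, attrs_b):
    """Differential association: positive leans toward A, negative toward B,
    zero is neutral. A simplified stand-in, not the SAME formula."""
    return mean_association(word_vec, attrs_a) - mean_association(word_vec, attrs_b)

# Toy usage with random vectors standing in for trained embeddings.
rng = np.random.default_rng(0)
w = rng.normal(size=16)
female_attrs = [rng.normal(size=16) for _ in range(3)]  # e.g., "she", "her"
male_attrs = [rng.normal(size=16) for _ in range(3)]    # e.g., "he", "him"
print(cosine_bias(w, female_attrs, male_attrs))
```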
- Information-Theoretic Bias Reduction via Causal View of Spurious Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation.
We present a novel debiasing framework against algorithmic bias that incorporates a bias regularization loss.
The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios; a simplified information-theoretic sketch appears below.
arXiv Detail & Related papers (2022-01-10T01:19:31Z)
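The paper above develops a causal, information-theoretic measurement with a bias regularization loss; the sketch below is only a simplified, non-causal proxy for the information-theoretic intuition, estimating plug-in mutual information between a protected attribute and a model prediction (the data layout and names are assumptions):
```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in mutual information (bits) between a protected attribute and a
    prediction, from observed (attribute, prediction) pairs. Zero means the
    prediction carries no information about the attribute; larger values
    indicate stronger statistical dependence, a crude bias proxy."""
    n = len(pairs)
    joint = Counter(pairs)
    p_attr = Counter(a for a, _ in pairs)
    p_pred = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log2((c / n) / ((p_attr[a] / n) * (p_pred[y] / n)))
        for (a, y), c in joint.items()
    )

# Toy usage: positive predictions skewed by gender.
data = [("f", 1)] * 30 + [("f", 0)] * 10 + [("m", 1)] * 10 + [("m", 0)] * 30
print(mutual_information(data))  # ~0.19 bits: prediction depends on attribute
```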
- Evaluating Metrics for Bias in Word Embeddings [44.14639209617701]
We formalize a bias definition based on ideas from previous work and derive conditions for bias metrics.
We propose a new metric, SAME, to address the shortcomings of existing metrics and mathematically prove that SAME behaves appropriately.
arXiv Detail & Related papers (2021-11-15T16:07:15Z)
- Assessing the Reliability of Word Embedding Gender Bias Measures [4.258396452892244]
We assess three types of reliability of word embedding gender bias measures, namely test-retest reliability, inter-rater consistency, and internal consistency.
Our findings inform better design of word embedding gender bias measures; a minimal test-retest sketch appears below.
arXiv Detail & Related papers (2021-09-10T08:23:50Z)
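Of the three reliability types named above, test-retest reliability is the simplest to sketch: score the same words with the same measure under two independently trained embedding runs and correlate the scores. Everything below (the projection-based toy measure, the noise model, all names) is an assumption for illustration:
```python
import numpy as np

def pearson(x, y):
    """Pearson correlation between two score lists."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    x, y = x - x.mean(), y - y.mean()
    return float((x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def test_retest(bias_measure, run1, run2, words):
    """Test-retest reliability: correlation of per-word bias scores across two
    independently trained embedding runs; near 1.0 means the measure is
    stable to training noise."""
    return pearson([bias_measure(run1[w]) for w in words],
                   [bias_measure(run2[w]) for w in words])

# Toy usage: two noisy re-trainings of the same random embeddings.
rng = np.random.default_rng(1)
dim, words = 8, [f"w{i}" for i in range(50)]
base = {w: rng.normal(size=dim) for w in words}
run1 = {w: v + 0.1 * rng.normal(size=dim) for w, v in base.items()}
run2 = {w: v + 0.1 * rng.normal(size=dim) for w, v in base.items()}
direction = rng.normal(size=dim)            # toy "gender direction"
measure = lambda v: float(v @ direction)    # toy bias score: projection
print(test_retest(measure, run1, run2, words))  # close to 1.0 here
```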
- What do Bias Measures Measure? [41.36968251743058]
Natural Language Processing models propagate social biases about protected attributes such as gender, race, and nationality.
To create interventions and mitigate these biases and associated harms, it is vital to be able to detect and measure such biases.
This work presents a comprehensive survey of existing bias measures in NLP as a function of the associated NLP tasks, metrics, datasets, and social biases and corresponding harms.
arXiv Detail & Related papers (2021-08-07T04:08:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.