Related papers: Analyzing Correlations Between Intrinsic and Extrinsic Bias Metrics of Static Word Embeddings With Their Measuring Biases Aligned

URL: http://arxiv.org/abs/2409.09260v1
Date: Sat, 14 Sep 2024 02:13:56 GMT
Title: Analyzing Correlations Between Intrinsic and Extrinsic Bias Metrics of Static Word Embeddings With Their Measuring Biases Aligned
Authors: Taisei Katô, Yusuke Miyao
Abstract summary: We examine the abilities of intrinsic bias metrics of static word embeddings to predict whether Natural Language Processing (NLP) systems exhibit biased behavior. A word embedding is one of the fundamental NLP technologies that represents the meanings of words through real vectors, and problematically, it also learns social biases such as stereotypes.
Score: 8.673018064714547
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We examine the abilities of intrinsic bias metrics of static word embeddings to predict whether Natural Language Processing (NLP) systems exhibit biased behavior. A word embedding is one of the fundamental NLP technologies that represents the meanings of words through real vectors, and problematically, it also learns social biases such as stereotypes. An intrinsic bias metric measures bias by examining a characteristic of vectors, while an extrinsic bias metric checks whether an NLP system trained with a word embedding is biased. A previous study found that a common intrinsic bias metric usually does not correlate with extrinsic bias metrics. However, the intrinsic and extrinsic bias metrics did not measure the same bias in most cases, which makes us question whether the lack of correlation is genuine. In this paper, we extract characteristic words from datasets of extrinsic bias metrics and use those words to compute intrinsic bias metrics, so that both kinds of metric measure the same bias when we analyze their correlations. We observed moderate to high correlations with some extrinsic bias metrics but little to no correlation with the others. This result suggests that intrinsic bias metrics can predict biased behavior in particular settings but not in others. The experiment code is available on GitHub.
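
As a rough illustration of what an intrinsic bias metric computes, the sketch below implements a WEAT-style effect size on toy 2-D vectors. The abstract does not say which intrinsic metrics the authors use, so WEAT is assumed here only as a familiar example. The function names, the toy vectors, and the choice of the sample standard deviation are all assumptions made for illustration and are not taken from the paper's code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(w, attrs_a, attrs_b):
    """Mean similarity of w to attribute set A minus its mean similarity to B."""
    mean_a = sum(cosine(w, a) for a in attrs_a) / len(attrs_a)
    mean_b = sum(cosine(w, b) for b in attrs_b) / len(attrs_b)
    return mean_a - mean_b

def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """WEAT-style effect size: difference of mean associations of the two
    target sets, divided by the standard deviation over all target words."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    all_s = s_x + s_y
    mean_all = sum(all_s) / len(all_s)
    # Sample standard deviation (n - 1); implementations differ on this choice.
    std = math.sqrt(sum((s - mean_all) ** 2 for s in all_s) / (len(all_s) - 1))
    return (sum(s_x) / len(s_x) - sum(s_y) / len(s_y)) / std

# Toy, hand-made vectors (hypothetical, not real embeddings).
targets_x = [[1.0, 0.1], [0.9, 0.2]]   # target words leaning toward attribute A
targets_y = [[0.1, 1.0], [0.2, 0.9]]   # target words leaning toward attribute B
attrs_a = [[1.0, 0.0]]
attrs_b = [[0.0, 1.0]]

effect = weat_effect_size(targets_x, targets_y, attrs_a, attrs_b)
print(effect)
```

In the setting the abstract describes, the target and attribute word lists would be the characteristic words extracted from an extrinsic metric's dataset, so that the intrinsic score and the extrinsic score concern the same bias.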

Related papers

A Causal Information-Flow Framework for Unbiased Learning-to-Rank [52.54102347581931]
In web search and recommendation systems, user clicks are widely used to train ranking models. We introduce a novel causal learning-based ranking framework that extends Unbiased Learning-to-Rank. Our method consistently reduces measured bias leakage and improves ranking performance.
arXiv Detail & Related papers (2026-01-09T07:19:35Z)
Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation [49.3814117521631]
Standard benchmarks of bias and fairness in large language models (LLMs) measure the association between social attributes implied in user prompts and short responses. We develop analogous RUTEd evaluations from three contexts of real-world use. We find that standard bias metrics have no significant correlation with the more realistic bias metrics.
arXiv Detail & Related papers (2024-02-20T01:49:15Z)
Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint. We identify the main cause of the problem as the strong correlation between a class attribute $u$ and a non-class attribute $b$. We propose to mitigate dataset bias via either weighting the objective of each sample $n$ by $\frac{1}{p(u_n \mid b_n)}$ or sampling that sample with a weight proportional to $\frac{1}{p(u_n \mid b_n)}$.
arXiv Detail & Related papers (2024-02-05T22:58:06Z)
How Gender Debiasing Affects Internal Model Representations, and Why It Matters [26.993273464725995]
We show that intrinsic bias is a better indicator of debiasing than the standard WEAT metric. Our framework provides a comprehensive perspective on bias in NLP models, which can be applied to deploy NLP systems in a more informed manner.
arXiv Detail & Related papers (2022-04-14T08:54:15Z)
The SAME score: Improved cosine based bias score for word embeddings [49.75878234192369]
We introduce SAME, a novel bias score for semantic bias in embeddings. We show that SAME is capable of measuring semantic bias and identify potential causes for social bias in downstream tasks.
arXiv Detail & Related papers (2022-03-28T09:28:13Z)
On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations [74.70957445600936]
Multiple metrics have been introduced to measure fairness in various natural language processing tasks. These metrics can be roughly categorized into two categories: 1) extrinsic metrics for evaluating fairness in downstream applications and 2) intrinsic metrics for estimating fairness in upstream language representation models.
arXiv Detail & Related papers (2022-03-25T22:17:43Z)
Information-Theoretic Bias Reduction via Causal View of Spurious Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation. We present a novel debiasing framework against the algorithmic bias, which incorporates a bias regularization loss. The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
arXiv Detail & Related papers (2022-01-10T01:19:31Z)
Evaluating Metrics for Bias in Word Embeddings [44.14639209617701]
We formalize a bias definition based on the ideas from previous works and derive conditions for bias metrics. We propose a new metric, SAME, to address the shortcomings of existing metrics and mathematically prove that SAME behaves appropriately.
arXiv Detail & Related papers (2021-11-15T16:07:15Z)
Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race. Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables. This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
Intrinsic Bias Metrics Do Not Correlate with Application Bias [12.588713044749179]
This research examines whether easy-to-measure intrinsic metrics correlate well with real-world extrinsic metrics. We measure both intrinsic and extrinsic bias across hundreds of trained models covering different tasks and experimental conditions. We advise that efforts to debias embedding spaces always be paired with measurement of downstream model bias, and suggest that the community increase effort into making downstream measurement more feasible via creation of additional challenge sets and annotated test data.
arXiv Detail & Related papers (2020-12-31T18:59:44Z)
Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases [10.713568409205077]
State-of-the-art neural language models generate dynamic word embeddings dependent on the context in which the word appears. We introduce the Contextualized Embedding Association Test (CEAT), which can summarize the magnitude of overall bias in neural language models. We develop two methods, Intersectional Bias Detection (IBD) and Emergent Intersectional Bias Detection (EIBD), to automatically identify the intersectional biases and emergent intersectional biases from static word embeddings.
arXiv Detail & Related papers (2020-06-06T19:49:50Z)
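
The inverse-probability weighting mentioned in the "Revisiting the Dataset Bias Problem from a Statistical Perspective" entry above, where sample n is weighted by 1/p(u_n|b_n), can be sketched as follows. This is a minimal sketch that assumes p(u|b) is estimated from empirical counts over discrete labels; that paper may estimate the conditional probability differently. The function name and the toy labels are hypothetical.

```python
from collections import Counter

def inverse_conditional_weights(class_labels, bias_labels):
    """Return one weight per sample equal to 1 / p(u_n | b_n), with p(u | b)
    estimated as count(u, b) / count(b) (an assumption of this sketch)."""
    joint = Counter(zip(class_labels, bias_labels))
    marginal_b = Counter(bias_labels)
    # 1 / (count(u, b) / count(b)) == count(b) / count(u, b)
    return [marginal_b[b] / joint[(u, b)]
            for u, b in zip(class_labels, bias_labels)]

# Toy data: the class attribute u is strongly correlated with the
# non-class attribute b (cats are mostly indoor, dogs mostly outdoor).
u = ["cat", "cat", "cat", "dog", "dog", "dog", "cat", "dog"]
b = ["indoor", "indoor", "indoor", "outdoor", "outdoor", "outdoor", "outdoor", "indoor"]

weights = inverse_conditional_weights(u, b)
for label_u, label_b, w in zip(u, b, weights):
    print(label_u, label_b, w)
# Bias-aligned samples (cat/indoor, dog/outdoor) get weight 4/3;
# bias-conflicting samples (cat/outdoor, dog/indoor) get weight 4.
```

With these weights, each class receives the same total weight within every value of b, which is the effect the reweighting aims for; the same values could instead be used as sampling weights.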

This list is automatically generated from the titles and abstracts of the papers on this site.