WordBias: An Interactive Visual Tool for Discovering Intersectional
Biases Encoded in Word Embeddings
- URL: http://arxiv.org/abs/2103.03598v1
- Date: Fri, 5 Mar 2021 11:04:35 GMT
- Title: WordBias: An Interactive Visual Tool for Discovering Intersectional
Biases Encoded in Word Embeddings
- Authors: Bhavya Ghai, Md Naimul Hoque, Klaus Mueller
- Abstract summary: We present WordBias, an interactive visual tool designed to explore biases against intersectional groups encoded in word embeddings.
Given a pretrained static word embedding, WordBias computes the association of each word along different groups based on race, age, etc.
- Score: 39.87681037622605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intersectional bias is a bias caused by an overlap of multiple social factors
like gender, sexuality, race, disability, religion, etc. A recent study has
shown that word embedding models can be laden with biases against
intersectional groups like African American females, etc. The first step
towards tackling such intersectional biases is to identify them. However,
discovering biases against different intersectional groups remains a
challenging task. In this work, we present WordBias, an interactive visual tool
designed to explore biases against intersectional groups encoded in static word
embeddings. Given a pretrained static word embedding, WordBias computes the
association of each word along different groups based on race, age, etc. and
then visualizes them using a novel interactive interface. Using a case study,
we demonstrate how WordBias can help uncover biases against intersectional
groups like Black Muslim Males, Poor Females, etc. encoded in word embeddings.
In addition, we evaluate our tool using qualitative feedback from expert
interviews. The source code for this tool can be publicly accessed for
reproducibility at github.com/bhavyaghai/WordBias.
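The association score described in the abstract can be illustrated with a short sketch. This is not the authors' exact implementation; it follows the common approach for static embeddings, where a bias axis is the normalized difference between the mean vectors of two pole word sets, and a word's association is its cosine similarity with that axis. The toy 2-d vectors below are purely illustrative.

```python
# Minimal sketch (not WordBias's exact code) of computing a word's
# association along a social-group axis in a static word embedding.
import numpy as np

def bias_axis(emb, pole_a, pole_b):
    """Bias direction = normalized difference of the mean vectors of two
    pole word sets (e.g. pole_a = ['he', 'man'], pole_b = ['she', 'woman'])."""
    va = np.mean([emb[w] for w in pole_a], axis=0)
    vb = np.mean([emb[w] for w in pole_b], axis=0)
    axis = va - vb
    return axis / np.linalg.norm(axis)

def association(emb, word, axis):
    """Signed association of a word with the axis (cosine similarity):
    positive leans toward pole_a, negative toward pole_b."""
    v = emb[word]
    return float(np.dot(v, axis) / np.linalg.norm(v))

# Toy 2-d embedding, for illustration only.
emb = {
    'he': np.array([1.0, 0.1]),
    'she': np.array([-1.0, 0.1]),
    'engineer': np.array([0.7, 0.7]),
}
gender = bias_axis(emb, ['he'], ['she'])
score = association(emb, 'engineer', gender)  # positive: leans toward 'he'
```

Repeating this computation over several such axes (race, age, religion, ...) gives each word a score per social category, which is what a parallel-coordinates style interface can then visualize to surface intersectional associations.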
Related papers
- What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations [62.91799637259657]
Do large language models (LLMs) exhibit sociodemographic biases, even when they decline to respond?
We study this research question by probing contextualized embeddings and exploring whether such biases are encoded in their latent representations.
We propose a logistic Bradley-Terry probe which predicts word pair preferences of LLMs from the words' hidden vectors.
arXiv Detail & Related papers (2023-11-30T18:53:13Z)
- Discovering and Mitigating Visual Biases through Keyword Explanation [66.71792624377069]
We propose the Bias-to-Text (B2T) framework, which interprets visual biases as keywords.
B2T can identify known biases, such as gender bias in CelebA, background bias in Waterbirds, and distribution shifts in ImageNet-R/C.
B2T uncovers novel biases in larger datasets, such as Dollar Street and ImageNet.
arXiv Detail & Related papers (2023-01-26T13:58:46Z)
- Debiasing Word Embeddings with Nonlinear Geometry [37.88933175338274]
This work studies biases associated with multiple social categories.
Individual biases intersect non-trivially over a one-dimensional subspace.
We then construct an intersectional subspace to debias for multiple social categories using the nonlinear geometry of individual biases.
arXiv Detail & Related papers (2022-08-29T21:40:27Z)
- Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation [59.01297461453444]
We propose a hierarchical contrastive learning mechanism, which can unify hybrid granularities semantic meaning in the input text.
Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.
arXiv Detail & Related papers (2022-05-26T13:26:03Z)
- Interactive Re-Fitting as a Technique for Improving Word Embeddings [0.0]
We make it possible for humans to adjust portions of a word embedding space by moving sets of words closer to one another.
Our approach allows users to trigger selective post-processing as they interact with and assess potential bias in word embeddings.
arXiv Detail & Related papers (2020-09-30T21:54:22Z)
- Discovering and Categorising Language Biases in Reddit [5.670038395203354]
This paper proposes a data-driven approach to automatically discover language biases encoded in the vocabulary of online discourse communities on Reddit.
We use word embeddings to transform text into high-dimensional dense vectors and capture semantic relations between words.
We successfully discover gender bias, religion bias, and ethnic bias in different Reddit communities.
arXiv Detail & Related papers (2020-08-06T16:42:10Z)
- Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases [10.713568409205077]
State-of-the-art neural language models generate dynamic word embeddings dependent on the context in which the word appears.
We introduce the Contextualized Embedding Association Test (CEAT), that can summarize the magnitude of overall bias in neural language models.
We develop two methods, Intersectional Bias Detection (IBD) and Emergent Intersectional Bias Detection (EIBD), to automatically identify the intersectional biases and emergent intersectional biases from static word embeddings.
arXiv Detail & Related papers (2020-06-06T19:49:50Z)
- Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [94.98656228690233]
We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
arXiv Detail & Related papers (2020-05-03T02:33:20Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
- Joint Multiclass Debiasing of Word Embeddings [5.1135133995376085]
We present a joint multiclass debiasing approach capable of debiasing multiple bias dimensions simultaneously.
We show that our approach can reduce or even completely eliminate bias, while maintaining meaningful relationships between vectors in word embeddings.
arXiv Detail & Related papers (2020-03-09T22:06:37Z)
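The debiasing papers in this list build on the classic hard-debias projection step: remove a word vector's component along an identified bias direction. The sketch below shows only that shared core, under illustrative toy vectors; it is not any single paper's method (Double-Hard Debias, for instance, additionally purifies the embeddings of corpus-frequency effects before projecting, and the multiclass approaches repeat the step over several bias subspaces).

```python
# Minimal sketch of the hard-debias projection step common to the
# debiasing approaches listed above: subtract a vector's component
# along a bias direction, leaving it orthogonal to that direction.
import numpy as np

def remove_component(v, direction):
    """Project v onto the (normalized) bias direction and subtract
    that component from v."""
    d = direction / np.linalg.norm(direction)
    return v - np.dot(v, d) * d

# Toy example: a 2-d "word vector" and a bias direction.
bias_dir = np.array([1.0, 0.0])
word_vec = np.array([0.6, 0.8])
debiased = remove_component(word_vec, bias_dir)
# After projection, debiased has no component along bias_dir.
```

Multiclass variants apply this projection for each social category's subspace in turn (or jointly), which is why constructing the right subspace, as in the nonlinear-geometry paper above, matters for intersectional bias.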
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.