Discovering and Interpreting Biased Concepts in Online Communities
- URL: http://arxiv.org/abs/2010.14448v2
- Date: Mon, 24 Jan 2022 23:37:26 GMT
- Title: Discovering and Interpreting Biased Concepts in Online Communities
- Authors: Xavier Ferrer-Aran, Tom van Nuenen, Natalia Criado, Jose M. Such
- Abstract summary: Language carries implicit human biases, functioning both as a reflection and a perpetuation of stereotypes that people carry with them.
ML-based NLP methods such as word embeddings have been shown to learn such language biases with striking accuracy.
This paper improves upon, extends, and evaluates our previous data-driven method to automatically discover and help interpret biased concepts encoded in word embeddings.
- Score: 5.670038395203354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language carries implicit human biases, functioning both as a reflection and
a perpetuation of stereotypes that people carry with them. Recently, ML-based
NLP methods such as word embeddings have been shown to learn such language
biases with striking accuracy. This capability of word embeddings has been
successfully exploited as a tool to quantify and study human biases. However,
previous studies only consider a predefined set of biased concepts to attest
(e.g., whether gender is more or less associated with particular jobs), or just
discover biased words without helping to understand their meaning at the
conceptual level. As such, these approaches can be either unable to find biased
concepts that have not been defined in advance, or the biases they find are
difficult to interpret and study. This could make existing approaches
unsuitable to discover and interpret biases in online communities, as such
communities may carry different biases than those in mainstream culture. This
paper improves upon, extends, and evaluates our previous data-driven method to
automatically discover and help interpret biased concepts encoded in word
embeddings. We apply this approach to study the biased concepts present in the
language used in online communities and experimentally show the validity and
stability of our method.
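The discovery step of such a data-driven method can be illustrated with a short sketch: rank every vocabulary word by how much closer its embedding sits to one attribute set than to another. This is a minimal illustration, not the authors' exact pipeline; the `vectors` mapping and the attribute word lists are assumed inputs.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def most_biased_words(vectors, set_a, set_b, top_k=20):
    """Rank vocabulary words by how much closer they lie to the centroid
    of attribute set A than to that of attribute set B."""
    centroid_a = np.mean([vectors[w] for w in set_a], axis=0)
    centroid_b = np.mean([vectors[w] for w in set_b], axis=0)
    scores = {
        w: cosine(v, centroid_a) - cosine(v, centroid_b)
        for w, v in vectors.items()
        if w not in set_a and w not in set_b
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# `vectors` would be a {word: np.ndarray} mapping trained on a
# community's text; the attribute sets below are hypothetical examples.
# biased = most_biased_words(vectors, {"she", "her", "woman"},
#                            {"he", "his", "man"})
```

The paper's contribution lies in the step after this one: grouping the discovered words into semantically coherent concepts so that the biases can be interpreted, which the sketch does not cover.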
Related papers
- Mitigating Gender Bias in Contextual Word Embeddings [1.208453901299241]
We propose a novel objective function for masked-language modeling (MLM) which largely mitigates the gender bias in contextual embeddings.
We also propose new methods for debiasing static embeddings and support them with extensive analysis and experiments.
arXiv Detail & Related papers (2024-11-18T21:36:44Z)
- Semantic Properties of cosine based bias scores for word embeddings [48.0753688775574]
We propose requirements for bias scores to be considered meaningful for quantifying biases.
We analyze cosine based scores from the literature with regard to these requirements.
We underline these findings with experiments showing that the bias scores' limitations have a practical impact in applications.
arXiv Detail & Related papers (2024-01-27T20:31:10Z)
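One of the cosine-based scores from the literature that such requirements are checked against is the WEAT effect size (Caliskan et al., 2017). A minimal sketch, again assuming a `{word: vector}` mapping:

```python
import numpy as np

def _cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def weat_effect_size(vecs, X, Y, A, B):
    """WEAT effect size: how differently target sets X and Y associate
    with attribute sets A and B, in pooled standard-deviation units."""
    def s(w):
        return (np.mean([_cos(vecs[w], vecs[a]) for a in A])
                - np.mean([_cos(vecs[w], vecs[b]) for b in B]))
    s_x = [s(x) for x in X]
    s_y = [s(y) for y in Y]
    return (np.mean(s_x) - np.mean(s_y)) / np.std(s_x + s_y, ddof=1)

# e.g. X = career words, Y = family words, A = male terms, B = female terms
```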
- An Analysis of Social Biases Present in BERT Variants Across Multiple Languages [0.0]
We investigate the bias present in monolingual BERT models across a diverse set of languages.
We propose a template-based method to measure any kind of bias, based on sentence pseudo-likelihood.
We conclude that current methods of probing for bias are highly language-dependent.
arXiv Detail & Related papers (2022-11-25T23:38:08Z)
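A template-based probe of the kind this entry describes can be sketched with a masked language model's pseudo-log-likelihood; the model name and the templates below are illustrative, not the paper's exact setup.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pseudo_log_likelihood(sentence):
    """Mask each token in turn and sum the log-probability the model
    assigns to the original token at the masked position."""
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(input_ids) - 1):  # skip [CLS] and [SEP]
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits
        total += torch.log_softmax(logits[0, i], -1)[input_ids[i]].item()
    return total

# Templates differing only in the group term; a consistent gap in
# pseudo-log-likelihood suggests the model favours one variant.
print(pseudo_log_likelihood("The engineer said he would be late."))
print(pseudo_log_likelihood("The engineer said she would be late."))
```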
- The SAME score: Improved cosine based bias score for word embeddings [49.75878234192369]
We introduce SAME, a novel bias score for semantic bias in embeddings.
We show that SAME is capable of measuring semantic bias and of identifying potential causes for social bias in downstream tasks.
arXiv Detail & Related papers (2022-03-28T09:28:13Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
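The summary does not spell out the reweighting scheme, but a common instance-reweighting approach in this spirit weights each example by P(y)P(z)/P(y,z), making the label statistically independent of the author attribute under the reweighted distribution. A hypothetical sketch:

```python
from collections import Counter

def reweight(labels, groups):
    """Per-instance weights P(y) * P(z) / P(y, z), which decorrelate the
    label y from the demographic attribute z in expectation."""
    n = len(labels)
    p_y, p_z = Counter(labels), Counter(groups)
    p_yz = Counter(zip(labels, groups))
    return [
        (p_y[y] / n) * (p_z[z] / n) / (p_yz[(y, z)] / n)
        for y, z in zip(labels, groups)
    ]

# weights = reweight(train_labels, author_genders)
# The weights are then applied per instance in the training loss.
```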
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) risk manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- Robustness and Reliability of Gender Bias Assessment in Word Embeddings: The Role of Base Pairs [23.574442657224008]
It has been shown that word embeddings can exhibit gender bias, and various methods have been proposed to quantify this.
Previous work has leveraged gender word pairs to measure bias and extract biased analogies.
We show that the reliance on these gendered pairs has strong limitations.
In particular, the well-known analogy "man is to computer-programmer as woman is to homemaker" is due to word similarity rather than societal bias.
arXiv Detail & Related papers (2020-10-06T16:09:05Z)
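The well-known analogy is conventionally computed with the 3CosAdd method over gender word pairs; a sketch using gensim, with a placeholder embedding file:

```python
from gensim.models import KeyedVectors

# Any word2vec-format embedding file works here; the path is a placeholder.
kv = KeyedVectors.load_word2vec_format("embeddings.bin", binary=True)

# 3CosAdd: find x maximising cos(x, woman - man + computer_programmer).
print(kv.most_similar(positive=["woman", "computer_programmer"],
                      negative=["man"], topn=5))
```

The paper's point is that the word returned by such a query tends to be driven by its plain similarity to the positive query words, so the famous completion says less about societal bias than is often assumed.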
- Discovering and Categorising Language Biases in Reddit [5.670038395203354]
This paper proposes a data-driven approach to automatically discover language biases encoded in the vocabulary of online discourse communities on Reddit.
We use word embeddings to transform text into high-dimensional dense vectors and capture semantic relations between words.
We successfully discover gender bias, religion bias, and ethnic bias in different Reddit communities.
arXiv Detail & Related papers (2020-08-06T16:42:10Z)
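Community-specific embeddings of this kind are typically trained with word2vec; a toy-corpus sketch using gensim (the corpus and hyperparameters are illustrative):

```python
from gensim.models import Word2Vec

# Toy stand-in corpus; in practice this would be tokenised comments
# collected from a single community (e.g., one subreddit).
comments = [
    ["she", "is", "a", "wonderful", "nurse"],
    ["he", "is", "a", "brilliant", "engineer"],
] * 200

model = Word2Vec(sentences=comments, vector_size=50, window=5,
                 min_count=1, workers=2, epochs=10)

# model.wv now maps each word to a dense vector and can be fed to
# bias-discovery code such as the ranking sketch given earlier.
print(model.wv.most_similar("nurse", topn=3))
```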
- Towards Debiasing Sentence Representations [109.70181221796469]
We show that Sent-Debias is effective in removing biases, and at the same time, preserves performance on sentence-level downstream tasks.
We hope that our work will inspire future research on characterizing and removing social biases from widely adopted sentence representations for fairer NLP.
arXiv Detail & Related papers (2020-07-16T04:22:30Z)
- Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [94.98656228690233]
We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
arXiv Detail & Related papers (2020-05-03T02:33:20Z)
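The gender-subspace removal that Double-Hard Debias refines is, at bottom, an orthogonal projection; a minimal sketch of the classic hard-debias step (the additional purification against corpus-frequency directions that this paper proposes is omitted):

```python
import numpy as np

def remove_component(w, g):
    """Project vector w onto the subspace orthogonal to direction g."""
    g = g / np.linalg.norm(g)
    return w - (w @ g) * g

# The gender direction g is commonly estimated from definitional pairs,
# e.g. g = vectors["he"] - vectors["she"], or the top principal
# component of several such difference vectors.
# debiased = {w: remove_component(v, g) for w, v in vectors.items()}
```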
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.