On Debiasing Text Embeddings Through Context Injection
- URL: http://arxiv.org/abs/2410.12874v2
- Date: Fri, 18 Oct 2024 15:26:55 GMT
- Title: On Debiasing Text Embeddings Through Context Injection
- Authors: Thomas Uriot,
- Abstract summary: We conduct a review of 19 embedding models by quantifying their biases and how well they respond to context injection.
We show that higher performing models are more prone to capturing biases, but are also better at incorporating context.
In a retrieval task, we show that biases in embeddings can lead to non-desirable outcomes.
- Score: 0.0
- License:
- Abstract: Current advances in Natural Language Processing (NLP) have made it increasingly feasible to build applications leveraging textual data. Generally, the core of these applications rely on having a good semantic representation of text into vectors, via embedding models. However, it has been shown that these embeddings capture and perpetuate biases already present in text. While a few techniques have been proposed to debias embeddings, they do not take advantage of the recent advances in context understanding of modern embedding models. In this paper, we fill this gap by conducting a review of 19 embedding models by quantifying their biases and how well they respond to context injection as a mean of debiasing. We show that higher performing models are more prone to capturing biases, but are also better at incorporating context. Surprisingly, we find that while models can easily embed affirmative semantics, they fail at embedding neutral semantics. Finally, in a retrieval task, we show that biases in embeddings can lead to non-desirable outcomes. We use our new-found insights to design a simple algorithm for top $k$ retrieval, where $k$ is dynamically selected. We show that our algorithm is able to retrieve all relevant gendered and neutral chunks.
Related papers
- Collapsed Language Models Promote Fairness [88.48232731113306]
We find that debiased language models exhibit collapsed alignment between token representations and word embeddings.
We design a principled fine-tuning method that can effectively improve fairness in a wide range of debiasing methods.
arXiv Detail & Related papers (2024-10-06T13:09:48Z) - From Lists to Emojis: How Format Bias Affects Model Alignment [67.08430328350327]
We study format biases in reinforcement learning from human feedback.
Many widely-used preference models, including human evaluators, exhibit strong biases towards specific format patterns.
We show that with a small amount of biased data, we can inject significant bias into the reward model.
arXiv Detail & Related papers (2024-09-18T05:13:18Z) - From Prejudice to Parity: A New Approach to Debiasing Large Language Model Word Embeddings [2.9324535682810886]
DeepSoftDebias is an algorithm that uses a neural network to perform'soft debiasing'
We exhaustively evaluate this algorithm across a variety of SOTA datasets, accuracy metrics, and challenging NLP tasks.
We find that DeepSoftDebias outperforms the current state-of-the-art methods at reducing bias across gender, race, and religion.
arXiv Detail & Related papers (2024-02-18T08:53:41Z) - Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or a given attributes.
We propose a novel search strategy based on the greedy search to identify the near-optimal prompt for improving the performance of in-context learning.
arXiv Detail & Related papers (2023-03-23T12:28:25Z) - Efficient and Flexible Topic Modeling using Pretrained Embeddings and
Bag of Sentences [1.8592384822257952]
We propose a novel topic modeling and inference algorithm.
We leverage pre-trained sentence embeddings by combining generative process models and clustering.
TheTailor evaluation shows that our method yields state-of-the art results with relatively little computational demands.
arXiv Detail & Related papers (2023-02-06T20:13:11Z) - Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z) - Entity Disambiguation with Entity Definitions [50.01142092276296]
Local models have recently attained astounding performances in Entity Disambiguation (ED)
Previous works limited their studies to using, as the textual representation of each candidate, only its Wikipedia title.
In this paper, we address this limitation and investigate to what extent more expressive textual representations can mitigate it.
We report a new state of the art on 2 out of 6 benchmarks we consider and strongly improve the generalization capability over unseen patterns.
arXiv Detail & Related papers (2022-10-11T17:46:28Z) - The Authors Matter: Understanding and Mitigating Implicit Bias in Deep
Text Classification [36.361778457307636]
Deep text classification models can produce biased outcomes for texts written by authors of certain demographic groups.
In this paper, we first demonstrate that implicit bias exists in different text classification tasks for different demographic groups.
We then build a learning-based interpretation method to deepen our knowledge of implicit bias.
arXiv Detail & Related papers (2021-05-06T16:17:38Z) - Debiasing Pre-trained Contextualised Embeddings [28.378270372391498]
We propose a fine-tuning method that can be applied at token- or sentence-levels to debias pre-trained contextualised embeddings.
Using gender bias as an illustrative example, we then conduct a systematic study using several state-of-the-art (SoTA) contextualised representations.
We find that applying token-level debiasing for all tokens and across all layers of a contextualised embedding model produces the best performance.
arXiv Detail & Related papers (2021-01-23T15:28:48Z) - Towards Robustifying NLI Models Against Lexical Dataset Biases [94.79704960296108]
This paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases.
First, we debias the dataset through data augmentation and enhancement, but show that the model bias cannot be fully removed via this method.
The second approach employs a bag-of-words sub-model to capture the features that are likely to exploit the bias and prevents the original model from learning these biased features.
arXiv Detail & Related papers (2020-05-10T17:56:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.