Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic
Contrast Sets
- URL: http://arxiv.org/abs/2305.15407v1
- Date: Wed, 24 May 2023 17:59:18 GMT
- Title: Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic
Contrast Sets
- Authors: Brandon Smith, Miguel Farinha, Siobhan Mackenzie Hall, Hannah Rose
Kirk, Aleksandar Shtedritski, Max Bain
- Abstract summary: Vision-language models can perpetuate and amplify societal biases learned during pre-training on uncurated image-text pairs from the internet.
COCO Captions is the most commonly used dataset for evaluating bias between background context and the gender of people in-situ.
We propose a novel dataset debiasing pipeline to augment the COCO dataset with synthetic, gender-balanced contrast sets.
- Score: 52.77024349608834
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision-language models are growing in popularity and public visibility to
generate, edit, and caption images at scale; but their outputs can perpetuate
and amplify societal biases learned during pre-training on uncurated image-text
pairs from the internet. Although debiasing methods have been proposed, we
argue that these measurements of model bias lack validity due to dataset bias.
We demonstrate there are spurious correlations in COCO Captions, the most
commonly used dataset for evaluating bias, between background context and the
gender of people in-situ. This is problematic because commonly-used bias
metrics (such as Bias@K) rely on per-gender base rates. To address this issue,
we propose a novel dataset debiasing pipeline to augment the COCO dataset with
synthetic, gender-balanced contrast sets, where only the gender of the subject
is edited and the background is fixed. However, existing image editing methods
have limitations and sometimes produce low-quality images, so we introduce a
method to automatically filter the generated images based on their similarity
to real images. Using our balanced synthetic contrast sets, we benchmark bias
in multiple CLIP-based models, demonstrating how metrics are skewed by
imbalance in the original COCO images. Our results indicate that the proposed
approach improves the validity of the evaluation, ultimately contributing to
a more realistic understanding of bias in vision-language models.
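For concreteness, one common formulation of the Bias@K retrieval metric mentioned above counts how many of the top-K images retrieved for a gender-neutral query depict men versus women. The sketch below illustrates that formulation only; the paper's exact definition may differ, and all function and variable names are our own.

```python
from typing import List, Sequence


def bias_at_k(retrieved_genders: Sequence[str], k: int) -> float:
    """One common formulation of Bias@K for a single gender-neutral query.

    `retrieved_genders` lists the gender label ("male", "female", or "other")
    of each retrieved image, ordered by retrieval score. The score is the
    normalized difference between male- and female-labelled images among the
    top-K results; 0 means the top-K results are gender-balanced.
    """
    top_k = list(retrieved_genders)[:k]
    n_male = sum(g == "male" for g in top_k)
    n_female = sum(g == "female" for g in top_k)
    if n_male + n_female == 0:
        return 0.0
    return (n_male - n_female) / (n_male + n_female)


def mean_bias_at_k(per_query_genders: List[Sequence[str]], k: int) -> float:
    """Average the per-query scores over a set of gender-neutral queries."""
    return sum(bias_at_k(g, k) for g in per_query_genders) / len(per_query_genders)


# With an imbalanced retrieval pool (as argued for the original COCO images),
# even a random ranking produces a non-zero score:
print(bias_at_k(["male", "male", "female", "male", "other"], k=5))  # 0.5
```

Because the score depends on how many images of each gender are available to be retrieved, an imbalanced evaluation pool skews Bias@K even for an unbiased model; this is the validity problem the gender-balanced contrast sets are designed to remove.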
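The abstract also mentions automatically filtering low-quality synthetic edits by their similarity to real images, without specifying the criterion. The sketch below is an assumption of how such a filter could look: it compares CLIP image embeddings (via the open_clip library) of each edited image and its real source and keeps pairs above a cosine-similarity threshold. The model choice, threshold, and helper names are hypothetical, not the paper's configuration.

```python
import torch
import open_clip
from PIL import Image

# Model name and pretrained weights are an illustrative choice.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="openai"
)
model.eval()


@torch.no_grad()
def image_embedding(path: str) -> torch.Tensor:
    """Return an L2-normalized CLIP embedding for a single image file."""
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    features = model.encode_image(image)
    return features / features.norm(dim=-1, keepdim=True)


@torch.no_grad()
def keep_synthetic(real_path: str, synthetic_path: str, threshold: float = 0.8) -> bool:
    """Keep a gender-edited image only if it stays close to its real source.

    Cosine similarity between the source photo and the edited photo is a
    cheap proxy for edit quality: badly corrupted edits drift far away.
    The threshold value here is hypothetical.
    """
    similarity = (image_embedding(real_path) * image_embedding(synthetic_path)).sum().item()
    return similarity >= threshold


# Usage: filter (real, synthetic) pairs before assembling the contrast sets.
# pairs = [("coco/000000000139.jpg", "edits/000000000139_woman.jpg"), ...]
# kept = [pair for pair in pairs if keep_synthetic(*pair)]
```

A threshold of this kind trades recall for quality: raising it discards more edits but keeps the contrast sets closer to the distribution of real COCO images.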
Related papers
- GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models [75.04426753720553]
We propose a framework to identify, quantify, and explain biases in an open set setting.
This pipeline leverages a Large Language Model (LLM) to propose biases starting from a set of captions.
We show two variations of this framework: OpenBias and GradBias.
arXiv Detail & Related papers (2024-08-29T16:51:07Z)
- Mitigating Test-Time Bias for Fair Image Retrieval [18.349154934096784]
We address the challenge of generating fair and unbiased image retrieval results given neutral textual queries.
We introduce a straightforward technique, Post-hoc Bias Mitigation, that post-processes the outputs from the pre-trained vision-language model.
Our approach achieves the lowest bias in text-based image retrieval results, compared with various existing bias-mitigation methods.
arXiv Detail & Related papers (2023-05-23T21:31:16Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models (a generic sketch of the projection idea appears after this list).
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- MABEL: Attenuating Gender Bias using Textual Entailment Data [20.489427903240017]
We propose MABEL, an intermediate pre-training approach for mitigating gender bias in contextualized representations.
Key to our approach is the use of a contrastive learning objective on counterfactually augmented, gender-balanced entailment pairs.
We show that MABEL outperforms previous task-agnostic debiasing approaches in terms of fairness.
arXiv Detail & Related papers (2022-10-26T18:36:58Z)
- To Find Waldo You Need Contextual Cues: Debiasing Who's Waldo [53.370023611101175]
We present a debiased dataset for the Person-centric Visual Grounding (PCVG) task first proposed by Cui et al.
Given an image and a caption, PCVG requires pairing up a person's name mentioned in a caption with a bounding box that points to the person in the image.
We find that the original Who's Waldo dataset contains a large number of biased samples that are solvable simply by heuristic methods.
arXiv Detail & Related papers (2022-03-30T21:35:53Z)
- BiaSwap: Removing dataset bias with bias-tailored swapping augmentation [20.149645246997668]
Deep neural networks often make decisions based on the spurious correlations inherent in the dataset, failing to generalize in an unbiased data distribution.
This paper proposes a novel bias-tailored augmentation-based approach, BiaSwap, for learning debiased representation without requiring supervision on the bias type.
arXiv Detail & Related papers (2021-08-23T08:35:26Z)
- Unravelling the Effect of Image Distortions for Biased Prediction of Pre-trained Face Recognition Models [86.79402670904338]
We evaluate the performance of four state-of-the-art deep face recognition models in the presence of image distortions.
We observe that image distortions are related to the performance gap of the model across different subgroups.
arXiv Detail & Related papers (2021-08-14T16:49:05Z)
- Evaluating and Mitigating Bias in Image Classifiers: A Causal Perspective Using Counterfactuals [27.539001365348906]
We present a method for generating counterfactuals by incorporating a structural causal model (SCM) in an improved variant of Adversarially Learned Inference (ALI).
We show how to explain a pre-trained machine learning classifier, evaluate its bias, and mitigate the bias using a counterfactual regularizer.
arXiv Detail & Related papers (2020-09-17T13:19:31Z)
- Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z)
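As referenced in the "Debiasing Vision-Language Models via Biased Prompts" entry above, projecting out biased directions from text embeddings can be sketched with plain linear algebra. The snippet below shows the generic idea only, not the paper's calibrated projection: it estimates a gender direction from paired prompt embeddings and removes it with an orthogonal projection. All names and the prompt-pairing scheme are illustrative assumptions.

```python
import numpy as np


def gender_direction(male_embs: np.ndarray, female_embs: np.ndarray) -> np.ndarray:
    """Estimate a single gendered direction from paired prompt embeddings.

    Both inputs have shape (num_prompt_pairs, dim); row i of each array is the
    text embedding of a matched prompt pair such as "a photo of a man cooking"
    / "a photo of a woman cooking".
    """
    diff = male_embs - female_embs
    direction = diff.mean(axis=0)
    return direction / np.linalg.norm(direction)


def project_out(embeddings: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove the component along `direction` from each row embedding.

    Applies the orthogonal projection I - d d^T, then re-normalizes so that
    cosine similarities against image embeddings remain comparable.
    """
    projection = np.eye(direction.shape[0]) - np.outer(direction, direction)
    debiased = embeddings @ projection
    return debiased / np.linalg.norm(debiased, axis=-1, keepdims=True)


# Usage with a hypothetical text encoder:
# queries = project_out(text_encoder(captions), gender_direction(male_embs, female_embs))
```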