Leveraging Diffusion Perturbations for Measuring Fairness in Computer
Vision
- URL: http://arxiv.org/abs/2311.15108v2
- Date: Sun, 11 Feb 2024 06:07:19 GMT
- Title: Leveraging Diffusion Perturbations for Measuring Fairness in Computer
Vision
- Authors: Nicholas Lui, Bryan Chia, William Berrios, Candace Ross, Douwe Kiela
- Abstract summary: We demonstrate that diffusion models can be leveraged to create such a dataset.
We benchmark several vision-language models on a multi-class occupation classification task.
We find that images generated with non-Caucasian labels have a significantly higher occupation misclassification rate than images generated with Caucasian labels.
- Score: 25.414154497482162
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computer vision models have been known to encode harmful biases, leading to
the potentially unfair treatment of historically marginalized groups, such as
people of color. However, there remains a lack of datasets balanced along
demographic traits that can be used to evaluate the downstream fairness of
these models. In this work, we demonstrate that diffusion models can be
leveraged to create such a dataset. We first use a diffusion model to generate
a large set of images depicting various occupations. Subsequently, each image
is edited using inpainting to generate multiple variants, where each variant
refers to a different perceived race. Using this dataset, we benchmark several
vision-language models on a multi-class occupation classification task. We find
that images generated with non-Caucasian labels have a significantly higher
occupation misclassification rate than images generated with Caucasian labels,
and that several misclassifications are suggestive of racial biases. We measure
a model's downstream fairness by computing the standard deviation in the
probability of predicting the true occupation label across the different
perceived identity groups. Using this fairness metric, we find significant
disparities between the evaluated vision-and-language models. We hope that our
work demonstrates the potential value of diffusion methods for fairness
evaluations.
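The fairness metric described in the abstract, the standard deviation of the probability of predicting the true occupation label across perceived identity groups, can be sketched as follows. This is a minimal illustration, not the authors' code; the function name, data layout, and toy numbers are assumptions.

```python
import numpy as np

def fairness_disparity(true_label_probs: dict) -> float:
    """Standard deviation, across perceived identity groups, of the mean
    probability the model assigns to the true occupation label.

    true_label_probs maps each group label to the per-image probabilities
    of the correct occupation for that group's inpainted variants.
    """
    group_means = [np.mean(p) for p in true_label_probs.values()]
    # A lower value means the model is similarly confident across
    # groups, i.e. smaller downstream disparity.
    return float(np.std(group_means))

# Toy example (hypothetical numbers): a model that is less confident
# on non-Caucasian variants yields a nonzero disparity.
probs = {
    "caucasian": [0.90, 0.85, 0.88],
    "black":     [0.70, 0.65, 0.72],
    "asian":     [0.75, 0.73, 0.74],
}
disparity = fairness_disparity(probs)
```

Under this reading, a perfectly group-invariant model would score 0, and larger values indicate larger gaps between identity groups.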
Related papers
- Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, on which we demonstrate the racial bias of public state-of-the-art (SOTA) methods.
Unlike existing forgery detection datasets, the self-constructed FairFD dataset contains a balanced racial ratio and diverse forgery-generated images with the largest-scale subjects.
We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results.
arXiv Detail & Related papers (2024-07-19T14:53:18Z) - Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers? [39.31679737754048]
We show that, in the eyes of classifiers parameterized by neural networks, the strongest diffusion models are still far from this goal.
Our comprehensive empirical study suggests that, unlike humans, classifiers tend to classify images through edge and high-frequency components.
arXiv Detail & Related papers (2024-05-28T10:25:06Z) - Classes Are Not Equal: An Empirical Study on Image Recognition Fairness [100.36114135663836]
We experimentally demonstrate that classes are not equal and the fairness issue is prevalent for image classification models across various datasets.
Our findings reveal that models tend to exhibit greater prediction biases for classes that are more challenging to recognize.
Data augmentation and representation learning algorithms improve overall performance by promoting fairness to some degree in image classification.
arXiv Detail & Related papers (2024-02-28T07:54:50Z) - Improving Fairness using Vision-Language Driven Image Augmentation [60.428157003498995]
Fairness is crucial when training a deep-learning discriminative model, especially in the facial domain.
Models tend to correlate specific characteristics (such as age and skin color) with unrelated attributes (downstream tasks).
This paper proposes a method to mitigate these correlations to improve fairness.
arXiv Detail & Related papers (2023-11-02T19:51:10Z) - Evaluating the Fairness of Discriminative Foundation Models in Computer
Vision [51.176061115977774]
We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Image Pretraining (CLIP).
We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy.
Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, image retrieval and image captioning.
arXiv Detail & Related papers (2023-10-18T10:32:39Z) - FACET: Fairness in Computer Vision Evaluation Benchmark [21.862644380063756]
Computer vision models have known performance disparities across attributes such as gender and skin tone.
We present a new benchmark named FACET (FAirness in Computer Vision EvaluaTion).
FACET is a large, publicly available evaluation set of 32k images for some of the most common vision tasks.
arXiv Detail & Related papers (2023-08-31T17:59:48Z) - Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification.
Our generative approach to classification attains strong results on a variety of benchmarks.
Our results are a step toward using generative over discriminative models for downstream tasks.
arXiv Detail & Related papers (2023-03-28T17:59:56Z) - DeAR: Debiasing Vision-Language Models with Additive Residuals [5.672132510411465]
Large pre-trained vision-language models (VLMs) provide rich, adaptable image and text representations.
These models suffer from societal biases owing to the skewed distribution of various identity groups in the training data.
We present DeAR, a novel debiasing method that learns additive residual image representations to offset the original representations.
arXiv Detail & Related papers (2023-03-18T14:57:43Z) - DualFair: Fair Representation Learning at Both Group and Individual
Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
Our model jointly optimizes for two fairness criteria: group fairness and counterfactual fairness.
arXiv Detail & Related papers (2023-03-15T07:13:54Z) - Understanding Gender and Racial Disparities in Image Recognition Models [0.0]
We investigate using a softmax loss with cross-entropy, instead of a binary cross-entropy loss, on a multi-label classification problem.
We use the MR2 dataset to evaluate the fairness in the model outcomes and try to interpret the mistakes by looking at model activations and suggest possible fixes.
arXiv Detail & Related papers (2021-07-20T01:05:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.