Gender Artifacts in Visual Datasets
- URL: http://arxiv.org/abs/2206.09191v3
- Date: Mon, 18 Sep 2023 02:25:41 GMT
- Title: Gender Artifacts in Visual Datasets
- Authors: Nicole Meister, Dora Zhao, Angelina Wang, Vikram V. Ramaswamy, Ruth
Fong, Olga Russakovsky
- Abstract summary: We investigate what $\textit{gender artifacts}$ exist within large-scale visual datasets.
We find that gender artifacts are ubiquitous in the COCO and OpenImages datasets.
We claim that attempts to remove gender artifacts from such datasets are largely infeasible.
- Score: 34.74191865400569
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gender biases are known to exist within large-scale visual datasets and can
be reflected or even amplified in downstream models. Many prior works have
proposed methods for mitigating gender biases, often by attempting to remove
gender expression information from images. To understand the feasibility and
practicality of these approaches, we investigate what $\textit{gender
artifacts}$ exist within large-scale visual datasets. We define a
$\textit{gender artifact}$ as a visual cue that is correlated with gender,
focusing specifically on those cues that are learnable by a modern image
classifier and have an interpretable human corollary. Through our analyses, we
find that gender artifacts are ubiquitous in the COCO and OpenImages datasets,
occurring everywhere from low-level information (e.g., the mean value of the
color channels) to the higher-level composition of the image (e.g., pose and
location of people). Given the prevalence of gender artifacts, we claim that
attempts to remove gender artifacts from such datasets are largely infeasible.
Instead, the responsibility lies with researchers and practitioners to be aware
that the distribution of images within datasets is highly gendered and hence
develop methods which are robust to these distributional shifts across groups.
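As a concrete illustration of how a low-level gender artifact can be probed, the sketch below fits a linear classifier on nothing but the mean value of each color channel and reports held-out AUC. This is a minimal sketch in the spirit of the paper's analysis, not the authors' code; `load_person_images()` is a hypothetical helper assumed to return per-person image crops and binary gender-expression labels from a dataset such as COCO.

```python
# Minimal sketch (not the authors' code): test whether a trivial cue --
# the mean value of each color channel -- already carries predictive
# signal for a dataset's gender-expression labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

def mean_rgb_features(images):
    # One 3-dimensional feature vector per image: mean of R, G, B channels.
    return np.stack([img.reshape(-1, 3).mean(axis=0) for img in images])

# Hypothetical loader: list of HxWx3 arrays and parallel binary labels.
images, labels = load_person_images()
X = mean_rgb_features(images)
y = np.asarray(labels)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(f"AUC from mean color channels alone: {auc:.3f}")
```

An AUC noticeably above 0.5 from a three-dimensional feature would indicate that even this trivial cue is correlated with the gender labels, which is the sense in which the paper argues such artifacts are infeasible to remove.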
Related papers
- GenderBias-\emph{VL}: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models (LVLMs).
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z)
- Stable Diffusion Exposed: Gender Bias from Prompt to Image [25.702257177921048]
This paper introduces an evaluation protocol that analyzes the impact of gender indicators at every step of the generation process on Stable Diffusion images.
Our findings include the existence of differences in the depiction of objects, such as instruments tailored for specific genders, and shifts in overall layouts.
arXiv Detail & Related papers (2023-12-05T10:12:59Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Auditing Gender Presentation Differences in Text-to-Image Models [54.16959473093973]
We study how gender is presented differently in text-to-image models.
By probing gender indicators in the input text, we quantify the frequency differences of presentation-centric attributes.
We propose an automatic method to estimate such differences.
arXiv Detail & Related papers (2023-02-07T18:52:22Z)
- Gender Stereotyping Impact in Facial Expression Recognition [1.5340540198612824]
In recent years, machine learning-based models have become the most popular approach to Facial Expression Recognition (FER).
In publicly available FER datasets, apparent gender representation is usually roughly balanced overall, but the representation within individual labels is not.
We generate derivative datasets with different amounts of stereotypical bias by altering the gender proportions of certain labels.
We observe a discrepancy of up to $29\%$ in the recognition of certain emotions between genders under the worst bias conditions.
arXiv Detail & Related papers (2022-10-11T10:52:23Z)
- Are Gender-Neutral Queries Really Gender-Neutral? Mitigating Gender Bias in Image Search [8.730027941735804]
We study a unique gender bias in image search.
Search results are often gender-imbalanced even for gender-neutral natural language queries.
We introduce two novel debiasing approaches.
arXiv Detail & Related papers (2021-09-12T04:47:33Z)
- Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.