Mitigating Gender Bias in Captioning Systems
- URL: http://arxiv.org/abs/2006.08315v7
- Date: Tue, 20 Apr 2021 21:48:29 GMT
- Title: Mitigating Gender Bias in Captioning Systems
- Authors: Ruixiang Tang, Mengnan Du, Yuening Li, Zirui Liu, Na Zou, Xia Hu
- Abstract summary: Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
- Score: 56.25457065032423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image captioning has made substantial progress with huge supporting image
collections sourced from the web. However, recent studies have pointed out that
captioning datasets, such as COCO, contain gender bias found in web corpora. As
a result, captioning models can rely heavily on learned priors and image
context for gender identification, leading to incorrect and even offensive
predictions. To encourage models to learn correct gender features, we reorganize the
COCO dataset and present two new splits, the COCO-GB V1 and V2 datasets, in which the
train and test sets have different gender-context joint distributions. Models
relying on contextual cues suffer large gender prediction errors on
the anti-stereotypical test data. Benchmarking experiments reveal that most
captioning models learn gender bias, leading to high gender prediction errors,
especially for women. To alleviate the unwanted bias, we propose a new Guided
Attention Image Captioning model (GAIC) which provides self-guidance on visual
attention to encourage the model to capture correct gender visual evidence.
Experimental results validate that GAIC significantly reduces gender
prediction errors while maintaining competitive caption quality. Our code and the
designed benchmark datasets are available at
https://github.com/datamllab/Mitigating_Gender_Bias_In_Captioning_System.
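For a concrete sense of the evaluation the abstract describes, the sketch below infers the gender mentioned in each generated caption and measures the error rate on an anti-stereotypical split. This is a minimal illustration, not the released COCO-GB protocol; the gendered word lists, the caption/label interface, and the example data are assumptions made for this sketch (the official splits and metrics are in the repository linked above).
```python
# Minimal sketch (not the official COCO-GB evaluation): given generated captions
# and ground-truth gender labels, flag captions that name the wrong gender.
# The word lists and the captions/labels inputs are illustrative assumptions.

MASCULINE = {"man", "men", "boy", "boys", "male", "he", "his"}
FEMININE = {"woman", "women", "girl", "girls", "female", "she", "her"}

def predicted_gender(caption: str) -> str:
    """Infer the gender mentioned in a caption via simple word matching."""
    tokens = set(caption.lower().split())
    has_m, has_f = bool(tokens & MASCULINE), bool(tokens & FEMININE)
    if has_m and not has_f:
        return "male"
    if has_f and not has_m:
        return "female"
    return "neutral"  # no gendered word, or conflicting evidence

def gender_error_rate(captions, labels):
    """Fraction of images whose caption names the wrong gender (neutral ignored)."""
    wrong = sum(
        1
        for cap, lab in zip(captions, labels)
        if predicted_gender(cap) not in ("neutral", lab)
    )
    return wrong / max(len(labels), 1)

if __name__ == "__main__":
    # Evaluating separately on stereotypical vs. anti-stereotypical splits exposes
    # reliance on contextual priors: a larger gap indicates stronger learned bias.
    anti_caps = ["a woman riding a skateboard", "a man holding a handbag"]
    anti_labels = ["female", "male"]
    print("anti-stereotypical error:", gender_error_rate(anti_caps, anti_labels))
```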
Related papers
- Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets [52.77024349608834]
Vision-language models can perpetuate and amplify societal biases learned during pre-training on uncurated image-text pairs from the internet.
COCO Captions is the most commonly used dataset for evaluating bias between background context and the gender of people in-situ.
We propose a novel dataset debiasing pipeline to augment the COCO dataset with synthetic, gender-balanced contrast sets.
arXiv Detail & Related papers (2023-05-24T17:59:18Z)
- Model-Agnostic Gender Debiased Image Captioning [29.640940966944697]
Image captioning models are known to perpetuate and amplify harmful societal bias in the training set.
We propose a framework, called LIBRA, that learns from synthetically biased samples to decrease both types of biases.
arXiv Detail & Related papers (2023-04-07T15:30:49Z)
- Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting [88.83117372793737]
Forgetting information in the original training data may damage the model's downstream performance by a large margin.
We propose GEnder Equality Prompt (GEEP) to improve gender fairness of pre-trained models with less forgetting.
arXiv Detail & Related papers (2021-10-11T15:52:16Z)
- Collecting a Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation [10.542861450223128]
We find grammatical patterns indicating stereotypical and non-stereotypical gender-role assignments in corpora from three domains.
We manually verify the quality of our corpus and use it to evaluate gender bias in various coreference resolution and machine translation models.
arXiv Detail & Related papers (2021-09-08T18:14:11Z)
- First the worst: Finding better gender translations during beam search [19.921216907778447]
We focus on gender bias resulting from systematic errors in grammatical gender translation.
We experiment with reranking n-best lists using gender features obtained automatically from the source sentence.
We find that a combination of these techniques allows large gains in WinoMT accuracy without requiring additional bilingual data or an additional NMT model.
arXiv Detail & Related papers (2021-04-15T12:53:30Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large-scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)