Multi-Dimensional Gender Bias Classification
- URL: http://arxiv.org/abs/2005.00614v1
- Date: Fri, 1 May 2020 21:23:20 GMT
- Title: Multi-Dimensional Gender Bias Classification
- Authors: Emily Dinan, Angela Fan, Ledell Wu, Jason Weston, Douwe Kiela, Adina Williams
- Abstract summary: Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
- Score: 67.65551687580552
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are trained to find patterns in data. NLP models can
inadvertently learn socially undesirable patterns when training on gender
biased text. In this work, we propose a general framework that decomposes
gender bias in text along several pragmatic and semantic dimensions: bias from
the gender of the person being spoken about, bias from the gender of the person
being spoken to, and bias from the gender of the speaker. Using this
fine-grained framework, we automatically annotate eight large scale datasets
with gender information. In addition, we collect a novel, crowdsourced
evaluation benchmark of utterance-level gender rewrites. Distinguishing between
gender bias along multiple dimensions is important, as it enables us to train
finer-grained gender bias classifiers. We show our classifiers prove valuable
for a variety of important applications, such as controlling for gender bias in
generative models, detecting gender bias in arbitrary text, and shedding light on
offensive language in terms of genderedness.
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z) - GenderBias-\emph{VL}: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-emphVL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z) - Are Models Biased on Text without Gender-related Language? [14.931375031931386]
We introduce UnStereoEval (USE), a novel framework for investigating gender bias in stereotype-free scenarios.
USE defines a sentence-level score based on pretraining data statistics to determine if the sentence contains minimal word-gender associations.
We find low fairness across all 28 tested models, suggesting that bias does not solely stem from the presence of gender-related words.
arXiv Detail & Related papers (2024-05-01T15:51:15Z) - Gender Inflected or Bias Inflicted: On Using Grammatical Gender Cues for
Bias Evaluation in Machine Translation [0.0]
We use Hindi as the source language and construct two sets of gender-specific sentences to evaluate different Hindi-English (HI-EN) NMT systems.
Our work highlights the importance of considering the nature of language when designing such extrinsic bias evaluation datasets.
arXiv Detail & Related papers (2023-11-07T07:09:59Z) - ''Fifty Shades of Bias'': Normative Ratings of Gender Bias in GPT
Generated English Text [11.085070600065801]
Language serves as a powerful tool for the manifestation of societal belief systems.
Gender bias is one of the most pervasive biases in our society.
We create the first dataset of GPT-generated English text with normative ratings of gender bias.
arXiv Detail & Related papers (2023-10-26T14:34:06Z) - How To Build Competitive Multi-gender Speech Translation Models For
Controlling Speaker Gender Translation [21.125217707038356]
When translating from notional gender languages into grammatical gender languages, the generated translation requires explicit gender assignments for various words, including those referring to the speaker.
To avoid such biased and non-inclusive behaviors, the gender assignment of speaker-related expressions should be guided by externally provided metadata about the speaker's gender.
This paper aims to achieve the same results by integrating the speaker's gender metadata into a single "multi-gender" neural ST model, which is easier to maintain.
arXiv Detail & Related papers (2023-10-23T17:21:32Z) - Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender
Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z) - VisoGender: A dataset for benchmarking gender bias in image-text pronoun
resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z) - "I'm fully who I am": Towards Centering Transgender and Non-Binary
Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life.
We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation.
We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z) - Gender Bias in Text: Labeled Datasets and Lexicons [0.30458514384586394]
There is a lack of gender bias datasets and lexicons for automating the detection of gender bias.
We provide labeled datasets and exhaustive lexicons by collecting, annotating, and augmenting relevant sentences.
The released datasets and lexicons span multiple bias subtypes including: Generic He, Generic She, Explicit Marking of Sex, and Gendered Neologisms.
arXiv Detail & Related papers (2022-01-21T12:44:51Z)