Comparing Intrinsic Gender Bias Evaluation Measures without using Human
Annotated Examples
- URL: http://arxiv.org/abs/2301.12074v1
- Date: Sat, 28 Jan 2023 03:11:50 GMT
- Title: Comparing Intrinsic Gender Bias Evaluation Measures without using Human
Annotated Examples
- Authors: Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki
- Abstract summary: We propose a method to compare intrinsic gender bias evaluation measures without relying on human-annotated examples.
Specifically, we create bias-controlled versions of language models using varying amounts of male vs. female gendered sentences.
We then compute the rank correlation between the resulting bias scores and the gender proportions used to fine-tune the PLMs.
- Score: 33.044775876807826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Numerous types of social biases have been identified in pre-trained language
models (PLMs), and various intrinsic bias evaluation measures have been
proposed for quantifying those social biases. Prior works have relied on
human-annotated examples to compare existing intrinsic bias evaluation measures.
However, this approach is neither easily adaptable to different languages nor
amenable to large-scale evaluations, owing to the cost and difficulty of
recruiting human annotators. To overcome this limitation, we propose a method
to compare intrinsic gender bias evaluation measures without relying on
human-annotated examples. Specifically, we create multiple bias-controlled
versions of PLMs using varying amounts of male vs. female gendered sentences,
mined automatically from an unannotated corpus using gender-related word lists.
Next, each bias-controlled PLM is evaluated using an intrinsic bias evaluation
measure, and the rank correlation between the computed bias scores and the
gender proportions used to fine-tune the PLMs is computed. Experiments on
multiple corpora and PLMs consistently show that the correlations reported by our
proposed method, which does not require human-annotated examples, are comparable
to those computed using human-annotated examples in prior work.
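To make the protocol concrete, here is a minimal sketch of how the comparison could be implemented. The gender word lists are truncated examples, and `finetune` and `bias_score` stand in for whatever fine-tuning routine and intrinsic bias measure are being compared; none of these names come from the paper's released code.

```python
# Minimal sketch of the proposed comparison protocol (illustrative only).
# Assumptions: `finetune` fine-tunes a copy of the PLM on the given sentences,
# and `bias_score` is any intrinsic bias evaluation measure under comparison.
from scipy.stats import spearmanr

MALE_WORDS = {"he", "him", "his", "man", "men", "male"}
FEMALE_WORDS = {"she", "her", "hers", "woman", "women", "female"}

def mine_gendered_sentences(corpus):
    """Split an unannotated corpus into male- and female-gendered sentences
    using gender-related word lists (sentences containing both are skipped)."""
    male, female = [], []
    for sent in corpus:
        tokens = set(sent.lower().split())
        has_m, has_f = bool(tokens & MALE_WORDS), bool(tokens & FEMALE_WORDS)
        if has_m and not has_f:
            male.append(sent)
        elif has_f and not has_m:
            female.append(sent)
    return male, female

def compare_measure(plm, corpus, finetune, bias_score,
                    male_ratios=(0.0, 0.25, 0.5, 0.75, 1.0), n=10_000):
    """Rank correlation between an intrinsic bias measure's scores and the
    male/female proportions used to build the bias-controlled PLMs."""
    male, female = mine_gendered_sentences(corpus)
    scores = []
    for r in male_ratios:
        k = int(n * r)
        train = male[:k] + female[:n - k]          # gender-proportioned data
        controlled_plm = finetune(plm, train)      # bias-controlled PLM
        scores.append(bias_score(controlled_plm))  # intrinsic bias score
    rho, _ = spearmanr(male_ratios, scores)
    return rho
```

Under this sketch, a measure whose scores vary monotonically with the injected gender proportion receives a rank correlation near 1 (or -1), which is how competing measures can be ranked without any human-annotated examples.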
Related papers
- Measuring Social Biases in Masked Language Models by Proxy of Prediction
Quality [0.0]
Social and political scientists often aim to discover and measure distinct biases from text data representations (embeddings).
In this paper, we evaluate the social biases encoded by transformers trained with a masked language modeling objective.
We find that, based on our methods, the proposed measures produce more accurate estimates of transformers' relative preference for biased sentences than other measures.
arXiv Detail & Related papers (2024-02-21T17:33:13Z) - Using Natural Language Explanations to Rescale Human Judgments [81.66697572357477]
We propose a method to rescale ordinal annotations and explanations using large language models (LLMs).
We feed annotators' Likert ratings and corresponding explanations into an LLM and prompt it to produce a numeric score anchored in a scoring rubric.
Our method rescales the raw judgments without impacting agreement and brings the scores closer to human judgments grounded in the same scoring rubric.
arXiv Detail & Related papers (2023-05-24T06:19:14Z) - Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z) - Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous
Pronouns [53.62845317039185]
Bias-measuring datasets play a critical role in detecting biased behavior of language models.
We propose a novel method to collect diverse, natural, and minimally distant text pairs via counterfactual generation.
We show that four pre-trained language models are significantly more inconsistent across different gender groups than within each group.
arXiv Detail & Related papers (2023-02-11T12:11:03Z) - MABEL: Attenuating Gender Bias using Textual Entailment Data [20.489427903240017]
We propose MABEL, an intermediate pre-training approach for mitigating gender bias in contextualized representations.
Key to our approach is the use of a contrastive learning objective on counterfactually augmented, gender-balanced entailment pairs.
We show that MABEL outperforms previous task-agnostic debiasing approaches in terms of fairness.
arXiv Detail & Related papers (2022-10-26T18:36:58Z) - Social Biases in Automatic Evaluation Metrics for NLG [53.76118154594404]
We propose an evaluation method based on the Word Embeddings Association Test (WEAT) and the Sentence Embeddings Association Test (SEAT) to quantify social biases in evaluation metrics; a brief WEAT sketch appears after this list.
We construct gender-swapped meta-evaluation datasets to explore the potential impact of gender bias in image captioning and text summarization tasks.
arXiv Detail & Related papers (2022-10-17T08:55:26Z) - Adversarial Examples Generation for Reducing Implicit Gender Bias in
Pre-trained Models [2.6329024988388925]
We propose a method to automatically generate implicit gender bias samples at the sentence level and a metric to measure gender bias.
The metric is used to guide the generation of examples from pre-trained models; those examples can then be used to mount attacks on pre-trained models.
arXiv Detail & Related papers (2021-10-03T20:22:54Z) - Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z) - Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by
Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes.
GSR is the first specifically tailored measure for Information Retrieval, capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z)
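As referenced in the "Social Biases in Automatic Evaluation Metrics for NLG" entry above, the following is a compact sketch of the standard WEAT effect size (Caliskan et al., 2017) that the entry builds on; the `emb` word-to-vector lookup and the target/attribute word sets are illustrative assumptions, not artifacts of that paper. SEAT applies the same statistic to sentence embeddings of templated sentences.

```python
# Sketch of the WEAT effect size between target sets X, Y and attribute sets A, B.
# `emb` is assumed to map a word to a vector (e.g. a dict of numpy arrays).
import numpy as np

def _cos(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def _assoc(w, A, B, emb):
    """s(w, A, B): mean cosine similarity of w to attribute set A minus to B."""
    return (np.mean([_cos(emb[w], emb[a]) for a in A])
            - np.mean([_cos(emb[w], emb[b]) for b in B]))

def weat_effect_size(X, Y, A, B, emb):
    """d = (mean_x s(x,A,B) - mean_y s(y,A,B)) / std_{w in X∪Y} s(w,A,B)."""
    sx = [_assoc(x, A, B, emb) for x in X]
    sy = [_assoc(y, A, B, emb) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)
```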
This list is automatically generated from the titles and abstracts of the papers on this site.