Related papers: Controlled Analyses of Social Biases in Wikipedia Bios

URL: http://arxiv.org/abs/2101.00078v1
Date: Thu, 31 Dec 2020 21:27:12 GMT
Title: Controlled Analyses of Social Biases in Wikipedia Bios
Authors: Anjalie Field, Chan Young Park, Yulia Tsvetkov
Abstract summary: We present a methodology for reducing the effects of confounding variables in analyses of Wikipedia biography pages. We evaluate our methodology by developing metrics to measure how well the comparison corpus aligns with the target corpus. Our results show that failing to control for confounding variables can result in different conclusions and mask biases.
Score: 27.591896251854724
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Social biases on Wikipedia, a widely-read global platform, could greatly influence public opinion. While prior research has examined man/woman gender bias in biography articles, possible influences of confounding variables limit conclusions. In this work, we present a methodology for reducing the effects of confounding variables in analyses of Wikipedia biography pages. Given a target corpus for analysis (e.g. biography pages about women), we present a method for constructing a comparison corpus that matches the target corpus in as many attributes as possible, except the target attribute (e.g. the gender of the subject). We evaluate our methodology by developing metrics to measure how well the comparison corpus aligns with the target corpus. We then examine how articles about gender and racial minorities (cisgender women, non-binary people, transgender women, and transgender men; African American, Asian American, and Hispanic/Latinx American people) differ from other articles, including analyses driven by social theories like intersectionality. In addition to identifying suspect social biases, our results show that failing to control for confounding variables can result in different conclusions and mask biases. Our contributions include methodology that facilitates further analyses of bias in Wikipedia articles, findings that can aid Wikipedia editors in reducing biases, and framework and evaluation metrics to guide future work in this area.
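The matching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in a simple greedy Jaccard match over attribute sets (for example, article categories), and every name in it (`jaccard`, `build_comparison_corpus`, `mean_match_similarity`, the article titles, the attribute strings) is hypothetical. It pairs each target article with the most similar unused candidate after dropping the target attribute, then reports a simple alignment score.

```python
from typing import Dict, Optional, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two attribute sets, from 0.0 (disjoint) to 1.0 (equal)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_comparison_corpus(
    target: Dict[str, Set[str]],
    candidates: Dict[str, Set[str]],
    target_attribute: Set[str],
) -> Dict[str, Optional[str]]:
    """Greedily pair each target article with its most similar unused candidate."""
    used: Set[str] = set()
    matches: Dict[str, Optional[str]] = {}
    for title, attrs in target.items():
        # Drop attributes that encode the target attribute itself (assumption:
        # the caller can enumerate them), so matching ignores it.
        attrs = attrs - target_attribute
        best, best_score = None, 0.0
        for cand, cand_attrs in candidates.items():
            if cand in used:
                continue
            score = jaccard(attrs, cand_attrs - target_attribute)
            if score > best_score:
                best, best_score = cand, score
        matches[title] = best  # None when no candidate shares any attribute
        if best is not None:
            used.add(best)
    return matches


def mean_match_similarity(
    target: Dict[str, Set[str]],
    candidates: Dict[str, Set[str]],
    matches: Dict[str, Optional[str]],
    target_attribute: Set[str],
) -> float:
    """Hypothetical alignment metric: average similarity over matched pairs."""
    scores = [
        jaccard(target[t] - target_attribute, candidates[c] - target_attribute)
        for t, c in matches.items()
        if c is not None
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Made-up toy data, for illustration only.
target = {"Article A": {"American novelists", "1950 births", "Women writers"}}
candidates = {
    "Article B": {"American novelists", "1950 births"},
    "Article C": {"English footballers", "1980 births"},
}
target_attribute = {"Women writers"}

matches = build_comparison_corpus(target, candidates, target_attribute)
alignment = mean_match_similarity(target, candidates, matches, target_attribute)
```

Greedy one-to-one matching keeps the sketch short. A real analysis would likely need weighted attributes and a more careful alignment metric, which is what the paper's evaluation is about.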

Related papers

Survival of the Notable: Gender Asymmetry in Wikipedia Collective Deliberations [0.0]
Articles for Deletion (AfD) discussions on Wikipedia allow editors to gauge the notability of existing articles. We find that biographies of women are nominated for deletion faster than those of men, even though editors take longer to reach a consensus on deleting women's biographies. We also find that AfDs about historical figures show a strong tendency to result in the biography under discussion being redirected or merged into other encyclopedic entries.
arXiv Detail & Related papers (2024-11-07T00:37:24Z)
Locating Information Gaps and Narrative Inconsistencies Across Languages: A Case Study of LGBT People Portrayals on Wikipedia [49.80565462746646]
We introduce the InfoGap method -- an efficient and reliable approach to locating information gaps and inconsistencies in articles at the fact level. We evaluate InfoGap by analyzing LGBT people's portrayals across 2.7K biography pages on the English, Russian, and French Wikipedias.
arXiv Detail & Related papers (2024-10-05T20:40:49Z)
The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets. Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks. We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations. We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z)
Choose Your Lenses: Flaws in Gender Bias Evaluation [29.16221451643288]
We assess the current paradigm of gender bias evaluation and identify several flaws in it. First, we highlight the importance of extrinsic bias metrics that measure how a model's performance on some task is affected by gender. Second, we find that datasets and metrics are often coupled, and discuss how their coupling hinders the ability to obtain reliable conclusions.
arXiv Detail & Related papers (2022-10-20T17:59:55Z)
Social Biases in Automatic Evaluation Metrics for NLG [53.76118154594404]
We propose an evaluation method based on Word Embeddings Association Test (WEAT) and Sentence Embeddings Association Test (SEAT) to quantify social biases in evaluation metrics. We construct gender-swapped meta-evaluation datasets to explore the potential impact of gender bias in image caption and text summarization tasks.
arXiv Detail & Related papers (2022-10-17T08:55:26Z)
Theories of "Gender" in NLP Bias Research [0.0]
We survey nearly 200 articles concerning gender bias in NLP. We find that the majority of the articles do not make their theorization of gender explicit. Many conflate sex characteristics, social gender, and linguistic gender in ways that disregard the existence and experience of trans, nonbinary, and intersex people.
arXiv Detail & Related papers (2022-05-05T09:20:53Z)
Gender bias in magazines oriented to men and women: a computational approach [58.720142291102135]
We compare the content of a women-oriented magazine with that of a men-oriented one, both produced by the same editorial group over a decade. With Topic Modelling techniques we identify the main themes discussed in the magazines and quantify how much the presence of these topics differs between magazines over time. Our results show that the frequency of appearance of the topics Family, Business, and Women as sex objects presents an initial bias that tends to disappear over time.
arXiv Detail & Related papers (2020-11-24T14:02:49Z)
Multilingual Contextual Affective Analysis of LGBT People Portrayals in Wikipedia [34.183132688084534]
Specific lexical choices in narrative text both reflect the writer's attitudes towards people in the narrative and influence the audience's reactions. We show how word connotations differ across languages and cultures, highlighting the difficulty of generalizing existing English datasets. We then demonstrate the usefulness of our method by analyzing Wikipedia biography pages of members of the LGBT community across three languages.
arXiv Detail & Related papers (2020-10-21T08:27:36Z)
Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes. GSR is the first measure specifically tailored for Information Retrieval that is capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.