Estimating Gender Completeness in Wikipedia
- URL: http://arxiv.org/abs/2401.08993v1
- Date: Wed, 17 Jan 2024 06:03:31 GMT
- Title: Estimating Gender Completeness in Wikipedia
- Authors: Hrishikesh Patel, Tianwa Chen, Ivano Bongiovanni, Gianluca Demartini
- Abstract summary: The aim of this paper is to provide the Wikipedia community with instruments to estimate the magnitude of the problem for different entity types.
Our results show not only which gender for different sub-classes of Person is more prevalent in Wikipedia, but also an idea of how complete the coverage is for different genders and sub-classes of Person.
- Score: 4.292453466361998
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gender imbalance in Wikipedia content is a known challenge which the editor
community is actively addressing. The aim of this paper is to provide the
Wikipedia community with instruments to estimate the magnitude of the problem
for different entity types (also known as classes) in Wikipedia. To this end,
we apply class completeness estimation methods based on the gender attribute.
Our results show not only which gender for different sub-classes of Person is
more prevalent in Wikipedia, but also an idea of how complete the coverage is
for different genders and sub-classes of Person.
Related papers
- Survival of the Notable: Gender Asymmetry in Wikipedia Collective Deliberations [0.0]
Articles for Deletion (AfD) discussions on Wikipedia allow editors to gauge the notability of existing articles.
We find that biographies of women are nominated for deletion faster than those of men, even though editors take longer to reach a consensus on deleting women's biographies.
We find that AfDs about historical figures show a strong tendency to result in the biography under discussion being redirected or merged into other encyclopedic entries.
arXiv Detail & Related papers (2024-11-07T00:37:24Z)
- The Causal Influence of Grammatical Gender on Distributional Semantics [87.8027818528463]
How much meaning influences gender assignment across languages is an active area of research in linguistics and cognitive science.
We offer a novel, causal graphical model that jointly represents the interactions between a noun's grammatical gender, its meaning, and adjective choice.
When we control for the meaning of the noun, the relationship between grammatical gender and adjective choice is near zero and insignificant.
arXiv Detail & Related papers (2023-11-30T13:58:13Z) - VisoGender: A dataset for benchmarking gender bias in image-text pronoun
resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z) - Auditing Gender Presentation Differences in Text-to-Image Models [54.16959473093973]
We study how gender is presented differently in text-to-image models.
By probing gender indicators in the input text, we quantify the frequency differences of presentation-centric attributes.
We propose an automatic method to estimate such differences.
arXiv Detail & Related papers (2023-02-07T18:52:22Z) - Wikigender: A Machine Learning Model to Detect Gender Bias in Wikipedia [0.0]
We use a machine learning model to prove that there is a difference in how women and men are portrayed on Wikipedia.
Using only adjectives as input to the model, we show that the adjectives used to portray women have a higher subjectivity than the ones used to describe men.
arXiv Detail & Related papers (2022-11-14T16:49:09Z) - Gendered Language in Resumes and its Implications for Algorithmic Bias
in Hiring [0.0]
We train a series of models to classify the gender of the applicant.
We investigate whether it is possible to obfuscate gender from resumes.
We find that there is a significant amount of gendered information in resumes even after obfuscation.
arXiv Detail & Related papers (2021-12-16T14:26:36Z) - Controlled Analyses of Social Biases in Wikipedia Bios [27.591896251854724]
We present a methodology for reducing the effects of confounding variables in analyses of Wikipedia biography pages.
We evaluate our methodology by developing metrics to measure how well the comparison corpus aligns with the target corpus.
Our results show that failing to control for confounding variables can result in different conclusions and mask biases.
arXiv Detail & Related papers (2020-12-31T21:27:12Z) - Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by
Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes.
GSR is the first specifically tailored measure for Information Retrieval, capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z) - Multiple Texts as a Limiting Factor in Online Learning: Quantifying
(Dis-)similarities of Knowledge Networks across Languages [60.00219873112454]
We investigate the hypothesis that the extent to which one obtains information on a given topic through Wikipedia depends on the language in which it is consulted.
Since Wikipedia is a central part of the web-based information landscape, this indicates a language-related, linguistic bias.
The article builds a bridge between reading research, educational science, Wikipedia research and computational linguistics.
arXiv Detail & Related papers (2020-08-05T11:11:55Z) - Global gender differences in Wikipedia readership [14.112831377937107]
We present novel evidence of gender differences in Wikipedia readership and how they manifest in records of user behavior.
More specifically, we report that (1) women are underrepresented among readers of Wikipedia, (2) women view fewer pages per reading session than men do, (3) men and women visit Wikipedia for similar reasons, and (4) men and women exhibit specific topical preferences.
arXiv Detail & Related papers (2020-07-20T18:40:32Z) - Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)