Social Biases in Knowledge Representations of Wikidata separates Global North from Global South
- URL: http://arxiv.org/abs/2505.02352v1
- Date: Mon, 05 May 2025 04:21:12 GMT
- Title: Social Biases in Knowledge Representations of Wikidata separates Global North from Global South
- Authors: Paramita Das, Sai Keerthana Karnam, Aditya Soni, Animesh Mukherjee,
- Abstract summary: Link prediction (LP) is a crucial downstream task for knowledge graphs, as it helps to address the problem of the incompleteness of the knowledge graphs.<n>Previous research has shown that knowledge graphs, often created in a (semi) automatic manner, are not free from social biases.<n>These biases can have harmful effects on downstream applications, especially by leading to unfair behavior toward minority groups.
- Score: 4.427603894929721
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge Graphs have become increasingly popular due to their wide usage in various downstream applications, including information retrieval, chatbot development, language model construction, and many others. Link prediction (LP) is a crucial downstream task for knowledge graphs, as it helps to address the problem of the incompleteness of the knowledge graphs. However, previous research has shown that knowledge graphs, often created in a (semi) automatic manner, are not free from social biases. These biases can have harmful effects on downstream applications, especially by leading to unfair behavior toward minority groups. To understand this issue in detail, we develop a framework -- AuditLP -- deploying fairness metrics to identify biased outcomes in LP, specifically how occupations are classified as either male or female-dominated based on gender as a sensitive attribute. We have experimented with the sensitive attribute of age and observed that occupations are categorized as young-biased, old-biased, and age-neutral. We conduct our experiments on a large number of knowledge triples that belong to 21 different geographies extracted from the open-sourced knowledge graph, Wikidata. Our study shows that the variance in the biased outcomes across geographies neatly mirrors the socio-economic and cultural division of the world, resulting in a transparent partition of the Global North from the Global South.
Related papers
- Data Bias in Human Mobility is a Universal Phenomenon but is Highly Location-specific [0.0]
We study data production', quantifying not only whether individuals are represented in big digital datasets, but also how they are represented in terms of how much data they produce.<n>We study GPS mobility data collected from anonymized smartphones for ten major US cities and find that data points can be more unequally distributed between users than wealth.<n>We build models to predict the number of data points we can expect to be produced by the composition of demographic groups living in census tracts, and find strong effects of wealth, ethnicity, and education on data production.
arXiv Detail & Related papers (2025-07-31T20:19:50Z) - Detecting and Mitigating Bias in LLMs through Knowledge Graph-Augmented Training [2.8402080392117757]
This work investigates Knowledge Graph-Augmented Training (KGAT) as a novel method to mitigate bias in large language models.<n>Public datasets for bias assessment include Gender Shades, Bias in Bios, and FairFace.<n>We also performed targeted mitigation strategies to correct biased associations, leading to a significant drop in biased output and improved bias metrics.
arXiv Detail & Related papers (2025-04-01T00:27:50Z) - Who Does the Giant Number Pile Like Best: Analyzing Fairness in Hiring Contexts [5.111540255111445]
Race-based differences appear in approximately 10% of generated summaries, while gender-based differences occur in only 1%.<n>Retrieval models demonstrate comparable sensitivity to non-demographic changes, suggesting that fairness issues may stem from general brittleness issues.
arXiv Detail & Related papers (2025-01-08T07:28:10Z) - Fairness in AI Systems: Mitigating gender bias from language-vision
models [0.913755431537592]
We study the extent of the impact of gender bias in existing datasets.
We propose a methodology to mitigate its impact in caption based language vision models.
arXiv Detail & Related papers (2023-05-03T04:33:44Z) - Fairness meets Cross-Domain Learning: a new perspective on Models and
Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z) - Diversity matters: Robustness of bias measurements in Wikidata [4.950095974653716]
We reveal data biases that surface in Wikidata for thirteen different demographics selected from seven continents.
We conduct our extensive experiments on a large number of occupations sampled from the thirteen demographics with respect to the sensitive attribute, i.e., gender.
We show that the choice of the state-of-the-art KG embedding algorithm has a strong impact on the ranking of biased occupations irrespective of gender.
arXiv Detail & Related papers (2023-02-27T18:38:10Z) - Mitigating Relational Bias on Knowledge Graphs [51.346018842327865]
We propose Fair-KGNN, a framework that simultaneously alleviates multi-hop bias and preserves the proximity information of entity-to-relation in knowledge graphs.
We develop two instances of Fair-KGNN incorporating with two state-of-the-art KGNN models, RGCN and CompGCN, to mitigate gender-occupation and nationality-salary bias.
arXiv Detail & Related papers (2022-11-26T05:55:34Z) - Gender Stereotyping Impact in Facial Expression Recognition [1.5340540198612824]
In recent years, machine learning-based models have become the most popular approach to Facial Expression Recognition (FER)
In publicly available FER datasets, apparent gender representation is usually mostly balanced, but their representation in the individual label is not.
We generate derivative datasets with different amounts of stereotypical bias by altering the gender proportions of certain labels.
We observe a discrepancy in the recognition of certain emotions between genders of up to $29 %$ under the worst bias conditions.
arXiv Detail & Related papers (2022-10-11T10:52:23Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling
Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z) - Unbiased Graph Embedding with Biased Graph Observations [52.82841737832561]
We propose a principled new way for obtaining unbiased representations by learning from an underlying bias-free graph.
Based on this new perspective, we propose two complementary methods for uncovering such an underlying graph.
arXiv Detail & Related papers (2021-10-26T18:44:37Z) - Towards Measuring Bias in Image Classification [61.802949761385]
Convolutional Neural Networks (CNN) have become state-of-the-art for the main computer vision tasks.
However, due to the complex structure their decisions are hard to understand which limits their use in some context of the industrial world.
We present a systematic approach to uncover data bias by means of attribution maps.
arXiv Detail & Related papers (2021-07-01T10:50:39Z) - REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets [64.76453161039973]
REVISE (REvealing VIsual biaSEs) is a tool that assists in the investigation of a visual dataset.
It surfacing potential biases along three dimensions: (1) object-based, (2) person-based, and (3) geography-based.
arXiv Detail & Related papers (2020-04-16T23:54:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.