Diversity matters: Robustness of bias measurements in Wikidata
- URL: http://arxiv.org/abs/2302.14027v1
- Date: Mon, 27 Feb 2023 18:38:10 GMT
- Title: Diversity matters: Robustness of bias measurements in Wikidata
- Authors: Paramita Das, Sai Keerthana Karnam, Anirban Panda, Bhanu Prakash Reddy
Guda, Soumya Sarkar, Animesh Mukherjee
- Abstract summary: We reveal data biases that surface in Wikidata for thirteen different demographics selected from seven continents.
We conduct our extensive experiments on a large number of occupations sampled from the thirteen demographics with respect to the sensitive attribute, i.e., gender.
We show that the choice of the state-of-the-art KG embedding algorithm has a strong impact on the ranking of biased occupations irrespective of gender.
- Score: 4.950095974653716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the widespread use of knowledge graphs (KG) in various automated AI
systems and applications, it is very important to ensure that information
retrieval algorithms leveraging them are free from societal biases. Previous
works have depicted biases that persist in KGs, as well as employed several
metrics for measuring the biases. However, such studies lack the systematic
exploration of the sensitivity of the bias measurements, through varying
sources of data, or the embedding algorithms used. To address this research
gap, in this work, we present a holistic analysis of bias measurement on the
knowledge graph. First, we attempt to reveal data biases that surface in
Wikidata for thirteen different demographics selected from seven continents.
Next, we attempt to unfold the variance in the detection of biases by two
different knowledge graph embedding algorithms - TransE and ComplEx. We conduct
our extensive experiments on a large number of occupations sampled from the
thirteen demographics with respect to the sensitive attribute, i.e., gender.
Our results show that the inherent data bias that persists in KG can be altered
by specific algorithm bias as incorporated by KG embedding learning algorithms.
Further, we show that the choice of the state-of-the-art KG embedding algorithm
has a strong impact on the ranking of biased occupations irrespective of
gender. We observe that the similarity of the biased occupations across
demographics is minimal, which reflects the socio-cultural differences around
the globe. We believe that this full-scale audit of the bias measurement
pipeline will raise awareness in the community, yield insights into the design
choices of both data and algorithms, and discourage the popular dogma of
"one-size-fits-all".
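To make the measurement concrete, the sketch below ranks occupations by a TransE-style gender-association score derived from entity and relation embeddings. This is a minimal illustration of the kind of bias ranking the abstract describes; the entity names, the has_gender relation, the random placeholder embeddings, and the signed bias definition are all illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): rank occupations by a
# TransE-style gender-association score, given pre-trained embeddings.
import numpy as np

rng = np.random.default_rng(0)
dim = 50

# Hypothetical pre-trained embeddings; in practice these would come from a
# KG embedding model (e.g., TransE or ComplEx) trained on Wikidata triples.
entities = ["male", "female", "engineer", "nurse", "teacher", "politician"]
ent_emb = {e: rng.normal(size=dim) for e in entities}
rel_emb = {"has_gender": rng.normal(size=dim)}  # assumed relation name

def transe_score(head, relation, tail):
    """TransE plausibility: higher (less negative) means more plausible."""
    return -np.linalg.norm(ent_emb[head] + rel_emb[relation] - ent_emb[tail])

def gender_bias(occupation):
    """Signed bias: positive leans 'male', negative leans 'female'.

    The occupation embedding stands in for a typical holder of that
    occupation; the paper's exact formulation may differ.
    """
    return (transe_score(occupation, "has_gender", "male")
            - transe_score(occupation, "has_gender", "female"))

occupations = ["engineer", "nurse", "teacher", "politician"]
for occ in sorted(occupations, key=gender_bias, reverse=True):
    print(f"{occ:12s} bias = {gender_bias(occ):+.3f}")
```

Repeating the same ranking with a ComplEx-style scoring function and comparing the two orderings is, at a high level, the kind of sensitivity check the paper performs across embedding algorithms and demographics.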
Related papers
- Outlier Detection Bias Busted: Understanding Sources of Algorithmic Bias through Data-centric Factors [28.869581543676947]
Unsupervised outlier detection (OD) has numerous applications in finance, security, etc.
This work aims to shed light on the possible sources of unfairness in OD by auditing detection models under different data-centric factors.
We find that the OD algorithms under the study all exhibit fairness pitfalls, although differing in which types of data bias they are more susceptible to.
arXiv Detail & Related papers (2024-08-24T20:35:32Z)
- Fairness and Bias in Truth Discovery Algorithms: An Experimental Analysis [7.575734557466221]
Crowd workers may sometimes provide unreliable labels.
Truth discovery (TD) algorithms are applied to determine the consensus labels from conflicting worker responses.
We conduct a systematic study of the bias and fairness of TD algorithms.
arXiv Detail & Related papers (2023-04-25T04:56:35Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Assessing Demographic Bias Transfer from Dataset to Model: A Case Study in Facial Expression Recognition [1.5340540198612824]
Three metrics are proposed: two focus on the representational and stereotypical bias of the dataset, and the third on the residual bias of the trained model.
We demonstrate the usefulness of the metrics by applying them to a FER problem based on the popular Affectnet dataset.
arXiv Detail & Related papers (2022-05-20T09:40:42Z)
- Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and of agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
- Towards Automatic Bias Detection in Knowledge Graphs [5.402498799294428]
We describe a framework for identifying biases in knowledge graph embeddings, based on numerical bias metrics.
We illustrate the framework with three different bias measures on the task of profession prediction.
The relations flagged as biased can then be handed to decision makers for judgement upon subsequent debiasing.
arXiv Detail & Related papers (2021-09-19T03:58:25Z)
- Towards Measuring Bias in Image Classification [61.802949761385]
Convolutional Neural Networks (CNNs) have become state-of-the-art for the main computer vision tasks.
However, due to their complex structure, their decisions are hard to understand, which limits their use in some industrial contexts.
We present a systematic approach to uncover data bias by means of attribution maps.
arXiv Detail & Related papers (2021-07-01T10:50:39Z)
- Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes.
GSR is the first specifically tailored measure for Information Retrieval, capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z)
- Towards causal benchmarking of bias in face analysis algorithms [54.19499274513654]
We develop an experimental method for measuring algorithmic bias of face analysis algorithms.
Our proposed method is based on generating synthetic "transects" of matched sample images.
We validate our method by comparing it to a study that employs the traditional observational method for analyzing bias in gender classification algorithms.
arXiv Detail & Related papers (2020-07-13T17:10:34Z)
- Adversarial Learning for Debiasing Knowledge Graph Embeddings [9.53284633479507]
Social and cultural biases can have detrimental consequences on different population and minority groups.
This paper aims at identifying and mitigating such biases in Knowledge Graph (KG) embeddings.
We introduce a novel framework to filter out the sensitive attribute information from the KG embeddings, which we call FAN (Filtering Adversarial Network).
arXiv Detail & Related papers (2020-06-29T18:36:15Z)
- REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets [64.76453161039973]
REVISE (REvealing VIsual biaSEs) is a tool that assists in the investigation of a visual dataset.
It surfaces potential biases along three dimensions: (1) object-based, (2) person-based, and (3) geography-based.
arXiv Detail & Related papers (2020-04-16T23:54:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.