Measuring Social Biases of Crowd Workers using Counterfactual Queries
- URL: http://arxiv.org/abs/2004.02028v1
- Date: Sat, 4 Apr 2020 21:41:55 GMT
- Title: Measuring Social Biases of Crowd Workers using Counterfactual Queries
- Authors: Bhavya Ghai, Q. Vera Liao, Yunfeng Zhang, Klaus Mueller
- Abstract summary: Social biases based on gender, race, etc. have been shown to pollute the machine learning (ML) pipeline, predominantly via biased training datasets.
Crowdsourcing, a popular cost-effective way to gather labeled training datasets, is not immune to the inherent social biases of crowd workers.
We propose a new method based on counterfactual fairness to quantify the degree of inherent social bias in each crowd worker.
- Score: 84.10721065676913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Social biases based on gender, race, etc. have been shown to pollute the machine learning (ML) pipeline, predominantly via biased training datasets. Crowdsourcing, a popular cost-effective way to gather labeled training datasets, is not immune to the inherent social biases of crowd workers. To ensure such social biases aren't passed on to the curated datasets, it's important to know how biased each crowd worker is. In this work, we propose a new method based on counterfactual fairness to quantify the degree of inherent social bias in each crowd worker. This extra information can be leveraged together with individual worker responses to curate a less biased dataset.
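The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of the general idea: each worker answers both an original item and a counterfactual version with the sensitive attribute flipped, and a worker's bias score is the rate at which their answer changes. All names and the data format below are hypothetical, not taken from the paper.

```python
from collections import defaultdict

def worker_bias_scores(responses):
    """Per-worker bias score: fraction of items where the worker's label flips
    between an item and its counterfactual (sensitive-attribute-swapped) twin.

    `responses` is a list of dicts such as
      {"worker": "w1", "item": "q7", "label": 1, "counterfactual_label": 0}.
    """
    flips, totals = defaultdict(int), defaultdict(int)
    for r in responses:
        totals[r["worker"]] += 1
        if r["label"] != r["counterfactual_label"]:
            flips[r["worker"]] += 1
    # A perfectly counterfactually-fair worker scores 0.0.
    return {w: flips[w] / totals[w] for w in totals}

# Toy usage: worker w2 changes their answer when the sensitive attribute flips.
demo = [
    {"worker": "w1", "item": "q1", "label": 1, "counterfactual_label": 1},
    {"worker": "w2", "item": "q1", "label": 1, "counterfactual_label": 0},
    {"worker": "w2", "item": "q2", "label": 0, "counterfactual_label": 0},
]
print(worker_bias_scores(demo))  # {'w1': 0.0, 'w2': 0.5}
```

Such a score could then be combined with each worker's raw responses when aggregating labels, e.g. by down-weighting high-bias workers.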
Related papers
- Bias Propagation in Federated Learning [22.954608704251118]
We show that the bias of a few parties against under-represented groups can propagate through the network to all parties.
We analyze and explain bias propagation in federated learning on naturally partitioned real-world datasets.
arXiv Detail & Related papers (2023-09-05T11:55:03Z)
- Fairness and Bias in Truth Discovery Algorithms: An Experimental Analysis [7.575734557466221]
Crowd workers may sometimes provide unreliable labels.
Truth discovery (TD) algorithms are applied to determine the consensus labels from conflicting worker responses.
We conduct a systematic study of the bias and fairness of TD algorithms.
arXiv Detail & Related papers (2023-04-25T04:56:35Z)
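For context, the entry above concerns algorithms that refine the consensus step sketched below: plain majority voting over conflicting worker labels. This is only a baseline illustration; actual truth-discovery algorithms additionally weight workers by estimated reliability.

```python
from collections import Counter

def majority_vote(labels_per_item):
    """Baseline consensus: pick the most common label for each item.
    `labels_per_item` maps an item id to the list of worker labels it received."""
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in labels_per_item.items()}

print(majority_vote({"q1": [1, 1, 0], "q2": [0, 0, 1, 0]}))  # {'q1': 1, 'q2': 0}
```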
- BLIND: Bias Removal With No Demographics [29.16221451643288]
We introduce BLIND, a method for bias removal with no prior knowledge of the demographics in the dataset.
While training a model on a downstream task, BLIND detects biased samples using an auxiliary model that predicts the main model's success, and down-weights those samples during the training process.
Experiments with racial and gender biases in sentiment classification and occupation classification tasks demonstrate that BLIND mitigates social biases without relying on a costly demographic annotation process.
arXiv Detail & Related papers (2022-12-20T18:59:42Z)
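A rough sketch of the down-weighting idea described in the BLIND entry above, assuming the auxiliary model outputs a probability that the main model will classify each sample correctly. The specific weighting form here is an illustrative choice, not necessarily the one used by BLIND.

```python
def reweight_by_predicted_success(sample_weights, success_probs, gamma=2.0):
    """Down-weight samples the auxiliary model expects the main model to get right
    anyway; such 'easy' samples are the ones most likely to be carried by biased
    shortcuts. gamma controls how aggressive the down-weighting is."""
    return [w * (1.0 - p) ** gamma for w, p in zip(sample_weights, success_probs)]

# Toy usage: the third sample looks 'easy' to the auxiliary model,
# so it receives the smallest weight in the main model's training loss.
print(reweight_by_predicted_success([1.0, 1.0, 1.0], [0.2, 0.5, 0.95]))
```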
- The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks [75.58692290694452]
We compare social biases with non-social biases stemming from choices made during dataset construction that might not even be discernible to the human eye.
We observe that these shallow modifications have a surprising effect on the resulting degree of bias across various models.
arXiv Detail & Related papers (2022-10-18T17:58:39Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- A study on the distribution of social biases in self-supervised learning visual models [1.8692254863855964]
Self-Supervised Learning (SSL) may wrongly appear to be an efficient and bias-free solution, as it does not require labelled data.
We show that there is a correlation between the type of the SSL model and the number of biases that it incorporates.
We conclude that a careful SSL model selection process can reduce the number of social biases in the deployed model.
arXiv Detail & Related papers (2022-03-03T17:03:21Z)
- Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and the agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
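A simple illustration of instance reweighting of the flavour described in the entry above: weight each training instance so that author demographic and target class become independent in the reweighted data. The weights follow the standard w = P(group) * P(class) / P(group, class) recipe; the paper's exact scheme may differ.

```python
from collections import Counter

def balancing_weights(groups, labels):
    """w_i = P(g_i) * P(y_i) / P(g_i, y_i): up-weights (group, class) combinations
    that are under-represented relative to independence, down-weights the rest."""
    n = len(labels)
    p_g = Counter(groups)
    p_y = Counter(labels)
    p_gy = Counter(zip(groups, labels))
    return [(p_g[g] / n) * (p_y[y] / n) / (p_gy[(g, y)] / n)
            for g, y in zip(groups, labels)]

# Toy usage: positive labels are over-represented among group 'A'.
groups = ["A", "A", "A", "B"]
labels = [1, 1, 0, 0]
print([round(w, 2) for w in balancing_weights(groups, labels)])  # [0.75, 0.75, 1.5, 0.5]
```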
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- A survey of bias in Machine Learning through the prism of Statistical Parity for the Adult Data Set [5.277804553312449]
We show the importance of understanding how a bias can be introduced into automatic decisions.
We first present a mathematical framework for the fair learning problem, specifically in the binary classification setting.
We then propose to quantify the presence of bias by using the standard Disparate Impact index on the real and well-known Adult income data set.
arXiv Detail & Related papers (2020-03-31T14:48:36Z)
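The Disparate Impact index used in the entry above has a standard form, DI = P(y_hat = 1 | unprivileged group) / P(y_hat = 1 | privileged group), with values below the customary 0.8 threshold commonly read as a sign of possible bias. A small sketch, with illustrative group and label encodings:

```python
def disparate_impact(y_pred, group, unprivileged, privileged):
    """DI = P(y_hat = 1 | group = unprivileged) / P(y_hat = 1 | group = privileged)."""
    def positive_rate(g):
        members = [y for y, a in zip(y_pred, group) if a == g]
        return sum(members) / len(members)
    return positive_rate(unprivileged) / positive_rate(privileged)

# Toy usage with Adult-style attributes: 1 = predicted high income.
y_pred = [1, 0, 0, 1, 1, 1]
sex    = ["F", "F", "F", "M", "M", "M"]
print(disparate_impact(y_pred, sex, unprivileged="F", privileged="M"))  # ~0.33
```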
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.