Social Bias Meets Data Bias: The Impacts of Labeling and Measurement
Errors on Fairness Criteria
- URL: http://arxiv.org/abs/2206.00137v4
- Date: Tue, 2 May 2023 18:10:10 GMT
- Title: Social Bias Meets Data Bias: The Impacts of Labeling and Measurement
Errors on Fairness Criteria
- Authors: Yiqiao Liao, Parinaz Naghizadeh
- Abstract summary: We consider two forms of dataset bias: errors by prior decision makers in the labeling process, and errors in measurement of the features of disadvantaged individuals.
We analytically show that some constraints can remain robust when facing certain statistical biases, while others (such as Equalized Odds) are significantly violated if trained on biased data.
Our findings present an additional guideline for choosing among existing fairness criteria, or for proposing new criteria, when available datasets may be biased.
- Score: 4.048444203617942
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although many fairness criteria have been proposed to ensure that machine
learning algorithms do not exhibit or amplify our existing social biases, these
algorithms are trained on datasets that can themselves be statistically biased.
In this paper, we investigate the robustness of a number of existing
(demographic) fairness criteria when the algorithm is trained on biased data.
We consider two forms of dataset bias: errors by prior decision makers in the
labeling process, and errors in measurement of the features of disadvantaged
individuals. We analytically show that some constraints (such as Demographic
Parity) can remain robust when facing certain statistical biases, while others
(such as Equalized Odds) are significantly violated if trained on biased data.
We also analyze the sensitivity of these criteria and the decision maker's
utility to biases. We provide numerical experiments based on three real-world
datasets (the FICO, Adult, and German credit score datasets) supporting our
analytical findings. Our findings present an additional guideline for choosing
among existing fairness criteria, or for proposing new criteria, when available
datasets may be biased.
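To make the headline contrast concrete, the following is a minimal synthetic sketch (our own illustration under an assumed label-flip model, not the paper's analytical setup). Two statistically identical groups are scored by a fixed classifier, and a hypothetical prior decision maker flips 20% of the disadvantaged group's positive labels. Demographic Parity compares selection rates and never touches the labels, so its measured value is unchanged; Equalized Odds is computed against whichever labels are available, so the biased labels distort it:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
group = rng.integers(0, 2, n)                 # 0 = advantaged, 1 = disadvantaged
score = rng.normal(size=n)
y_true = (score + rng.normal(0, 0.5, n) > 0)  # true qualification
y_hat = (score > 0)                           # fixed classifier

# Labeling bias: 20% of group 1's true positives are recorded as negatives.
flip = (group == 1) & y_true & (rng.random(n) < 0.2)
y_obs = np.where(flip, False, y_true)

def dp_gap(y_hat, group):
    # Demographic Parity compares selection rates; labels never enter.
    return abs(y_hat[group == 0].mean() - y_hat[group == 1].mean())

def eo_gap(y_hat, y, group):
    # Equalized Odds compares TPR and FPR, so it depends on which labels are used.
    rate = lambda g, lbl: y_hat[(group == g) & (y == lbl)].mean()
    return max(abs(rate(0, 1) - rate(1, 1)), abs(rate(0, 0) - rate(1, 0)))

print("DP gap (labels irrelevant):", dp_gap(y_hat, group))          # ~0
print("EO gap on true labels:    ", eo_gap(y_hat, y_true, group))   # ~0
print("EO gap on biased labels:  ", eo_gap(y_hat, y_obs, group))    # clearly > 0
```

In this toy setup the distortion surfaces through the false-positive rate: flipped true positives are counted as negatives yet tend to have high scores, inflating group 1's measured FPR.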
Related papers
- The Impact of Differential Feature Under-reporting on Algorithmic Fairness [86.275300739926]
We present an analytically tractable model of differential feature under-reporting, which we then use to characterize the impact of this kind of data bias on algorithmic fairness.
Our results show that, in real-world data settings, under-reporting typically leads to increased disparities.
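As a hedged illustration of the mechanism (an assumed toy setup, not the paper's model), suppose one informative binary feature is recorded less reliably for group 1, with missing values defaulting to 0; a rule that selects on the recorded feature then under-selects that group even though the true rates are equal:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
group = rng.integers(0, 2, n)                  # 0 / 1, equal sizes
x_true = rng.random(n) < 0.5                   # same true rate in both groups
# Differential under-reporting: group 1's feature is recorded only 50% of the time.
recorded = rng.random(n) < np.where(group == 0, 0.9, 0.5)
x_obs = x_true & recorded                      # unrecorded values default to 0

select = x_obs                                 # decision rule on recorded data
print("selection rate, group 0:", select[group == 0].mean())  # ~0.45
print("selection rate, group 1:", select[group == 1].mean())  # ~0.25
```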
arXiv Detail & Related papers (2024-01-16T19:16:22Z)
- Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem.
We examine the performance of various debiasing methods across multiple tasks.
We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach to auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, such as weakening or deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
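For intuition only, here is a toy sketch of what "deleting" a biased edge could mean in a linear model: subtract the child variable's component explained by the protected parent. This is an assumed simplification, not D-BIAS's actual simulation method:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
gender = rng.integers(0, 2, n).astype(float)
income = 3.0 * gender + rng.normal(0, 1, n)    # biased causal edge gender -> income

# Estimate the edge weight by least squares, then remove its contribution.
w = np.cov(gender, income)[0, 1] / np.var(gender, ddof=1)
income_debiased = income - w * (gender - gender.mean())

gap = lambda v: v[gender == 1].mean() - v[gender == 0].mean()
print("mean income gap before:", gap(income))           # ~3.0
print("mean income gap after: ", gap(income_debiased))  # ~0.0
```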
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- More Data Can Lead Us Astray: Active Data Acquisition in the Presence of Label Bias [7.506786114760462]
Proposed bias mitigation strategies typically overlook the bias present in the observed labels.
We first present an overview of different types of label bias in the context of supervised learning systems.
We then empirically show that, when label bias is overlooked, collecting more data can aggravate bias, and that imposing fairness constraints that rely on the observed labels during data collection may not address the problem.
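A minimal sketch of why more data does not help here (an assumed flip model, not the paper's acquisition setting): if a prior decision maker flips a fixed fraction of group 1's positive labels, a growing sample only makes the estimate of the biased gap more precise; it never approaches the true gap of zero.

```python
import numpy as np

rng = np.random.default_rng(3)
for n in (1_000, 10_000, 100_000):
    group = rng.integers(0, 2, n)
    y_true = rng.random(n) < 0.5                          # equal true rates
    flip = (group == 1) & y_true & (rng.random(n) < 0.3)  # biased labeling
    y_obs = np.where(flip, False, y_true)
    gap = y_obs[group == 0].mean() - y_obs[group == 1].mean()
    print(f"n={n:>7}: estimated qualification gap = {gap:.3f} (true gap = 0)")
```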
arXiv Detail & Related papers (2022-07-15T19:30:50Z)
- Improving Evaluation of Debiasing in Image Classification [29.711865666774017]
Our study indicates that several issues in the evaluation of debiasing for image classification need to be improved.
Based on these issues, this paper proposes an evaluation metric, the 'Align-Conflict (AC) score', as a tuning criterion.
We believe our findings and lessons will inspire future researchers in debiasing to further push state-of-the-art performance with fair comparisons.
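The paper defines the AC score precisely; as a loose sketch of the underlying idea only (our assumption, not the paper's formula), a tuning criterion can combine accuracy on bias-aligned and bias-conflicting samples so that exploiting the bias cannot score well:

```python
def balanced_debias_score(acc_aligned: float, acc_conflict: float) -> float:
    # Harmonic mean is high only when *both* accuracies are high, so a model
    # cannot look good by fitting the bias-aligned samples alone.
    return 2 * acc_aligned * acc_conflict / (acc_aligned + acc_conflict + 1e-12)

print(balanced_debias_score(0.95, 0.40))  # biased model: ~0.56
print(balanced_debias_score(0.80, 0.75))  # balanced model: ~0.77
```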
arXiv Detail & Related papers (2022-06-08T05:24:13Z)
- Representation Bias in Data: A Survey on Identification and Resolution Techniques [26.142021257838564]
Data-driven algorithms are only as good as the data they work with, yet data sets, especially social data, often fail to represent minorities adequately.
Representation Bias in data can happen due to various reasons ranging from historical discrimination to selection and sampling biases in the data acquisition and preparation methods.
This paper reviews the literature on identifying and resolving representation bias as a feature of a data set, independent of how the data are consumed later.
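One elementary identification check in this spirit (an illustrative sketch with assumed names, not a method from the survey) compares each group's share in the data set against a reference population and flags under-represented groups:

```python
from collections import Counter

def representation_gaps(groups, reference_shares, tolerance=0.5):
    """Flag groups whose data share falls below tolerance * expected share."""
    counts = Counter(groups)
    total = sum(counts.values())
    flagged = {}
    for g, expected in reference_shares.items():
        share = counts.get(g, 0) / total
        if share < tolerance * expected:
            flagged[g] = {"share": share, "expected": expected}
    return flagged

print(representation_gaps(["a"] * 90 + ["b"] * 10, {"a": 0.5, "b": 0.5}))
# {'b': {'share': 0.1, 'expected': 0.5}} -> group "b" is under-represented
```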
arXiv Detail & Related papers (2022-03-22T16:30:22Z)
- Information-Theoretic Bias Reduction via Causal View of Spurious Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation.
We present a novel debiasing framework against algorithmic bias, which incorporates a bias regularization loss.
The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
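For orientation, a common information-theoretic bias proxy is the mutual information between predictions and the protected attribute; the sketch below estimates it from a contingency table (our simplification; the paper's causal, spurious-correlation-aware measurement is more involved):

```python
import numpy as np

def mutual_information(a, b):
    """Plug-in estimate of I(a; b) in nats for small integer-coded arrays."""
    a, b = np.asarray(a), np.asarray(b)
    joint = np.zeros((a.max() + 1, b.max() + 1))
    for i, j in zip(a, b):
        joint[i, j] += 1
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])).sum())

rng = np.random.default_rng(4)
attr = rng.integers(0, 2, 10_000)
pred_biased = attr ^ (rng.random(10_000) < 0.2).astype(int)  # mostly tracks attr
pred_fair = rng.integers(0, 2, 10_000)                       # independent of attr
print(mutual_information(pred_biased, attr))  # clearly positive
print(mutual_information(pred_fair, attr))    # ~0
```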
arXiv Detail & Related papers (2022-01-10T01:19:31Z)
- Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and the agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
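As a generic sketch of instance reweighting (our assumed balancing scheme; the paper's weights target author demographics and linguistic variables specifically), each example can be weighted by the inverse frequency of its (demographic, label) cell so the reweighted training distribution decorrelates the two:

```python
from collections import Counter

def balancing_weights(demographics, labels):
    """Weight examples so each observed (demographic, label) cell has equal mass."""
    pairs = list(zip(demographics, labels))
    counts = Counter(pairs)
    n = len(pairs)
    return [n / (len(counts) * counts[p]) for p in pairs]

demo = ["f", "f", "f", "m"]
y    = [ 1,   1,   0,   0 ]
print(balancing_weights(demo, y))  # rarer cells receive larger weights
```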
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- Robust Fairness-aware Learning Under Sample Selection Bias [17.09665420515772]
We propose a framework for robust and fair learning under sample selection bias.
We develop two algorithms to handle sample selection bias, covering the cases where test data is available and where it is unavailable.
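For context, a standard correction for sample selection bias is inverse propensity weighting, sketched below with selection probabilities assumed known (the paper's framework addresses the harder robust setting where they are not):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
group = rng.integers(0, 2, n)
y = rng.random(n) < np.where(group == 0, 0.6, 0.4)  # outcome rates by group

# Selection bias: group 1 is under-sampled into the training set.
p_select = np.where(group == 0, 0.9, 0.3)
selected = rng.random(n) < p_select

naive = y[selected].mean()
ipw = np.average(y[selected], weights=1.0 / p_select[selected])
print("population positive rate:", y.mean())   # ~0.50
print("naive estimate:          ", naive)      # biased upward (~0.55)
print("IPW-corrected estimate:  ", ipw)        # ~0.50
```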
arXiv Detail & Related papers (2021-05-24T23:23:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.