On Fairness and Stability: Is Estimator Variance a Friend or a Foe?
- URL: http://arxiv.org/abs/2302.04525v1
- Date: Thu, 9 Feb 2023 09:35:36 GMT
- Title: On Fairness and Stability: Is Estimator Variance a Friend or a Foe?
- Authors: Falaah Arif Khan, Denys Herasymuk, Julia Stoyanovich
- Abstract summary: We propose a new family of performance measures based on group-wise parity in variance.
We develop and release an open-source library that reconciles uncertainty quantification techniques with fairness analysis.
- Score: 6.751310968561177
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The error of an estimator can be decomposed into a (statistical) bias term, a
variance term, and an irreducible noise term. When we do bias analysis,
formally we are asking the question: "how good are the predictions?" The role
of bias in the error decomposition is clear: if we trust the labels/targets,
then we would want the estimator to have as low bias as possible, in order to
minimize error. Fair machine learning is concerned with the question: "Are the
predictions equally good for different demographic/social groups?" This has
naturally led to a variety of fairness metrics that compare some measure of
statistical bias on subsets corresponding to socially privileged and socially
disadvantaged groups. In this paper we propose a new family of performance
measures based on group-wise parity in variance. We demonstrate when group-wise
statistical bias analysis gives an incomplete picture, and what group-wise
variance analysis can tell us in settings that differ in the magnitude of
statistical bias. We develop and release an open-source library that reconciles
uncertainty quantification techniques with fairness analysis, and use it to
conduct an extensive empirical analysis of our variance-based fairness measures
on standard benchmarks.
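The abstract's core idea, comparing prediction variance across demographic groups rather than only statistical bias, can be illustrated with a minimal sketch. The function name, the toy data, and the max-gap summary below are illustrative assumptions, not the paper's actual library API: predictions from several independently trained models are collected, per-sample variance across models is computed, and the mean variance is compared between groups.

```python
import numpy as np

def groupwise_variance_parity(per_model_preds, groups):
    """Given predictions from B independently trained models
    (shape: B x n_samples) and a group label per sample, return
    per-group mean predictive variance and the max pairwise gap."""
    per_model_preds = np.asarray(per_model_preds, dtype=float)
    # Variance across the B models for each sample (instability).
    sample_var = per_model_preds.var(axis=0)
    group_var = {g: sample_var[groups == g].mean()
                 for g in np.unique(groups)}
    gap = max(group_var.values()) - min(group_var.values())
    return group_var, gap

# Toy example: 10 models disagree far more on group "b" than "a".
rng = np.random.default_rng(0)
groups = np.array(["a"] * 50 + ["b"] * 50)
preds = np.concatenate([rng.normal(0.5, 0.05, (10, 50)),   # stable on "a"
                        rng.normal(0.5, 0.30, (10, 50))],  # unstable on "b"
                       axis=1)
group_var, gap = groupwise_variance_parity(preds, groups)
```

A parity measure of this kind can flag disparities even when both groups have similar mean error, which is the setting where the paper argues bias-only analysis gives an incomplete picture.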
Related papers
- CLEAR: Calibrated Learning for Epistemic and Aleatoric Risk [7.755784217796677]
We propose CLEAR, a calibration method with two distinct parameters. We show how it can be used with (i) quantile regression for aleatoric uncertainty and (ii) ensembles drawn from the Predictability-Computability-Stability framework. CLEAR achieves an average improvement of 28.2% and 17.4% in the interval width compared to the two individually calibrated baselines.
arXiv Detail & Related papers (2025-07-10T20:13:00Z) - Why Machine Learning Models Fail to Fully Capture Epistemic Uncertainty [1.6112718683989882]
We make use of a more fine-grained taxonomy of epistemic uncertainty sources in machine learning models. We show that high model bias can lead to misleadingly low estimates of epistemic uncertainty. Common second-order uncertainty methods systematically blur bias-induced errors into aleatoric estimates.
arXiv Detail & Related papers (2025-05-29T14:50:46Z) - Improving Omics-Based Classification: The Role of Feature Selection and Synthetic Data Generation [0.18846515534317262]
This study presents a machine-learning-based classification framework that integrates feature selection with data augmentation techniques. We show that the proposed pipeline yields strong cross-validated performance on small datasets.
arXiv Detail & Related papers (2025-05-06T10:09:50Z) - Whence Is A Model Fair? Fixing Fairness Bugs via Propensity Score Matching [0.49157446832511503]
We investigate whether the way training and testing data are sampled affects the reliability of fairness metrics.
Since training and test sets are often randomly sampled from the same population, bias present in the training data may still exist in the test data.
We propose FairMatch, a post-processing method that applies propensity score matching to evaluate and mitigate bias.
arXiv Detail & Related papers (2025-04-23T19:28:30Z) - Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute u and a non-class attribute b.
We propose to mitigate dataset bias via either weighting the objective of each sample n by 1/p(u_n | b_n) or sampling that sample with a weight proportional to 1/p(u_n | b_n).
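The reweighting scheme above, assigning each sample a weight proportional to the inverse conditional probability 1/p(u_n | b_n), can be sketched as follows. The empirical estimation of p(u | b) by group-wise counting is an illustrative assumption; the paper may estimate it differently.

```python
import numpy as np

def inverse_propensity_weights(u, b):
    """Estimate p(u_n | b_n) empirically from the data and return
    weights proportional to 1 / p(u_n | b_n), so that (u, b)
    combinations over-represented in the data are down-weighted."""
    u, b = np.asarray(u), np.asarray(b)
    weights = np.empty(len(u), dtype=float)
    for bv in np.unique(b):
        mask = b == bv
        vals, counts = np.unique(u[mask], return_counts=True)
        p = dict(zip(vals, counts / mask.sum()))  # empirical p(u | b=bv)
        weights[mask] = [1.0 / p[uv] for uv in u[mask]]
    return weights

# Toy example: class u is strongly correlated with attribute b.
u = np.array([0, 0, 0, 1, 1, 1, 1, 1])
b = np.array([0, 0, 0, 0, 1, 1, 1, 1])
w = inverse_propensity_weights(u, b)
```

Within b = 0, the rare combination (u = 1, b = 0) receives weight 1/0.25 = 4, while the common combination (u = 0, b = 0) receives 1/0.75, counteracting the class/attribute correlation.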
arXiv Detail & Related papers (2024-02-05T22:58:06Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep
Models [51.66015254740692]
We show that for an ensemble of deep-learning-based classification models, bias and variance are aligned at a sample level.
We study this phenomenon from two theoretical perspectives: calibration and neural collapse.
arXiv Detail & Related papers (2023-10-13T17:06:34Z) - Towards Better Certified Segmentation via Diffusion Models [62.21617614504225]
Segmentation models can be vulnerable to adversarial perturbations, which hinders their use in critical-decision systems like healthcare or autonomous driving.
Recently, randomized smoothing has been proposed to certify segmentation predictions by adding Gaussian noise to the input to obtain theoretical guarantees.
In this paper, we address the problem of certifying segmentation prediction using a combination of randomized smoothing and diffusion models.
arXiv Detail & Related papers (2023-06-16T16:30:39Z) - The Decaying Missing-at-Random Framework: Model Doubly Robust Causal Inference with Partially Labeled Data [8.916614661563893]
We introduce a missing-at-random (decaying MAR) framework and associated approaches for doubly robust causal inference. This simultaneously addresses selection bias in the labeling mechanism and the extreme imbalance between labeled and unlabeled groups. To ensure robust causal conclusions, we propose a bias-reduced SS estimator for the average treatment effect.
arXiv Detail & Related papers (2023-05-22T07:37:12Z) - Arbitrariness and Social Prediction: The Confounding Role of Variance in
Fair Classification [31.392067805022414]
Variance in predictions across different trained models is a significant, under-explored source of error in fair binary classification.
In practice, the variance on some data examples is so large that decisions can be effectively arbitrary.
We develop an ensembling algorithm that abstains from classification when a prediction would be arbitrary.
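One simple way to realize an abstaining ensemble of the kind described above is to take a majority vote but withhold a decision when member agreement is too low. The threshold value and the use of plain agreement rate are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def abstaining_ensemble_predict(member_preds, agreement_threshold=0.8):
    """Binary predictions from an ensemble (shape: B x n_samples).
    Return the majority vote per sample, or None where agreement
    falls below the threshold (the decision is effectively arbitrary)."""
    member_preds = np.asarray(member_preds)
    pos_rate = member_preds.mean(axis=0)          # fraction voting 1
    agreement = np.maximum(pos_rate, 1 - pos_rate)
    vote = (pos_rate >= 0.5).astype(int)
    return [int(v) if a >= agreement_threshold else None
            for v, a in zip(vote, agreement)]

# Toy example: 10 members; the second sample splits 5/5, so we abstain.
preds = np.array([[1] * 10,                 # unanimous 1
                  [1] * 5 + [0] * 5,        # coin flip -> abstain
                  [0] * 9 + [1]]).T         # near-unanimous 0
out = abstaining_ensemble_predict(preds)
```

Abstention converts high between-model variance on a sample into an explicit "no decision", rather than an arbitrary one.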
arXiv Detail & Related papers (2023-01-27T06:52:04Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling
Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z) - Evaluating Aleatoric Uncertainty via Conditional Generative Models [15.494774321257939]
We study conditional generative models for aleatoric uncertainty estimation.
We introduce two metrics to measure the discrepancy between two conditional distributions.
We demonstrate numerically how our metrics provide correct measurements of conditional distributional discrepancies.
arXiv Detail & Related papers (2022-06-09T05:39:04Z) - Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm which is able to map individuals belonging to different groups in a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z) - When in Doubt: Neural Non-Parametric Uncertainty Quantification for
Epidemic Forecasting [70.54920804222031]
Most existing forecasting models disregard uncertainty quantification, resulting in mis-calibrated predictions.
Recent works in deep neural models for uncertainty-aware time-series forecasting also have several limitations.
We model the forecasting task as a probabilistic generative process and propose a functional neural process model called EPIFNP.
arXiv Detail & Related papers (2021-06-07T18:31:47Z) - Model Mis-specification and Algorithmic Bias [0.0]
Machine learning algorithms are increasingly used to inform critical decisions.
There is a growing concern about bias, that algorithms may produce uneven outcomes for individuals in different demographic groups.
In this work, we measure bias as the difference between mean prediction errors across groups.
arXiv Detail & Related papers (2021-05-31T17:45:12Z) - Counterfactual Invariance to Spurious Correlations: Why and How to Pass
Stress Tests [87.60900567941428]
A "spurious correlation" is the dependence of a model on some aspect of the input data that an analyst thinks shouldn't matter.
In machine learning, these have a know-it-when-you-see-it character.
We study stress testing using the tools of causal inference.
arXiv Detail & Related papers (2021-05-31T14:39:38Z) - The statistical advantage of automatic NLG metrics at the system level [10.540821585237222]
Statistically, humans are unbiased, high variance estimators, while metrics are biased, low variance estimators.
We compare these estimators by their error in pairwise prediction (which generation system is better?) using the bootstrap.
Our analysis compares the adjusted error of metrics to humans and a derived, perfect segment-level annotator, both of which are unbiased estimators dependent on the number of judgments collected.
arXiv Detail & Related papers (2021-05-26T09:53:57Z) - Characterizing Fairness Over the Set of Good Models Under Selective
Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - Individual Calibration with Randomized Forecasting [116.2086707626651]
We show that calibration for individual samples is possible in the regression setup if the predictions are randomized.
We design a training objective to enforce individual calibration and use it to train randomized regression functions.
arXiv Detail & Related papers (2020-06-18T05:53:10Z) - Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty
with Bernstein Bounds [21.598196899084268]
We use Bernstein bounds to represent uncertainty about the bias estimate as a confidence interval.
We provide empirical evidence that a 95% confidence interval consistently bounds the true bias.
Our findings suggest that the datasets currently used to measure bias are too small to conclusively identify bias except in the most egregious cases.
arXiv Detail & Related papers (2020-04-26T09:45:45Z) - Machine learning for causal inference: on the use of cross-fit
estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE)
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z) - Recovering from Biased Data: Can Fairness Constraints Improve Accuracy? [11.435833538081557]
Empirical Risk Minimization (ERM) may produce a classifier that not only is biased but also has suboptimal accuracy on the true data distribution.
We examine the ability of fairness-constrained ERM to correct this problem.
We also consider other recovery methods including reweighting the training data, Equalized Odds, and Demographic Parity.
arXiv Detail & Related papers (2019-12-02T22:00:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.