On the impossibility of non-trivial accuracy under fairness constraints
- URL: http://arxiv.org/abs/2107.06944v1
- Date: Wed, 14 Jul 2021 19:15:50 GMT
- Title: On the impossibility of non-trivial accuracy under fairness constraints
- Authors: Carlos Pinzón, Catuscia Palamidessi, Pablo Piantanida, Frank Valencia
- Abstract summary: One of the main concerns about fairness in machine learning (ML) is that achieving it may require giving up some accuracy.
We show that there are probabilistic data sources for which equal opportunity (EO) can only be achieved at the total detriment of accuracy.
- Score: 26.418304315499064
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the main concerns about fairness in machine learning (ML) is that, in
order to achieve it, one may have to give up some accuracy. With this
trade-off in mind, Hardt et al. proposed the notion of equal opportunity
(EO), designed to be compatible with accuracy. Indeed, it can be shown
that if the source of input data is deterministic, the two notions are
compatible with each other. In the probabilistic case, however, things change.
As we show, there are probabilistic data sources for which EO can only be
achieved at the total detriment of accuracy, i.e. among the models that achieve
EO, those whose prediction does not depend on the input have the highest
accuracy.
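For concreteness, below is a minimal statement of the equal opportunity criterion of Hardt et al. and an informal rendering of the impossibility claim above, in standard notation (prediction Ŷ, label Y, binary protected attribute A); the notation is ours, not quoted from the paper.
```latex
% Equal opportunity (Hardt et al., 2016): equal true-positive rates
% across the groups defined by a binary protected attribute A.
\Pr\left[\hat{Y} = 1 \mid Y = 1,\, A = 0\right]
  = \Pr\left[\hat{Y} = 1 \mid Y = 1,\, A = 1\right]

% Informal rendering of the impossibility result: for some probabilistic
% data sources, every predictor f satisfying EO is at most as accurate
% as the best constant (input-independent) predictor:
\Pr\left[f(X) = Y\right] \le \max\bigl\{\Pr[Y = 0],\, \Pr[Y = 1]\bigr\}
```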
Related papers
- Scoring Rules and Calibration for Imprecise Probabilities [7.289672463326423]
We argue that proper scoring rules and calibration serve two distinct goals, which are aligned in the precise case but, intriguingly, are not necessarily aligned in the imprecise case.
We demonstrate these theoretical insights in machine learning practice, in particular illustrating subtle pitfalls relating to the choice of loss function in distributional robustness.
arXiv Detail & Related papers (2024-10-30T13:29:47Z)
- Predicting generalization performance with correctness discriminators [64.00420578048855]
We present a novel model that establishes upper and lower bounds on the accuracy, without requiring gold labels for the unseen data.
We show across a variety of tagging, parsing, and semantic parsing tasks that the gold accuracy is reliably between the predicted upper and lower bounds.
arXiv Detail & Related papers (2023-11-15T22:43:42Z)
- BaBE: Enhancing Fairness via Estimation of Latent Explaining Variables [6.7932860553262415]
We consider the problem of unfair discrimination between two groups and propose a pre-processing method to achieve fairness.
BaBE is an approach based on a combination of Bayesian inference and the Expectation-Maximization method.
We show, by experiments on synthetic and real data sets, that our approach provides a good level of fairness as well as high accuracy.
arXiv Detail & Related papers (2023-07-06T09:53:56Z)
- Uncertainty-guided Source-free Domain Adaptation [77.3844160723014]
Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabeled target dataset using only a pre-trained source model.
We propose quantifying the uncertainty in the source model predictions and utilizing it to guide the target adaptation.
arXiv Detail & Related papers (2022-08-16T08:03:30Z)
- VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives [84.48039784446166]
We show that feature importance (FI) supervision can meaningfully improve VQA model accuracy as well as performance on several Right-for-the-Right-Reason metrics.
Our best performing method, Visual Feature Importance Supervision (VisFIS), outperforms strong baselines on benchmark VQA datasets.
Predictions are more accurate when explanations are plausible and faithful, and not when they are plausible but not faithful.
arXiv Detail & Related papers (2022-06-22T17:02:01Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold (a schematic sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Emergent Unfairness in Algorithmic Fairness-Accuracy Trade-Off Research [2.6397379133308214]
We argue that the assumptions commonly made in fairness-accuracy trade-off research, which are often left implicit and unexamined, lead to inconsistent conclusions.
While the intended goal of this work may be to improve the fairness of machine learning models, these unexamined, implicit assumptions can in fact result in emergent unfairness.
arXiv Detail & Related papers (2021-02-01T22:02:14Z)
- Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination [53.3082498402884]
A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair.
We present a framework of fair semi-supervised learning in the pre-processing phase, including pseudo-labeling to predict labels for unlabeled data (a generic pseudo-labeling sketch appears after this list).
A theoretical decomposition analysis of bias, variance and noise highlights the different sources of discrimination and the impact they have on fairness in semi-supervised learning.
arXiv Detail & Related papers (2020-09-25T05:48:56Z)
- Fairness Constraints in Semi-supervised Learning [56.48626493765908]
We develop a framework for fair semi-supervised learning, which is formulated as an optimization problem.
We theoretically analyze the sources of discrimination in semi-supervised learning via a bias, variance and noise decomposition (the standard form of such a decomposition is recalled after this list).
Our method is able to achieve fair semi-supervised learning, and reach a better trade-off between accuracy and fairness than fair supervised learning.
arXiv Detail & Related papers (2020-09-14T04:25:59Z)
- Estimation of Accurate and Calibrated Uncertainties in Deterministic models [0.8702432681310401]
We devise a method to transform a deterministic prediction into a probabilistic one.
We show that, in doing so, one must trade off the accuracy against the reliability (calibration) of such a model.
We show several examples both with synthetic data, where the underlying hidden noise can accurately be recovered, and with large real-world datasets.
arXiv Detail & Related papers (2020-03-11T04:02:56Z)
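As referenced in the ATC entry above, the following is a minimal sketch of the thresholded-confidence idea: learn a confidence threshold on labeled source data, then report the fraction of unlabeled target examples above it. The function names, the quantile-matching rule, and the use of maximum softmax probability as the confidence score are illustrative assumptions, not the paper's exact recipe.
```python
import numpy as np

def fit_threshold(source_confidences: np.ndarray,
                  source_correct: np.ndarray) -> float:
    """Pick a threshold t on labeled source data so that the fraction
    of source examples with confidence above t matches the observed
    source accuracy (one simple calibration rule, assumed here)."""
    source_accuracy = source_correct.mean()
    # Fraction above t equals acc  <=>  t is the (1 - acc)-quantile.
    return float(np.quantile(source_confidences, 1.0 - source_accuracy))

def predict_accuracy(target_confidences: np.ndarray, t: float) -> float:
    """Predicted target accuracy: the fraction of unlabeled target
    examples whose confidence exceeds the learned threshold."""
    return float((target_confidences > t).mean())

# Hypothetical usage, with max softmax probability as the confidence:
#   src_conf = probs_src.max(axis=1)
#   t = fit_threshold(src_conf, preds_src == labels_src)
#   est_acc = predict_accuracy(probs_tgt.max(axis=1), t)
```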
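The fair semi-supervised learning entry above mentions pseudo-labeling; below is a generic sketch of that step (the common recipe, not the paper's specific pre-processing framework). The sklearn-style predict_proba interface and the confidence_floor parameter are assumptions.
```python
import numpy as np

def pseudo_label(model, unlabeled_X: np.ndarray,
                 confidence_floor: float = 0.9):
    """Keep only the unlabeled points that the current model labels
    with high confidence, and return them with their pseudo-labels."""
    probs = model.predict_proba(unlabeled_X)  # shape (n, n_classes); assumed API
    labels = probs.argmax(axis=1)
    confident = probs.max(axis=1) >= confidence_floor
    return unlabeled_X[confident], labels[confident]
```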
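The two semi-supervised fairness entries above analyze discrimination via a bias, variance and noise decomposition. For reference, the standard decomposition of the expected squared error is recalled below; the papers adapt this style of analysis to discrimination measures, whose exact form is not reproduced here.
```latex
% Standard bias-variance-noise decomposition for squared loss, with
% \hat{f} the learned predictor, \bar{f}(x) = \mathbb{E}_D[\hat{f}(x)]
% its average over training sets D, and y^*(x) = \mathbb{E}[y \mid x]:
\mathbb{E}_{D,y}\!\left[(\hat{f}(x) - y)^2\right]
  = \underbrace{\bigl(\bar{f}(x) - y^*(x)\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[(\hat{f}(x) - \bar{f}(x))^2\right]}_{\text{variance}}
  + \underbrace{\mathbb{E}\!\left[(y - y^*(x))^2\right]}_{\text{noise}}
```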
This list is automatically generated from the titles and abstracts of the papers on this site.