Mitigating Bias in Set Selection with Noisy Protected Attributes
- URL: http://arxiv.org/abs/2011.04219v2
- Date: Mon, 22 Feb 2021 17:56:05 GMT
- Title: Mitigating Bias in Set Selection with Noisy Protected Attributes
- Authors: Anay Mehrotra and L. Elisa Celis
- Abstract summary: We show that, in the presence of noisy protected attributes, attempting to increase fairness without accounting for the noise can, in fact, decrease the fairness of the result!
We formulate a ``denoised'' selection problem that works for a large class of fairness metrics.
Our empirical results show that this approach can produce subsets which significantly improve the fairness metrics despite the presence of noisy protected attributes.
- Score: 16.882719401742175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Subset selection algorithms are ubiquitous in AI-driven applications,
including online recruiting portals and image search engines, so it is
imperative that these tools are not discriminatory on the basis of protected
attributes such as gender or race. Currently, fair subset selection algorithms
assume that the protected attributes are known as part of the dataset. However,
protected attributes may be noisy due to errors during data collection or if
they are imputed (as is often the case in real-world settings). While a wide
body of work addresses the effect of noise on the performance of machine
learning algorithms, its effect on fairness remains largely unexamined. We find
that, in the presence of noisy protected attributes, attempting to increase
fairness without accounting for the noise can, in fact, decrease the fairness of
the result!
Towards addressing this, we consider an existing noise model in which there
is probabilistic information about the protected attributes (e.g., [58, 34, 20,
46]), and ask: is fair selection possible under noisy conditions? We formulate a
``denoised'' selection problem which functions for a large class of fairness
metrics; given the desired fairness goal, the solution to the denoised problem
violates the goal by at most a small multiplicative amount with high
probability. Although this denoised problem turns out to be NP-hard, we give a
linear-programming based approximation algorithm for it. We evaluate this
approach on both synthetic and real-world datasets. Our empirical results show
that this approach can produce subsets which significantly improve the fairness
metrics despite the presence of noisy protected attributes, and, compared to
prior noise-oblivious approaches, has better Pareto trade-offs between utility
and fairness.
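The abstract describes the method only at a high level: a probabilistic noise model over protected attributes, a ``denoised'' constrained-selection problem, and a linear-programming based approximation. The following is a minimal, hypothetical sketch of that general idea, not the authors' algorithm; the variable names, the bound format, and the naive rounding step are assumptions introduced purely for illustration.
```python
# Hypothetical sketch (not the paper's algorithm) of a noise-aware fair selection LP:
# maximize expected utility subject to the *expected* number of selected items from
# each protected group lying within given bounds, using only probabilistic (noisy)
# group-membership information.
import numpy as np
from scipy.optimize import linprog

def denoised_selection(utility, group_probs, k, lower, upper):
    """utility:     (n,) item utilities
    group_probs: (n, g) array with P(item i belongs to group j) -- the noise model
    k:           number of items to select
    lower/upper: (g,) bounds on the expected number of selected items per group"""
    n, _ = group_probs.shape
    c = -np.asarray(utility, dtype=float)          # linprog minimizes, so negate
    # Expected group counts: lower_j <= sum_i P_ij * x_i <= upper_j
    A_ub = np.vstack([group_probs.T, -group_probs.T])
    b_ub = np.concatenate([np.asarray(upper, float), -np.asarray(lower, float)])
    # Select exactly k items (in the fractional relaxation).
    A_eq = np.ones((1, n))
    b_eq = np.array([float(k)])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0.0, 1.0)] * n, method="highs")
    # Naive rounding placeholder: keep the k largest fractional values.
    # (The paper gives an approximation algorithm with guarantees; this is not it.)
    return np.argsort(-res.x)[:k]
```
The point the sketch tries to capture is that the fairness constraints are written over expected group counts computed from the probabilistic attribute information, rather than over exact counts derived from possibly erroneous hard labels.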
Related papers
- Fairness Without Harm: An Influence-Guided Active Sampling Approach [32.173195437797766]
We aim to train models that mitigate group fairness disparity without causing harm to model accuracy.
Current data acquisition methods, such as fair active learning approaches, typically require annotating sensitive attributes.
We propose a tractable active data sampling algorithm that does not rely on training group annotations.
arXiv Detail & Related papers (2024-02-20T07:57:38Z)
- Label Noise: Correcting the Forward-Correction [0.0]
Training neural network classifiers on datasets with label noise poses a risk of overfitting them to the noisy labels.
To tackle this overfitting, we propose imposing a lower bound on the training loss (a minimal sketch of this idea appears below).
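The mechanism mentioned above, a lower bound on the training loss, can be illustrated in one line. The snippet shows the generic "flooding"-style version of this idea and is not necessarily the exact correction proposed in that paper; the flood level value is an arbitrary assumption.
```python
# Illustrative only: keep the training loss from dropping below a floor,
# which discourages the network from fitting (possibly noisy) labels exactly.
import torch.nn.functional as F

def lower_bounded_loss(logits, targets, floor=0.05):
    loss = F.cross_entropy(logits, targets)
    # Reflect the loss around the floor so its minimum attainable value is `floor`.
    return (loss - floor).abs() + floor
```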
arXiv Detail & Related papers (2023-07-24T19:41:19Z)
- When Fair Classification Meets Noisy Protected Attributes [8.362098382773265]
This is the first head-to-head study comparing attribute-reliant, noise-tolerant, and attribute-blind fair classification algorithms.
Our study reveals that attribute-blind and noise-tolerant fair classifiers can potentially achieve a level of performance similar to that of attribute-reliant algorithms.
arXiv Detail & Related papers (2023-07-06T21:38:18Z)
- dugMatting: Decomposed-Uncertainty-Guided Matting [83.71273621169404]
We propose a decomposed-uncertainty-guided matting algorithm that uses explicitly decomposed uncertainties to improve results efficiently and effectively.
The proposed matting framework relieves users of the need to specify interaction areas, requiring only simple and efficient labeling.
arXiv Detail & Related papers (2023-06-02T11:19:50Z)
- Group Fairness with Uncertainty in Sensitive Attributes [34.608332397776245]
A fair predictive model is crucial to mitigate biased decisions against minority groups in high-stakes applications.
We propose a bootstrap-based algorithm that achieves the target level of fairness despite the uncertainty in sensitive attributes.
Our algorithm is applicable to both discrete and continuous sensitive attributes and is effective in real-world classification and regression tasks.
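As a generic illustration only (not the cited paper's algorithm, whose robust formulation is not reproduced here), uncertainty in sensitive attributes can be handled by bootstrapping plausible attribute assignments and reporting a conservative quantile of the fairness gap; all names below are hypothetical.
```python
# Generic illustration: bootstrap plausible sensitive-attribute assignments from a
# probabilistic model and report a conservative (upper-quantile) demographic-parity gap.
import numpy as np

def bootstrap_dp_gap(decisions, attr_probs, n_boot=1000, q=0.95, seed=0):
    """decisions:  (n,) binary model decisions
    attr_probs: (n,) P(individual i belongs to the protected group)"""
    rng = np.random.default_rng(seed)
    gaps = []
    for _ in range(n_boot):
        a = rng.random(decisions.shape[0]) < attr_probs   # sampled attribute assignment
        if a.any() and (~a).any():
            gaps.append(abs(decisions[a].mean() - decisions[~a].mean()))
    return float(np.quantile(gaps, q))
```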
arXiv Detail & Related papers (2023-02-16T04:33:00Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the common assumption that the noise distribution should match the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data distribution and even belongs to a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Partial Identification with Noisy Covariates: A Robust Optimization Approach [94.10051154390237]
Causal inference from observational datasets often relies on measuring and adjusting for covariates.
We propose a robust optimization approach and show that it can extend a wide range of causal adjustment methods to perform partial identification.
Across synthetic and real datasets, we find that this approach provides ATE bounds with a higher coverage probability than existing methods.
arXiv Detail & Related papers (2022-02-22T04:24:26Z)
- Classification with abstention but without disparities [5.025654873456756]
We build a general-purpose classification algorithm that is able to abstain from prediction while avoiding disparate impact.
We establish finite sample risk, fairness, and abstention guarantees for the proposed algorithm.
Our method empirically shows that moderate abstention rates make it possible to bypass the risk-fairness trade-off.
arXiv Detail & Related papers (2021-02-24T12:43:55Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements in robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Contextual Linear Bandits under Noisy Features: Towards Bayesian Oracles [65.9694455739978]
We study contextual linear bandit problems under feature uncertainty, where the features are noisy and have missing entries.
Our analysis reveals that the optimal hypothesis can significantly deviate from the underlying realizability function, depending on the noise characteristics.
This implies that classical approaches cannot guarantee a non-trivial regret bound.
arXiv Detail & Related papers (2017-03-03T21:39:56Z)