Group Fairness by Probabilistic Modeling with Latent Fair Decisions
- URL: http://arxiv.org/abs/2009.09031v2
- Date: Thu, 17 Dec 2020 00:28:37 GMT
- Title: Group Fairness by Probabilistic Modeling with Latent Fair Decisions
- Authors: YooJung Choi, Meihua Dang, Guy Van den Broeck
- Abstract summary: This paper studies learning fair probability distributions from biased data by explicitly modeling a latent variable that represents a hidden, unbiased label.
We aim to achieve demographic parity by enforcing certain independencies in the learned model.
We also show that group fairness guarantees are meaningful only if the distribution used to provide those guarantees indeed captures the real-world data.
- Score: 36.20281545470954
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning systems are increasingly being used to make impactful
decisions such as loan applications and criminal justice risk assessments, and
as such, ensuring fairness of these systems is critical. This is often
challenging as the labels in the data are biased. This paper studies learning
fair probability distributions from biased data by explicitly modeling a latent
variable that represents a hidden, unbiased label. In particular, we aim to
achieve demographic parity by enforcing certain independencies in the learned
model. We also show that group fairness guarantees are meaningful only if the
distribution used to provide those guarantees indeed captures the real-world
data. In order to closely model the data distribution, we employ probabilistic
circuits, an expressive and tractable probabilistic model, and propose an
algorithm to learn them from incomplete data. We evaluate our approach on a
synthetic dataset in which observed labels indeed come from fair labels but
with added bias, and demonstrate that the fair labels are successfully
retrieved. Moreover, we show on real-world datasets that our approach is not
only a better model of how the data was generated than existing methods but
also achieves competitive accuracy.
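The demographic parity criterion mentioned in the abstract requires that the rate of positive decisions be the same across sensitive groups, i.e. that the decision be independent of the sensitive attribute. The following is a minimal sketch of how that criterion can be checked empirically on a list of decisions. It is not the paper's probabilistic-circuit method; the function names `positive_rates` and `demographic_parity_gap` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def positive_rates(decisions, groups):
    """Return the fraction of positive (1) decisions within each group."""
    counts = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        counts[group] += 1
        positives[group] += int(decision)
    return {group: positives[group] / counts[group] for group in counts}


def demographic_parity_gap(decisions, groups):
    """Largest difference in positive-decision rate between any two groups.

    A gap of 0 means the decisions satisfy demographic parity on this sample.
    """
    rates = positive_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())


# Toy data (hypothetical): group "a" receives 3/4 positive decisions,
# group "b" receives 1/4.
decisions = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_gap(decisions, groups)
print(gap)  # 0.5
```

A gap near zero on held-out data suggests that the decisions are approximately independent of group membership; as the abstract notes, such a guarantee is only meaningful if the distribution it is computed on reflects the real-world data.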
Related papers
- Achievable Fairness on Your Data With Utility Guarantees [16.78730663293352]
In machine learning fairness, training models that minimize disparity across different sensitive groups often leads to diminished accuracy.
We present a computationally efficient approach to approximate the fairness-accuracy trade-off curve tailored to individual datasets.
We introduce a novel methodology for quantifying uncertainty in our estimates, thereby providing practitioners with a robust framework for auditing model fairness.
arXiv Detail & Related papers (2023-12-13T23:14:55Z)
- Fair Active Learning in Low-Data Regimes [22.349886628823125]
In machine learning applications, ensuring fairness is essential to avoid perpetuating social inequities.
In this work, we address the challenges of reducing bias and improving accuracy in data-scarce environments.
We introduce an innovative active learning framework that combines an exploration procedure inspired by posterior sampling with a fair classification subroutine.
We demonstrate that this framework performs effectively in very data-scarce regimes, maximizing accuracy while satisfying fairness constraints with high probability.
arXiv Detail & Related papers (2023-03-15T07:13:54Z)
- DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
Our model jointly optimizes for two fairness criteria: group fairness and counterfactual fairness.
arXiv Detail & Related papers (2022-10-24T13:04:07Z)
- Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data [1.76179873429447]
We propose a data preprocessing technique that can detect instances ascribing a specific kind of bias that should be removed from the dataset before training.
In particular, we claim that in problem settings where instances exist with similar features but different labels caused by variation in protected attributes, an inherent bias is induced in the dataset.
arXiv Detail & Related papers (2021-07-07T21:12:55Z)
- Impossibility results for fair representations [12.483260526189447]
We argue that no representation can guarantee the fairness of classifiers for different tasks trained using it.
More refined notions of fairness, like Odds Equality, cannot be guaranteed by a representation that does not take into account the task-specific labeling rule.
arXiv Detail & Related papers (2021-01-02T02:11:37Z)
- Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2020-12-01T00:49:17Z)
- Fair Densities via Boosting the Sufficient Statistics of Exponential Families [72.34223801798422]
We introduce a boosting algorithm to pre-process data for fairness.
Our approach shifts towards better data fitting while still ensuring a minimal fairness guarantee.
Empirical results are presented to demonstrate the quality of the results on real-world data.
arXiv Detail & Related papers (2020-10-11T04:42:01Z)
- Robust Fairness under Covariate Shift [11.151913007808927]
Making predictions that are fair with regard to protected group membership has become an important requirement for classification algorithms.
We propose an approach that obtains a predictor that is robust to the worst case in terms of target performance.
arXiv Detail & Related papers (2020-09-25T05:48:56Z)
- Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination [53.3082498402884]
A growing concern with the rise of machine learning is whether the decisions made by machine learning models are fair.
We present a framework of fair semi-supervised learning in the pre-processing phase, including pseudo labeling to predict labels for unlabeled data.
A theoretical decomposition analysis of bias, variance and noise highlights the different sources of discrimination and the impact they have on fairness in semi-supervised learning.
arXiv Detail & Related papers (2020-08-21T14:14:44Z)
- Beyond Individual and Group Fairness [90.4666341812857]
We present a new data-driven model of fairness that is guided by the unfairness complaints received by the system.
Our model supports multiple fairness criteria and takes into account their potential incompatibilities.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.