Adapting Fairness Interventions to Missing Values
- URL: http://arxiv.org/abs/2305.19429v2
- Date: Fri, 10 Nov 2023 07:15:29 GMT
- Title: Adapting Fairness Interventions to Missing Values
- Authors: Raymond Feng, Flavio P. Calmon, Hao Wang
- Abstract summary: Missing values in real-world data pose a significant and unique challenge to algorithmic fairness.
The standard procedure for handling missing values, in which the data is first imputed and the imputed data is then used for classification, can exacerbate discrimination.
We present scalable and adaptive algorithms for fair classification with missing values.
- Score: 4.820576346277399
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Missing values in real-world data pose a significant and unique challenge to
algorithmic fairness. Different demographic groups may be unequally affected by
missing data, and the standard procedure for handling missing values, in which
the data is first imputed and the imputed data is then used for classification
-- a procedure referred to as "impute-then-classify" -- can exacerbate
discrimination. In this paper, we analyze how missing values affect algorithmic
fairness. We first prove that training a classifier from imputed data can
significantly worsen the achievable values of group fairness and average
accuracy. This is because imputing data results in the loss of the missing
pattern of the data, which often conveys information about the predictive
label. We present scalable and adaptive algorithms for fair classification with
missing values. These algorithms can be combined with any preexisting
fairness-intervention algorithm to handle all possible missing patterns while
preserving information encoded within the missing patterns. Numerical
experiments with state-of-the-art fairness interventions demonstrate that our
adaptive algorithms consistently achieve higher fairness and accuracy than
impute-then-classify across different datasets.
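To make the abstract's contrast concrete, here is a minimal sketch in which the missing pattern itself carries label information. Mean imputation, the missingness-indicator encoding, and logistic regression are illustrative stand-ins, not the adaptive algorithms proposed in the paper.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data where the label depends on WHETHER feature 0 is missing.
rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 3))
missing = rng.random(n) < 0.4                      # rows that lose feature 0
y = ((X[:, 1] + 2.0 * missing) > 0.8).astype(int)  # label uses the pattern
X[missing, 0] = np.nan

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Impute-then-classify: imputation erases the missing pattern.
imp = SimpleImputer(strategy="mean")
clf_base = LogisticRegression().fit(imp.fit_transform(X_tr), y_tr)

# Pattern-preserving variant: add_indicator=True appends binary columns
# marking which values were missing, so the downstream classifier (or a
# fairness intervention wrapped around it) can still see the pattern.
imp_ind = SimpleImputer(strategy="mean", add_indicator=True)
clf_ind = LogisticRegression().fit(imp_ind.fit_transform(X_tr), y_tr)

print("impute-then-classify:", clf_base.score(imp.transform(X_te), y_te))
print("pattern-preserving:  ", clf_ind.score(imp_ind.transform(X_te), y_te))
```

Because the label here depends on whether a value was observed, the indicator-augmented classifier scores markedly higher, mirroring the information-loss argument above.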
Related papers
- Targeted Learning for Data Fairness [52.59573714151884]
We expand fairness inference by evaluating fairness in the data generating process itself.
We derive estimators for demographic parity, equal opportunity, and conditional mutual information.
To validate our approach, we perform several simulations and apply our estimators to real data.
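For reference, the plain empirical plug-in versions of two of these criteria are sketched below; the paper's targeted-learning estimators refine such baselines, so treat this only as a naive point of comparison.

```python
import numpy as np

def demographic_parity_gap(y_hat, a):
    """|P(Y_hat=1 | A=1) - P(Y_hat=1 | A=0)| for binary predictions/groups."""
    y_hat, a = np.asarray(y_hat), np.asarray(a)
    return abs(y_hat[a == 1].mean() - y_hat[a == 0].mean())

def equal_opportunity_gap(y_hat, y, a):
    """|P(Y_hat=1 | Y=1, A=1) - P(Y_hat=1 | Y=1, A=0)|."""
    y_hat, y, a = map(np.asarray, (y_hat, y, a))
    pos = y == 1
    return abs(y_hat[pos & (a == 1)].mean() - y_hat[pos & (a == 0)].mean())
```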
arXiv Detail & Related papers (2025-02-06T18:51:28Z)
- Fairness Without Harm: An Influence-Guided Active Sampling Approach [32.173195437797766]
We aim to train models that mitigate group fairness disparity without causing harm to model accuracy.
Current data acquisition methods, such as fair active learning, typically require annotating sensitive attributes.
We propose a tractable active data sampling algorithm that does not rely on training group annotations.
arXiv Detail & Related papers (2024-02-20T07:57:38Z)
- Correcting Underrepresentation and Intersectional Bias for Classification [49.1574468325115]
We consider the problem of learning from data corrupted by underrepresentation bias.
We show that with a small amount of unbiased data, we can efficiently estimate the group-wise drop-out rates.
We show that our algorithm permits efficient learning for model classes of finite VC dimension.
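As a hedged illustration of the frequency-ratio idea (an assumption here, not the paper's estimator): if group g survives the biased collection process with retention probability beta_g, its frequency in the biased data is scaled by beta_g, so comparing frequencies against a small unbiased sample recovers the drop-out rates up to a common normalization.

```python
from collections import Counter

def dropout_rates(groups_biased, groups_unbiased):
    """Estimate per-group drop-out rates by comparing group frequencies."""
    nb, nu = len(groups_biased), len(groups_unbiased)
    cb, cu = Counter(groups_biased), Counter(groups_unbiased)
    # The frequency ratio is proportional to the retention probability beta_g.
    ratios = {g: (cb[g] / nb) / (cu[g] / nu) for g in cu}
    scale = max(ratios.values())        # normalize: most-retained group -> 1
    return {g: 1 - r / scale for g, r in ratios.items()}

# Group "b" makes up 50% of the unbiased sample but only 10% of the
# biased data, so its estimated drop-out rate is high (~0.89).
print(dropout_rates(["a"] * 900 + ["b"] * 100, ["a"] * 500 + ["b"] * 500))
```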
arXiv Detail & Related papers (2023-06-19T18:25:44Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Understanding Unfairness in Fraud Detection through Model and Data Bias Interactions [4.159343412286401]
We argue that algorithmic unfairness stems from interactions between models and biases in the data.
We study a set of hypotheses regarding the fairness-accuracy trade-offs that fairness-blind ML algorithms exhibit under different data bias settings.
arXiv Detail & Related papers (2022-07-13T15:18:30Z)
- Classification of datasets with imputed missing values: does imputation quality matter? [2.7646249774183]
Classifying samples in incomplete datasets is non-trivial.
We demonstrate how the commonly used measures for assessing quality are flawed.
We propose a new class of discrepancy scores which focus on how well the method recreates the overall distribution of the data.
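One example of a distribution-level score, as opposed to per-value imputation error, is the energy distance between fully observed data and its imputed counterpart; the specific discrepancy class proposed in the paper may differ, so this is purely illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

def energy_distance(X, Y):
    """2*E||X-Y|| - E||X-X'|| - E||Y-Y'|| between samples X and Y."""
    return 2 * cdist(X, Y).mean() - cdist(X, X).mean() - cdist(Y, Y).mean()

# Crude mean imputation distorts the marginal of feature 0, which a
# distribution-level score picks up even if per-value error looks small.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
X_imp = X.copy()
X_imp[rng.random(500) < 0.3, 0] = X[:, 0].mean()
print(energy_distance(X, X_imp))  # > 0: distribution not fully recreated
```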
arXiv Detail & Related papers (2022-06-16T22:58:03Z)
- Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm that is able to map individuals belonging to different groups into a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z)
- Greedy structure learning from data that contains systematic missing values [13.088541054366527]
Learning from data that contain missing values is common in many domains.
Relatively few Bayesian Network structure learning algorithms account for missing data.
This paper describes three variants of greedy search structure learning that utilise pairwise deletion and inverse probability weighting.
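A minimal sketch of the inverse-probability-weighting step is given below, assuming observation probabilities can be modeled from always-observed covariates; the greedy search itself is omitted, and the helper is hypothetical rather than the paper's exact procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_weights(X, always_observed_cols):
    """Weight each complete case by 1 / P(row fully observed)."""
    observed = ~np.isnan(X).any(axis=1)     # complete cases only
    Z = X[:, always_observed_cols]          # covariates with no missing values
    p = LogisticRegression().fit(Z, observed).predict_proba(Z)[:, 1]
    w = np.zeros(len(X))
    w[observed] = 1.0 / p[observed]         # up-weight under-observed patterns
    return w  # feed into weighted counts/scores inside the structure search
```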
arXiv Detail & Related papers (2021-07-09T02:56:44Z)
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Fast Fair Regression via Efficient Approximations of Mutual Information [0.0]
This paper introduces fast approximations of the independence, separation and sufficiency group fairness criteria for regression models.
It uses such approximations as regularisers to enforce fairness within a regularised risk minimisation framework.
Experiments on real-world datasets indicate that, in spite of its superior computational efficiency, our algorithm still displays state-of-the-art accuracy/fairness tradeoffs.
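As one cheap approximation of the independence criterion (an assumption here, not necessarily the paper's estimator): for jointly Gaussian variables, I(Y_hat; A) = -0.5 * log(1 - rho^2), where rho is the Pearson correlation, which yields a simple correlation-based regulariser for the risk-minimisation objective.

```python
import numpy as np

def gaussian_mi_penalty(y_hat, a, eps=1e-8):
    """Mutual information under a joint-Gaussian assumption."""
    rho = np.corrcoef(y_hat, a)[0, 1]       # Pearson correlation
    return -0.5 * np.log(1.0 - rho ** 2 + eps)  # eps guards log(0)

def fair_regression_objective(y_hat, y, a, lam=1.0):
    risk = np.mean((y_hat - y) ** 2)        # squared-error risk
    return risk + lam * gaussian_mi_penalty(y_hat, a)  # regularised risk
```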
arXiv Detail & Related papers (2020-02-14T08:50:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.