Fairness Testing through Extreme Value Theory
- URL: http://arxiv.org/abs/2501.11597v1
- Date: Mon, 20 Jan 2025 16:56:10 GMT
- Title: Fairness Testing through Extreme Value Theory
- Authors: Verya Monjezi, Ashutosh Trivedi, Vladik Kreinovich, Saeid Tizpaz-Niari,
- Abstract summary: We propose a novel fairness criterion called extreme counterfactual discrimination (ECD)
ECD estimates the worst-case amounts of disadvantage in outcomes for individuals solely based on their memberships in a protected group.
We present a randomized algorithm that samples a statistically significant set of points from the tail of ML outcome distributions.
- Score: 7.980954450762595
- License:
- Abstract: Data-driven software is increasingly being used as a critical component of automated decision-support systems. Since this class of software learns its logic from historical data, it can encode or amplify discriminatory practices. Previous research on algorithmic fairness has focused on improving average-case fairness. On the other hand, fairness at the extreme ends of the spectrum, which often signifies lasting and impactful shifts in societal attitudes, has received significantly less emphasis. Leveraging the statistics of extreme value theory (EVT), we propose a novel fairness criterion called extreme counterfactual discrimination (ECD). This criterion estimates the worst-case amounts of disadvantage in outcomes for individuals solely based on their memberships in a protected group. Utilizing tools from search-based software engineering and generative AI, we present a randomized algorithm that samples a statistically significant set of points from the tail of ML outcome distributions even if the input dataset lacks a sufficient number of relevant samples. We conducted several experiments on four ML models (deep neural networks, logistic regression, and random forests) over 10 socially relevant tasks from the literature on algorithmic fairness. First, we evaluate the generative AI methods and find that they generate sufficient samples to infer valid EVT distribution in 95% of cases. Remarkably, we found that the prevalent bias mitigators reduce the average-case discrimination but increase the worst-case discrimination significantly in 5% of cases. We also observed that even the tail-aware mitigation algorithm -- MiniMax-Fairness -- increased the worst-case discrimination in 30% of cases. We propose a novel ECD-based mitigator that improves fairness in the tail in 90% of cases with no degradation of the average-case discrimination.
Related papers
- Targeted Learning for Data Fairness [52.59573714151884]
We expand fairness inference by evaluating fairness in the data generating process itself.
We derive estimators demographic parity, equal opportunity, and conditional mutual information.
To validate our approach, we perform several simulations and apply our estimators to real data.
arXiv Detail & Related papers (2025-02-06T18:51:28Z) - Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods.
We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results.
We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z) - Fairness Without Harm: An Influence-Guided Active Sampling Approach [32.173195437797766]
We aim to train models that mitigate group fairness disparity without causing harm to model accuracy.
The current data acquisition methods, such as fair active learning approaches, typically require annotating sensitive attributes.
We propose a tractable active data sampling algorithm that does not rely on training group annotations.
arXiv Detail & Related papers (2024-02-20T07:57:38Z) - Chasing Fairness Under Distribution Shift: A Model Weight Perturbation
Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation.
We then analyze the sufficient conditions to guarantee fairness for the target dataset.
Motivated by these sufficient conditions, we propose robust fairness regularization (RFR)
arXiv Detail & Related papers (2023-03-06T17:19:23Z) - Aleatoric and Epistemic Discrimination: Fundamental Limits of Fairness Interventions [13.279926364884512]
Machine learning models can underperform on certain population groups due to choices made during model development and bias inherent in the data.
We quantify aleatoric discrimination by determining the performance limits of a model under fairness constraints.
We quantify epistemic discrimination as the gap between a model's accuracy when fairness constraints are applied and the limit posed by aleatoric discrimination.
arXiv Detail & Related papers (2023-01-27T15:38:20Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling
Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z) - Fairness-Aware Naive Bayes Classifier for Data with Multiple Sensitive
Features [0.0]
We generalise two-naive-Bayes (2NB) into N-naive-Bayes (NNB) to eliminate the simplification of assuming only two sensitive groups in the data.
We investigate its application on data with multiple sensitive features and propose a new constraint and post-processing routine to enforce differential fairness.
arXiv Detail & Related papers (2022-02-23T13:32:21Z) - Normalise for Fairness: A Simple Normalisation Technique for Fairness in Regression Machine Learning Problems [46.93320580613236]
We present a simple, yet effective method based on normalisation (FaiReg) for regression problems.
We compare it with two standard methods for fairness, namely data balancing and adversarial training.
The results show the superior performance of diminishing the effects of unfairness better than data balancing.
arXiv Detail & Related papers (2022-02-02T12:26:25Z) - Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and of agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z) - Data Augmentation Imbalance For Imbalanced Attribute Classification [60.71438625139922]
We propose a new re-sampling algorithm called: data augmentation imbalance (DAI) to explicitly enhance the ability to discriminate the fewer attributes.
Our DAI algorithm achieves state-of-the-art results, based on pedestrian attribute datasets.
arXiv Detail & Related papers (2020-04-19T20:43:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.