Related papers: Achievable Fairness on Your Data With Utility Guarantees

URL: http://arxiv.org/abs/2402.17106v4
Date: Sat, 09 Nov 2024 15:34:31 GMT
Title: Achievable Fairness on Your Data With Utility Guarantees
Authors: Muhammad Faaiz Taufiq, Jean-Francois Ton, Yang Liu
Abstract summary: In machine learning fairness, training models that minimize disparity across different sensitive groups often leads to diminished accuracy. We present a computationally efficient approach to approximate the fairness-accuracy trade-off curve tailored to individual datasets. We introduce a novel methodology for quantifying uncertainty in our estimates, thereby providing practitioners with a robust framework for auditing model fairness.
Score: 16.78730663293352
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In machine learning fairness, training models that minimize disparity across different sensitive groups often leads to diminished accuracy, a phenomenon known as the fairness-accuracy trade-off. The severity of this trade-off inherently depends on dataset characteristics such as dataset imbalances or biases and therefore, using a uniform fairness requirement across diverse datasets remains questionable. To address this, we present a computationally efficient approach to approximate the fairness-accuracy trade-off curve tailored to individual datasets, backed by rigorous statistical guarantees. By utilizing the You-Only-Train-Once (YOTO) framework, our approach mitigates the computational burden of having to train multiple models when approximating the trade-off curve. Crucially, we introduce a novel methodology for quantifying uncertainty in our estimates, thereby providing practitioners with a robust framework for auditing model fairness while avoiding false conclusions due to estimation errors. Our experiments spanning tabular (e.g., Adult), image (CelebA), and language (Jigsaw) datasets underscore that our approach not only reliably quantifies the optimum achievable trade-offs across various data modalities but also helps detect suboptimality in SOTA fairness methods.
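To make the two axes of the fairness-accuracy trade-off concrete, the following minimal sketch computes accuracy and a demographic parity gap for a few decision thresholds on toy data. It is not the paper's YOTO-based method and provides no statistical guarantees; the threshold sweep, the function names (`accuracy`, `demographic_parity_gap`, `trade_off_points`), and the data are all hypothetical, chosen only to illustrate how lowering disparity can cost accuracy.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between groups 0 and 1.

    Assumes a binary sensitive attribute and at least one example per group.
    """
    rates = []
    for g in (0, 1):
        preds = [p for p, a in zip(y_pred, group) if a == g]
        rates.append(sum(preds) / len(preds))
    return abs(rates[0] - rates[1])


def trade_off_points(scores, y_true, group, thresholds):
    """Return (threshold, accuracy, disparity) for each decision threshold.

    Sweeping a threshold is an illustrative stand-in (an assumption), not the
    regularization-weight sweep a fairness-aware training method would use.
    """
    points = []
    for t in thresholds:
        y_pred = [int(s >= t) for s in scores]
        points.append(
            (t, accuracy(y_true, y_pred), demographic_parity_gap(y_pred, group))
        )
    return points


# Hypothetical toy data: model scores, true labels, binary sensitive attribute.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

points = trade_off_points(scores, y_true, group, [0.5, 0.85, 0.95])
for t, acc, gap in points:
    print(f"threshold={t:.2f}  accuracy={acc:.3f}  disparity={gap:.3f}")
```

On this toy data, the disparity falls from 0.5 to 0.0 as the threshold rises, while accuracy drops from 1.0 to 0.5. How steep that exchange is depends on the data, which is the dataset dependence the abstract describes.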

Related papers

Fair Bayesian Data Selection via Generalized Discrepancy Measures [11.013077130984973]
We propose a data selection framework that ensures fairness by aligning group-specific posterior distributions of model parameters and sample weights with a shared central distribution. Our framework supports flexible alignment via various distributional discrepancy measures, including Wasserstein distance, maximum mean discrepancy, and $f$-divergence. Experiments on benchmark datasets show that our method consistently outperforms existing data selection and model-based fairness methods in both fairness and accuracy.
arXiv Detail & Related papers (2025-11-10T12:28:04Z)
The Statistical Fairness-Accuracy Frontier [50.323024516295725]
Machine learning models must balance accuracy and fairness, but these goals often conflict. A useful tool for understanding this trade-off is the fairness-accuracy frontier, which characterizes the set of models that cannot be simultaneously improved in both fairness and accuracy. We study the FA frontier in the finite-sample regime, showing how it deviates from its population counterpart and quantifying the worst-case gap between them.
arXiv Detail & Related papers (2025-08-25T03:01:35Z)
Fairness Regularization in Federated Learning [1.4773243280881763]
Federated Learning (FL) has emerged as a vital paradigm in modern machine learning. This work focuses on performance equitable fairness, which aims to minimize differences in performance across clients. We empirically show that FairGrad (approximate) and FairGrad* (exact) improve both fairness and overall model performance in heterogeneous data settings.
arXiv Detail & Related papers (2025-08-16T13:32:41Z)
Targeted Learning for Data Fairness [52.59573714151884]
We expand fairness inference by evaluating fairness in the data generating process itself. We derive estimators of demographic parity, equal opportunity, and conditional mutual information. To validate our approach, we perform several simulations and apply our estimators to real data.
arXiv Detail & Related papers (2025-02-06T18:51:28Z)
Noise-Adaptive Conformal Classification with Marginal Coverage [53.74125453366155]
We introduce an adaptive conformal inference method capable of efficiently handling deviations from exchangeability caused by random label noise. We validate our method through extensive numerical experiments demonstrating its effectiveness on synthetic and real data sets.
arXiv Detail & Related papers (2025-01-29T23:55:23Z)
A Conformal Approach to Feature-based Newsvendor under Model Misspecification [2.801095519296785]
We propose a model-free and distribution-free framework inspired by conformal prediction. We validate our framework using both simulated data and a real-world dataset from the Capital Bikeshare program in Washington, D.C.
arXiv Detail & Related papers (2024-12-17T18:34:43Z)
Navigating Towards Fairness with Data Selection [27.731128352096555]
We introduce a data selection method designed to efficiently and flexibly mitigate label bias. Our approach utilizes a zero-shot predictor as a proxy model that simulates training on a clean holdout set. Our modality-agnostic method has proven efficient and effective in handling label bias and improving fairness across diverse datasets in experimental evaluations.
arXiv Detail & Related papers (2024-12-15T06:11:05Z)
Enhancing Fairness in Neural Networks Using FairVIC [0.0]
Mitigating bias in automated decision-making systems, specifically deep learning models, is a critical challenge in achieving fairness. We introduce FairVIC, an innovative approach designed to enhance fairness in neural networks by addressing inherent biases at the training stage. We observe a significant improvement in fairness across all metrics tested, without compromising the model's accuracy to a detrimental extent.
arXiv Detail & Related papers (2024-04-28T10:10:21Z)
Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification. We propose a risk-consistent approach to tackle this problem and show that the estimation error bound achieves the optimal convergence rate. We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z)
Chasing Fairness Under Distribution Shift: A Model Weight Perturbation Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation. We then analyze the sufficient conditions to guarantee fairness for the target dataset. Motivated by these sufficient conditions, we propose robust fairness regularization (RFR).
arXiv Detail & Related papers (2023-03-06T17:19:23Z)
Learning Antidote Data to Individual Unfairness [23.119278763970037]
Individual fairness is a vital notion to describe fair treatment for individual cases. Previous studies characterize individual fairness as a prediction-invariant problem. We show our method resists individual unfairness at a minimal or zero cost to predictive utility.
arXiv Detail & Related papers (2022-11-29T03:32:39Z)
Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data [1.76179873429447]
We propose a data preprocessing technique that can detect instances ascribing a specific kind of bias that should be removed from the dataset before training. In particular, we claim that in problem settings where instances exist with similar features but different labels caused by variation in protected attributes, an inherent bias is induced in the dataset.
arXiv Detail & Related papers (2022-10-24T13:04:07Z)
Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions. In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data. We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method. We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
Beyond Individual and Group Fairness [90.4666341812857]
We present a new data-driven model of fairness that is guided by the unfairness complaints received by the system. Our model supports multiple fairness criteria and takes into account their potential incompatibilities.
arXiv Detail & Related papers (2020-08-21T14:14:44Z)
Accuracy and Fairness Trade-offs in Machine Learning: A Stochastic Multi-Objective Approach [0.0]
In the application of machine learning to real-life decision-making systems, the prediction outcomes might discriminate against people with sensitive attributes, leading to unfairness. The commonly used strategy in fair machine learning is to include fairness as a constraint or a penalization term in the minimization of the prediction loss. In this paper, we introduce a new approach to handle fairness by formulating a multi-objective optimization problem.
arXiv Detail & Related papers (2020-08-03T18:51:24Z)
Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management. We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.