Balancing Fairness and Accuracy in Data-Restricted Binary Classification
- URL: http://arxiv.org/abs/2403.07724v1
- Date: Tue, 12 Mar 2024 15:01:27 GMT
- Title: Balancing Fairness and Accuracy in Data-Restricted Binary Classification
- Authors: Zachary McBride Lazri, Danial Dervovic, Antigoni Polychroniadou, Ivan
Brugere, Dana Dachman-Soled, and Min Wu
- Abstract summary: This paper proposes a framework that models the trade-off between accuracy and fairness under four practical scenarios.
Experiments on three datasets demonstrate the utility of the proposed framework as a tool for quantifying the trade-offs.
- Score: 14.439413517433891
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Applications that deal with sensitive information may have restrictions
placed on the data available to a machine learning (ML) classifier. For
example, in some applications, a classifier may not have direct access to
sensitive attributes, affecting its ability to produce accurate and fair
decisions. This paper proposes a framework that models the trade-off between
accuracy and fairness under four practical scenarios that dictate the type of
data available for analysis. Prior works examine this trade-off by analyzing
the outputs of a scoring function that has been trained to implicitly learn the
underlying distribution of the feature vector, class label, and sensitive
attribute of a dataset. In contrast, our framework directly analyzes the
behavior of the optimal Bayesian classifier on this underlying distribution by
constructing a discrete approximation of it from the dataset itself. This approach
enables us to formulate multiple convex optimization problems, which allow us
to answer the question: How is the accuracy of a Bayesian classifier affected
in different data restricting scenarios when constrained to be fair? Analysis
is performed on a set of fairness definitions that include group and individual
fairness. Experiments on three datasets demonstrate the utility of the proposed
framework as a tool for quantifying the trade-offs among different fairness
notions and their distributional dependencies.
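To make the formulation concrete, the sketch below sets up one such problem: maximize the expected accuracy of a randomized per-cell classifier over a discretized joint distribution, subject to a demographic-parity constraint. This is a minimal illustration, not the paper's construction; the distribution `p`, the cell count, and the tolerance `eps` are placeholder values.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n_cells = 20  # cells of the discrete approximation of the feature space

# p[i, a, y]: joint probability of (feature cell i, sensitive attribute a, label y).
# A random distribution stands in for the histogram estimated from a real dataset.
p = rng.dirichlet(np.ones(n_cells * 4)).reshape(n_cells, 2, 2)

f = cp.Variable(n_cells)  # P(Yhat = 1 | cell i): a randomized classifier per cell

# Expected accuracy of the randomized classifier under the discrete distribution.
acc = cp.sum(cp.multiply(f, p[:, :, 1].sum(axis=1))
             + cp.multiply(1 - f, p[:, :, 0].sum(axis=1)))

# Demographic parity: group-conditional positive rates must differ by at most eps.
pa = p.sum(axis=2)  # pa[i, a] = P(cell i, A = a)
rate0 = (f @ pa[:, 0]) / pa[:, 0].sum()
rate1 = (f @ pa[:, 1]) / pa[:, 1].sum()
eps = 0.02

prob = cp.Problem(cp.Maximize(acc), [f >= 0, f <= 1, cp.abs(rate0 - rate1) <= eps])
prob.solve()
print(f"accuracy under demographic parity: {prob.value:.3f}")
```

Because the objective and constraints are affine in the per-cell decision probabilities, the problem is convex (here, a linear program), which is what makes the accuracy-fairness trade-off tractable to trace on the discrete approximation.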
Related papers
- Fairness Without Harm: An Influence-Guided Active Sampling Approach [32.173195437797766]
We aim to train models that mitigate group fairness disparity without causing harm to model accuracy.
The current data acquisition methods, such as fair active learning approaches, typically require annotating sensitive attributes.
We propose a tractable active data sampling algorithm that does not rely on training group annotations.
arXiv Detail & Related papers (2024-02-20T07:57:38Z)
- Chasing Fairness Under Distribution Shift: A Model Weight Perturbation Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation.
We then analyze the sufficient conditions to guarantee fairness for the target dataset.
Motivated by these sufficient conditions, we propose robust fairness regularization (RFR)
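The summary does not spell RFR out, so the following is only a SAM-style sketch of the weight-perturbation idea: penalize the demographic-parity gap evaluated at adversarially perturbed weights. The helper names (`fairness_gap`, `rho`, `lam`) are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

def fairness_gap(logits, a):
    # Demographic-parity gap: difference in mean positive scores between groups.
    scores = torch.sigmoid(logits).squeeze(-1)
    return (scores[a == 0].mean() - scores[a == 1].mean()).abs()

def rfr_step(model, opt, x, y, a, rho=0.05, lam=1.0):
    params = [p for p in model.parameters() if p.requires_grad]
    # 1) Ascent direction: gradient of the fairness gap at the current weights.
    grads = torch.autograd.grad(fairness_gap(model(x), a), params)
    scale = rho / (torch.sqrt(sum(g.pow(2).sum() for g in grads)) + 1e-12)
    with torch.no_grad():
        for p_, g in zip(params, grads):
            p_.add_(scale * g)  # move to the (approximately) worst-case weights
    # 2) Task loss plus the fairness gap at the perturbed weights.
    logits = model(x)  # model assumed to output shape (N, 1)
    loss = nn.functional.binary_cross_entropy_with_logits(
        logits.squeeze(-1), y.float()) + lam * fairness_gap(logits, a)
    opt.zero_grad()
    loss.backward()  # gradients taken at the perturbed point
    with torch.no_grad():
        for p_, g in zip(params, grads):
            p_.sub_(scale * g)  # restore the original weights before stepping
    opt.step()
    return float(loss)
```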
arXiv Detail & Related papers (2023-03-06T17:19:23Z)
- Explaining Cross-Domain Recognition with Interpretable Deep Classifier [100.63114424262234]
The Interpretable Deep Classifier (IDC) learns the nearest source samples of a target sample as evidence upon which the classifier makes its decision.
Our IDC leads to a more explainable model with almost no accuracy degradation and effectively calibrates classification for optimum reject options.
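As a loose stand-in for the mechanism, the sketch below uses plain Euclidean k-NN to retrieve source samples as evidence and abstains when they disagree; IDC itself learns which source samples count as evidence.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def predict_with_evidence(src_X, src_y, tgt_X, k=5, agreement=0.8):
    """Classify each target sample from its k nearest source samples (the
    'evidence'), abstaining when the evidence disagrees -- a crude stand-in
    for IDC's learned evidence mechanism."""
    index = NearestNeighbors(n_neighbors=k).fit(src_X)
    _, idx = index.kneighbors(tgt_X)      # (n_tgt, k) indices of source evidence
    votes = src_y[idx].mean(axis=1)       # mean of the binary evidence labels
    conf = np.abs(votes - 0.5) * 2        # evidence agreement in [0, 1]
    preds = np.where(conf >= agreement, (votes >= 0.5).astype(int), -1)  # -1 = reject
    return preds, idx                     # idx exposes the supporting evidence
```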
arXiv Detail & Related papers (2022-11-15T15:58:56Z)
- Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data [1.76179873429447]
We propose a data preprocessing technique that can detect instances exhibiting a specific kind of bias, which should be removed from the dataset before training.
In particular, we claim that in problem settings where instances exist with similar features but different labels caused by variation in protected attributes, an inherent bias is induced in the dataset.
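A toy version of that check follows, assuming exact collisions on the non-protected features stand in for the paper's similarity criterion (the column names are hypothetical).

```python
import pandas as pd

def drop_biased_instances(df, features, label="y", protected="a"):
    """Remove instances whose non-protected features collide with another
    instance carrying a different label and a different protected value."""
    biased = df.groupby(features).filter(
        lambda g: g[label].nunique() > 1 and g[protected].nunique() > 1)
    return df.drop(index=biased.index)

# Hypothetical usage: clean = drop_biased_instances(df, ["age", "income"])
```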
arXiv Detail & Related papers (2022-10-24T13:04:07Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
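The mechanism fits in a few lines. The sketch below assumes max-softmax confidences and a labeled source validation set; the paper also considers other confidence scores (e.g., negative entropy).

```python
import numpy as np

def atc_estimate(val_conf, val_correct, target_conf):
    """Average Thresholded Confidence (ATC), sketched.

    val_conf:     confidences on labeled source-validation examples
    val_correct:  0/1 correctness of the model on those examples
    target_conf:  confidences on unlabeled target examples
    """
    # Choose t so that the fraction of validation confidences above t
    # equals the validation accuracy ...
    t = np.quantile(val_conf, 1.0 - val_correct.mean())
    # ... then estimate target accuracy as the fraction of target
    # confidences above t.
    return float((target_conf > t).mean())
```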
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
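A bare-bones version of both strategies in plain NumPy is sketched below; in practice, libraries such as imbalanced-learn provide these plus smarter variants (e.g., SMOTE).

```python
import numpy as np

def random_resample(X, y, strategy="over", seed=0):
    """Balance a binary dataset by randomly oversampling the minority class
    or undersampling the majority class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    majority = classes[np.argmax(counts)]
    if strategy == "over":   # duplicate minority examples up to the majority count
        extra = rng.choice(np.flatnonzero(y == minority),
                           size=counts.max() - counts.min(), replace=True)
        keep = np.concatenate([np.arange(len(y)), extra])
    else:                    # keep a minority-sized subset of the majority class
        maj = rng.choice(np.flatnonzero(y == majority),
                         size=counts.min(), replace=False)
        keep = np.concatenate([np.flatnonzero(y == minority), maj])
    return X[keep], y[keep]
```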
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Evaluating Fairness of Machine Learning Models Under Uncertain and Incomplete Information [25.739240011015923]
We show that the test accuracy of the attribute classifier is not always correlated with its effectiveness in bias estimation for a downstream model.
Our analysis has surprising and counter-intuitive implications: in certain regimes, one might want to distribute the error of the attribute classifier as unevenly as possible.
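A small simulation makes the first point concrete: even a symmetric attribute classifier with 90% accuracy systematically attenuates the measured disparity (all quantities below are illustrative, not from the paper).

```python
import numpy as np

def dp_gap(preds, attr):
    # Demographic-parity gap of binary predictions w.r.t. a binary attribute.
    return abs(preds[attr == 1].mean() - preds[attr == 0].mean())

rng = np.random.default_rng(0)
a_true = rng.integers(0, 2, 10_000)                  # true sensitive attribute
preds = (rng.random(10_000) < 0.4 + 0.2 * a_true).astype(int)  # a biased model
flip = rng.random(10_000) < 0.10                     # 90%-accurate attribute classifier
a_hat = np.where(flip, 1 - a_true, a_true)

print(f"true gap: {dp_gap(preds, a_true):.3f}")      # ~0.20
print(f"estimated gap: {dp_gap(preds, a_hat):.3f}")  # attenuated toward zero
```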
arXiv Detail & Related papers (2021-02-16T19:02:55Z)
- Robust Fairness under Covariate Shift [11.151913007808927]
Making predictions that are fair with regard to protected group membership has become an important requirement for classification algorithms.
We propose an approach that obtains the predictor that is robust to the worst-case in terms of target performance.
arXiv Detail & Related papers (2020-10-11T04:42:01Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.