Certifying the Fairness of KNN in the Presence of Dataset Bias
- URL: http://arxiv.org/abs/2307.08722v1
- Date: Mon, 17 Jul 2023 07:09:55 GMT
- Title: Certifying the Fairness of KNN in the Presence of Dataset Bias
- Authors: Yannan Li, Jingbo Wang, and Chao Wang
- Abstract summary: We propose a method for certifying the fairness of the classification result of a widely used supervised learning algorithm, the k-nearest neighbors (KNN).
This is the first certification method for KNN based on three variants of the fairness definition: individual fairness, $\epsilon$-fairness, and label-flipping fairness.
We show the effectiveness of this abstract-interpretation-based technique through experimental evaluation on six datasets widely used in the fairness research literature.
- Score: 8.028344363418865
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a method for certifying the fairness of the classification result
of a widely used supervised learning algorithm, the k-nearest neighbors (KNN),
under the assumption that the training data may have historical bias caused by
systematic mislabeling of samples from a protected minority group. To the best
of our knowledge, this is the first certification method for KNN based on three
variants of the fairness definition: individual fairness, $\epsilon$-fairness,
and label-flipping fairness. We first define the fairness certification problem
for KNN and then propose sound approximations of the complex arithmetic
computations used in the state-of-the-art KNN algorithm. This is meant to lift
the computation results from the concrete domain to an abstract domain, to
reduce the computational cost. We show the effectiveness of this
abstract-interpretation-based technique through experimental evaluation on six
datasets widely used in the fairness research literature. We also show that the method
is accurate enough to obtain fairness certifications for a large number of test
inputs, despite the presence of historical bias in the datasets.
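To make the label-flipping variant concrete, below is a minimal sketch of what such a certificate amounts to for a plain KNN classifier under simplifying assumptions: a prediction is certified if no flipping of at most n labels of protected-group training samples can change the majority vote among the k nearest neighbors. This is a hypothetical illustration (the function name, parameters, and data layout are invented here), assuming a fixed k, binary labels, and unweighted voting; the paper's actual method instead reasons soundly over an abstract domain about how the bias propagates through the full state-of-the-art KNN pipeline.

```python
# Hypothetical sketch, not the authors' implementation: certify label-flipping
# fairness for a plain KNN classifier with a fixed k, binary labels in {0, 1},
# and unweighted majority voting.
import numpy as np

def certify_label_flipping(X_train, y_train, protected, x_test, k=5, n_flips=3):
    """Return (prediction, certified) for a single test input.

    certified == True means that no flipping of at most `n_flips` labels of
    protected-group training samples can change the prediction for x_test.
    `protected` is a boolean mask over the training samples.
    """
    # 1. The k nearest neighbors: label flips never change distances,
    #    so the neighbor set itself is unaffected by the bias model.
    dists = np.linalg.norm(X_train - x_test, axis=1)
    nn = np.argsort(dists)[:k]
    votes = y_train[nn]

    # 2. Prediction on the (possibly biased) labels, ties broken toward class 1.
    pred = int(2 * votes.sum() >= k)

    # 3. Worst case: an adversary flips the labels of protected-group neighbors
    #    that currently vote for `pred`; each flip moves one vote across.
    flippable = int(np.sum(protected[nn] & (votes == pred)))
    worst_flips = min(n_flips, flippable)
    pred_votes = int(np.sum(votes == pred)) - worst_flips
    other_votes = (k - int(np.sum(votes == pred))) + worst_flips

    # Require a strict margin so the certificate stays sound no matter how
    # the underlying classifier breaks ties.
    return pred, pred_votes > other_votes
```

For example, with k=5 and n_flips=1, a test input whose neighbors vote 4-1 (with at least one protected-group neighbor in the majority) would be certified, while a 3-2 vote would not.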
Related papers
- FairQuant: Certifying and Quantifying Fairness of Deep Neural Networks [6.22084835644296]
We propose a method for formally certifying and quantifying the individual fairness of deep neural networks (DNNs).
Individual fairness guarantees that any two individuals who are identical except for a legally protected attribute (e.g., gender or race) receive the same treatment.
We have implemented our method and evaluated it on four popular fairness research datasets.
arXiv Detail & Related papers (2024-09-05T03:36:05Z)
- Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods.
We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results.
We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z)
- Fairness Without Harm: An Influence-Guided Active Sampling Approach [32.173195437797766]
We aim to train models that mitigate group fairness disparity without causing harm to model accuracy.
The current data acquisition methods, such as fair active learning approaches, typically require annotating sensitive attributes.
We propose a tractable active data sampling algorithm that does not rely on training group annotations.
arXiv Detail & Related papers (2024-02-20T07:57:38Z)
- Provable Fairness for Neural Network Models using Formal Verification [10.90121002896312]
We propose techniques to prove fairness using recently developed formal methods that verify properties of neural network models.
We show that through proper training, we can reduce unfairness by an average of 65.4% at a cost of less than 1% in AUC score.
arXiv Detail & Related papers (2022-12-16T16:54:37Z)
- Practical Approaches for Fair Learning with Multitype and Multivariate Sensitive Attributes [70.6326967720747]
It is important to guarantee that machine learning algorithms deployed in the real world do not result in unfairness or unintended social consequences.
We introduce FairCOCCO, a fairness measure built on cross-covariance operators on reproducing kernel Hilbert Spaces.
We empirically demonstrate consistent improvements against state-of-the-art techniques in balancing predictive power and fairness on real-world datasets.
arXiv Detail & Related papers (2022-11-11T11:28:46Z)
- FARE: Provably Fair Representation Learning with Practical Certificates [9.242965489146398]
We introduce FARE, the first FRL method with practical fairness certificates.
FARE is based on our key insight that restricting the representation space of the encoder enables the derivation of practical guarantees.
We show that FARE produces practical certificates that are tight and often even comparable with purely empirical results.
arXiv Detail & Related papers (2022-10-13T17:40:07Z)
- MaxMatch: Semi-Supervised Learning with Worst-Case Consistency [149.03760479533855]
We propose a worst-case consistency regularization technique for semi-supervised learning (SSL).
We present a generalization bound for SSL consisting of the empirical loss terms observed on labeled and unlabeled training data separately.
Motivated by this bound, we derive an SSL objective that minimizes the largest inconsistency between an original unlabeled sample and its multiple augmented variants.
arXiv Detail & Related papers (2022-09-26T12:04:49Z)
- Fairness-Aware Naive Bayes Classifier for Data with Multiple Sensitive Features [0.0]
We generalise two-naive-Bayes (2NB) into N-naive-Bayes (NNB) to eliminate the simplification of assuming only two sensitive groups in the data.
We investigate its application on data with multiple sensitive features and propose a new constraint and post-processing routine to enforce differential fairness.
arXiv Detail & Related papers (2022-02-23T13:32:21Z)
- Fairness via Representation Neutralization [60.90373932844308]
We propose a new mitigation technique, namely Representation Neutralization for Fairness (RNF).
RNF achieves fairness by debiasing only the task-specific classification head of DNN models.
Experimental results on several benchmark datasets demonstrate that our RNF framework effectively reduces the discrimination of DNN models.
arXiv Detail & Related papers (2021-06-23T22:26:29Z)
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning (AL) are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
- Fair Densities via Boosting the Sufficient Statistics of Exponential Families [72.34223801798422]
We introduce a boosting algorithm to pre-process data for fairness.
Our approach shifts towards better data fitting while still ensuring a minimal fairness guarantee.
Empirical results are presented to demonstrate the quality of the results on real-world data.
arXiv Detail & Related papers (2020-12-01T00:49:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.