Fair mapping
- URL: http://arxiv.org/abs/2209.00617v1
- Date: Thu, 1 Sep 2022 17:31:27 GMT
- Title: Fair mapping
- Authors: Sébastien Gambs and Rosin Claude Ngueveu
- Abstract summary: We propose a novel pre-processing method based on the transformation of the distribution of protected groups onto a chosen target one.
We leverage recent work on the Wasserstein GAN and AttGAN frameworks to achieve the optimal transport of data points.
Our proposed approach preserves the interpretability of the data and can be used without exactly defining the sensitive groups.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To mitigate the effects of undesired biases in models, several approaches
propose to pre-process the input dataset to reduce the risks of discrimination
by preventing the inference of sensitive attributes. Unfortunately, most of
these pre-processing methods generate a new distribution that is very
different from the original one, often resulting in unrealistic data.
As a side effect, this new data distribution implies that existing models need
to be re-trained to be able to make accurate predictions. To address this
issue, we propose a novel pre-processing method, which we coin fair mapping,
based on the transformation of the distribution of protected groups onto a
chosen target one, with additional privacy constraints whose objective is to
prevent the inference of sensitive attributes. More precisely, we leverage
recent work on the Wasserstein GAN and AttGAN frameworks to achieve the
optimal transport of data points, coupled with a discriminator enforcing
protection against attribute inference. Our proposed approach preserves the
interpretability of the data and can be used without exactly defining the
sensitive groups. In addition, our approach can be specialized to model
existing state-of-the-art approaches, thus providing a unifying view of these
methods.
Finally, several experiments on real and synthetic datasets demonstrate that
our approach is able to hide the sensitive attributes while limiting the
distortion of the data and improving fairness in subsequent data analysis
tasks.
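The mechanics described above can be illustrated with a short sketch. The following is our toy reading of the abstract, not the authors' code: a mapper T pushes protected-group records toward a chosen target distribution under a WGAN-style critic, a second discriminator tries to infer group membership and T learns to fool it, and a reconstruction penalty limits distortion (in the spirit of AttGAN). All architectures, coefficients, and data below are illustrative assumptions.
```python
# Minimal fair-mapping sketch: NOT the paper's implementation.
import torch
import torch.nn as nn

d = 8                                                                  # toy feature dimension
T   = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, d))    # mapper
C   = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))    # WGAN critic
D_s = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))    # sensitive-attribute discriminator

opt_T = torch.optim.RMSprop(T.parameters(), lr=5e-5)
opt_C = torch.optim.RMSprop(list(C.parameters()) + list(D_s.parameters()), lr=5e-5)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    x_prot   = torch.randn(64, d) + 1.0    # toy protected-group batch
    x_target = torch.randn(64, d)          # toy target-distribution batch

    # Critic / discriminator step: estimate the Wasserstein distance and
    # train D_s to tell mapped protected records from target records.
    x_fake = T(x_prot).detach()
    loss_C = C(x_fake).mean() - C(x_target).mean()
    loss_D = bce(D_s(x_target), torch.zeros(64, 1)) + \
             bce(D_s(x_fake), torch.ones(64, 1))
    opt_C.zero_grad(); (loss_C + loss_D).backward(); opt_C.step()
    for p in C.parameters():               # WGAN weight clipping
        p.data.clamp_(-0.01, 0.01)

    # Mapper step: move toward the target, hide the attribute, stay close
    # to the original record to limit distortion.
    x_fake = T(x_prot)
    loss_T = (-C(x_fake).mean()
              + bce(D_s(x_fake), torch.zeros(64, 1))
              + 0.1 * (x_fake - x_prot).pow(2).mean())
    opt_T.zero_grad(); loss_T.backward(); opt_T.step()
```
Because T(x) stays close to x and keeps the original feature space, existing models can in principle be applied to the mapped data without re-training, which is the property the abstract emphasizes.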
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvement in forgetting error compared to the state of the art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
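The entry above only hints at the mechanism. One plausible reading of "pseudo-probability" unlearning, sketched here as a guess rather than the paper's actual procedure, is to fine-tune the model so that its outputs on the forget set match uninformative pseudo-probabilities while retained data keeps its labels:
```python
# Hypothetical pseudo-probability unlearning sketch (our reading, not PPU itself).
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

x_forget = torch.randn(32, 20)                 # data to be forgotten
x_retain = torch.randn(128, 20)                # data to keep
y_retain = torch.randint(0, 10, (128,))

pseudo = torch.full((32, 10), 1.0 / 10)        # uniform pseudo-probabilities

for _ in range(100):
    log_p_forget = F.log_softmax(model(x_forget), dim=1)
    loss_forget = F.kl_div(log_p_forget, pseudo, reduction="batchmean")
    loss_retain = F.cross_entropy(model(x_retain), y_retain)
    loss = loss_forget + loss_retain           # forget, but preserve utility
    opt.zero_grad(); loss.backward(); opt.step()
```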
- Align, Minimize and Diversify: A Source-Free Unsupervised Domain Adaptation Method for Handwritten Text Recognition [11.080302144256164]
The Align, Minimize and Diversify (AMD) method is a Source-Free Unsupervised Domain Adaptation approach for Handwritten Text Recognition (HTR).
Our method explicitly eliminates the need to revisit the source data during adaptation by incorporating three distinct regularization terms.
Experimental results on several benchmarks demonstrate the effectiveness and robustness of AMD, showing it to be competitive with, and often better than, DA methods in HTR.
arXiv Detail & Related papers (2024-04-28T17:50:58Z)
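The summary names three regularization terms but not their form. Guessing at the roles of Align, Minimize and Diversify from the names alone (the paper's definitions may well differ), a source-free objective could combine per-sample entropy minimization, batch-level prediction diversity, and alignment with the frozen source model's view:
```python
# Speculative three-term source-free adaptation loss (term names guessed).
import copy
import torch
import torch.nn.functional as F

model = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.ReLU(),
                            torch.nn.Linear(64, 5))
source_model = copy.deepcopy(model).eval()      # frozen source hypothesis
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

x_target = torch.randn(256, 16)                 # unlabeled target batch

for _ in range(50):
    p = F.softmax(model(x_target), dim=1)
    with torch.no_grad():
        p_src = F.softmax(source_model(x_target), dim=1)

    minimize = -(p * p.clamp_min(1e-8).log()).sum(1).mean()    # confident predictions
    p_mean = p.mean(0)
    diversify = (p_mean * p_mean.clamp_min(1e-8).log()).sum()  # avoid class collapse
    align = F.kl_div(p.clamp_min(1e-8).log(), p_src,
                     reduction="batchmean")                    # stay near source view

    loss = minimize + diversify + align
    opt.zero_grad(); loss.backward(); opt.step()
```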
- Fairness Without Harm: An Influence-Guided Active Sampling Approach [32.173195437797766]
We aim to train models that mitigate group fairness disparity without causing harm to model accuracy.
Current data acquisition methods, such as fair active learning approaches, typically require annotating sensitive attributes.
We propose a tractable active data sampling algorithm that does not rely on training group annotations.
arXiv Detail & Related papers (2024-02-20T07:57:38Z)
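The entry does not spell out the sampling rule. As a hand-wavy sketch, assuming (our assumption, not the paper's) that a candidate's influence is approximated by the alignment of its loss gradient with the gradient of a validation objective, selection might look like:
```python
# Illustrative influence-guided sampling: gradient alignment as a proxy score.
import torch
import torch.nn.functional as F

model = torch.nn.Linear(10, 2)
x_val, y_val = torch.randn(64, 10), torch.randint(0, 2, (64,))
x_pool = torch.randn(500, 10)                  # unlabeled candidate pool

# Gradient of the validation objective w.r.t. the parameters.
val_loss = F.cross_entropy(model(x_val), y_val)
g_val = torch.autograd.grad(val_loss, list(model.parameters()))
g_val = torch.cat([g.flatten() for g in g_val])

pseudo = model(x_pool).argmax(1)               # pseudo-labels for the pool
scores = []
for i in range(len(x_pool)):
    loss_i = F.cross_entropy(model(x_pool[i:i+1]), pseudo[i:i+1])
    g_i = torch.autograd.grad(loss_i, list(model.parameters()))
    g_i = torch.cat([g.flatten() for g in g_i])
    # Larger dot product: one gradient step on this point is expected to
    # reduce the validation loss more.
    scores.append(torch.dot(g_val, g_i).item())

selected = torch.tensor(scores).topk(32).indices    # query labels for these
```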
- Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data.
This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis.
To achieve optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z)
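As a toy stand-in for the consolidation step, using agreement of multiple dropout-perturbed hypotheses in place of the paper's rationale analysis (an assumption on our part): samples whose hypotheses agree become pseudo-labeled data, the rest feed the semi-supervised stage.
```python
# Toy hypothesis-consolidation sketch using dropout-based hypotheses.
import torch

model = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.ReLU(),
                            torch.nn.Dropout(0.5), torch.nn.Linear(64, 5))
x_target = torch.randn(1000, 16)

model.train()                                   # keep dropout active
with torch.no_grad():
    hypotheses = torch.stack([model(x_target).argmax(1) for _ in range(8)])

# Consolidate: a sample is trusted iff all 8 hypotheses agree on its label.
consistent = (hypotheses == hypotheses[0]).all(0)
pseudo_labels = hypotheses[0][consistent]
x_labeled   = x_target[consistent]              # feeds the supervised loss
x_unlabeled = x_target[~consistent]             # feeds the unsupervised loss
```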
- Unsupervised Accuracy Estimation of Deep Visual Models using Domain-Adaptive Adversarial Perturbation without Source Samples [1.1852406625172216]
We propose a new framework to estimate model accuracy on unlabeled target data without access to source data.
Our approach measures the disagreement rate between the source hypothesis and the target pseudo-labeling function.
Our proposed source-free framework effectively addresses challenging distribution-shift scenarios and outperforms existing methods that require source data and labels for training.
arXiv Detail & Related papers (2023-07-19T15:33:11Z)
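The disagreement-rate idea fits in a few lines. In the sketch below the pseudo-labeling function is a nearest-centroid rule seeded by the source predictions; this is purely illustrative, since the paper builds its pseudo-labeler via domain-adaptive adversarial perturbation:
```python
# Accuracy estimation via disagreement with a (toy) pseudo-labeling function.
import numpy as np

def estimate_accuracy(source_predict, x_target, n_classes):
    preds = source_predict(x_target)                 # source hypothesis
    # Hypothetical pseudo-labeler: nearest centroid, seeded by the source
    # predictions themselves (assumes each class appears in the batch).
    centroids = np.stack([x_target[preds == c].mean(0) for c in range(n_classes)])
    pseudo = np.argmin(((x_target[:, None] - centroids) ** 2).sum(-1), axis=1)
    disagreement = (preds != pseudo).mean()
    return 1.0 - disagreement                        # accuracy proxy

x = np.random.randn(200, 8)
est = estimate_accuracy(lambda z: (z[:, 0] > 0).astype(int), x, n_classes=2)
print(est)
```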
- MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
- Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data [1.76179873429447]
We propose a data preprocessing technique that detects instances carrying a specific kind of bias, which should be removed from the dataset before training.
In particular, we claim that in problem settings where instances exist with similar features but different labels caused by variation in protected attributes, an inherent bias is induced in the dataset.
arXiv Detail & Related papers (2022-10-24T13:04:07Z)
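The criterion in this entry translates almost directly into a brute-force check; the distance metric and threshold below are illustrative choices, not the paper's:
```python
# Flag near-identical feature pairs whose labels flip with the protected attribute.
import numpy as np

def flag_biased_pairs(X, y, s, eps=0.1):
    """X: non-protected features, y: labels, s: protected attribute."""
    flagged = set()
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            close = np.linalg.norm(X[i] - X[j]) < eps
            if close and y[i] != y[j] and s[i] != s[j]:
                flagged.update((i, j))   # similar features, label varies with s
    return sorted(flagged)

X = np.random.rand(100, 5)
y = np.random.randint(0, 2, 100)
s = np.random.randint(0, 2, 100)
drop = flag_biased_pairs(X, y, s)
X_clean = np.delete(X, drop, axis=0)     # remove flagged records before training
```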
- Uncertainty-guided Source-free Domain Adaptation [77.3844160723014]
Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabelled target data set using only a pre-trained source model.
We propose quantifying the uncertainty in the source model predictions and utilizing it to guide the target adaptation.
arXiv Detail & Related papers (2022-08-16T08:03:30Z)
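One generic way to realize this idea, shown here as a sketch rather than the paper's estimator, is to weight pseudo-label self-training by an MC-dropout uncertainty estimate from the source model:
```python
# Uncertainty-weighted self-training sketch (generic, not the paper's method).
import torch
import torch.nn.functional as F

model = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.ReLU(),
                            torch.nn.Dropout(0.5), torch.nn.Linear(64, 5))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
x_target = torch.randn(256, 16)

model.train()                                   # dropout on for MC sampling
with torch.no_grad():
    mc = torch.stack([F.softmax(model(x_target), 1) for _ in range(10)])
p_mean = mc.mean(0)
pseudo = p_mean.argmax(1)
entropy = -(p_mean * p_mean.clamp_min(1e-8).log()).sum(1)
weights = torch.exp(-entropy)                   # confident samples weigh more

for _ in range(50):
    loss = (weights * F.cross_entropy(model(x_target), pseudo,
                                      reduction="none")).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```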
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold (sketched below).
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
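ATC as described is simple enough to sketch directly with toy data; the function and variable names here are our own:
```python
# ATC sketch: calibrate a confidence threshold on labeled source data, then
# predict target accuracy as the fraction of target confidences above it.
import numpy as np

def atc_estimate(conf_src, correct_src, conf_tgt):
    """conf_*: max-softmax confidences; correct_src: 0/1 source correctness."""
    src_acc = correct_src.mean()
    # Choose t so that the fraction of source points above t equals src_acc.
    t = np.quantile(conf_src, 1.0 - src_acc)
    return (conf_tgt > t).mean()                 # estimated target accuracy

conf_src = np.random.beta(5, 2, 1000)            # toy source confidences
correct_src = (np.random.rand(1000) < conf_src).astype(float)
conf_tgt = np.random.beta(4, 3, 1000)            # toy target confidences
print(atc_estimate(conf_src, correct_src, conf_tgt))
```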
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.