Fairness-aware Adversarial Perturbation Towards Bias Mitigation for
Deployed Deep Models
- URL: http://arxiv.org/abs/2203.01584v1
- Date: Thu, 3 Mar 2022 09:26:00 GMT
- Title: Fairness-aware Adversarial Perturbation Towards Bias Mitigation for
Deployed Deep Models
- Authors: Zhibo Wang, Xiaowei Dong, Henry Xue, Zhifei Zhang, Weifeng Chiu, Tao
Wei, Kui Ren
- Abstract summary: Prioritizing fairness is of central importance in artificial intelligence (AI) systems.
We propose a more flexible approach, i.e., fairness-aware adversarial perturbation (FAAP).
FAAP learns to perturb input data to blind deployed models to fairness-related features.
- Score: 32.39167033858135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prioritizing fairness is of central importance in artificial intelligence
(AI) systems, especially in societal applications: for example, hiring systems
should recommend applicants equally across demographic groups, and risk
assessment systems must eliminate racial bias in criminal justice. Existing efforts
towards the ethical development of AI systems have leveraged data science to
mitigate biases in the training set or introduced fairness principles into the
training process. For a deployed AI system, however, retraining or tuning may
not be feasible in practice. By contrast, we propose a more flexible
approach, i.e., fairness-aware adversarial perturbation (FAAP), which learns to
perturb input data to blind deployed models to fairness-related features, e.g.,
gender and ethnicity. The key advantage is that FAAP does not modify deployed
models in terms of parameters and structures. To achieve this, we design a
discriminator to distinguish fairness-related attributes based on latent
representations from deployed models. Meanwhile, a perturbation generator is
trained against the discriminator, such that no fairness-related features could
be extracted from perturbed inputs. Extensive experimental evaluation
demonstrates the effectiveness and superior performance of the proposed FAAP.
In addition, FAAP is validated on real-world commercial deployments whose
model parameters are inaccessible, which demonstrates its transferability and
suggests the potential for black-box adaptation.
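
To make the adversarial game concrete, the following is a minimal PyTorch-style sketch of the training step the abstract describes: a perturbation generator produces a bounded perturbation so that a fairness discriminator, which reads the frozen deployed model's latent representation, can no longer recover the protected attribute, while the deployed model's predictions on the perturbed input stay close to its predictions on the clean input. The module interfaces (`deployed.features`, `deployed.classifier`, `generator`, `discriminator`), the L_inf bound `epsilon`, the KL consistency term, and the weight `lam` are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def faap_step(deployed, generator, discriminator, g_opt, d_opt,
              x, protected, epsilon=8 / 255, lam=1.0):
    """One FAAP-style update (sketch). `deployed` is assumed frozen
    (requires_grad_(False)) and to expose `features(x) -> z` and
    `classifier(z) -> logits`; `generator(x)` outputs a tensor shaped like x."""
    with torch.no_grad():
        clean_logits = deployed.classifier(deployed.features(x))

    # Bounded perturbation in an L_inf ball of radius epsilon; images assumed in [0, 1].
    delta = epsilon * torch.tanh(generator(x))
    x_adv = torch.clamp(x + delta, 0.0, 1.0)

    # 1) Discriminator: recover the protected attribute from latent representations.
    with torch.no_grad():
        z_adv = deployed.features(x_adv)
    d_loss = F.cross_entropy(discriminator(z_adv), protected)
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # 2) Generator: fool the discriminator while preserving the deployed
    #    model's original predictions on the perturbed input.
    z_adv = deployed.features(x_adv)
    fool_loss = -F.cross_entropy(discriminator(z_adv), protected)
    consistency = F.kl_div(
        F.log_softmax(deployed.classifier(z_adv), dim=1),
        F.softmax(clean_logits, dim=1),
        reduction="batchmean",
    )
    g_loss = fool_loss + lam * consistency
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```

At inference time only the generator would sit in front of the unchanged deployed model, which matches the abstract's claim that FAAP does not modify the model's parameters or structure.
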
Related papers
- Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset, the Fair Forgery Detection (FairFD) dataset, with which we demonstrate the racial bias of public state-of-the-art (SOTA) methods.
We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results.
We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z) - Fairness Feedback Loops: Training on Synthetic Data Amplifies Bias [47.79659355705916]
Model-induced distribution shifts (MIDS) occur as previous model outputs pollute new model training sets over generations of models.
We introduce a framework that allows us to track multiple MIDS over many generations, finding that they can lead to loss in performance, fairness, and minoritized group representation.
Despite these negative consequences, we identify how models might be used for positive, intentional, interventions in their data ecosystems.
arXiv Detail & Related papers (2024-03-12T17:48:08Z) - Fairness Explainability using Optimal Transport with Applications in
Image Classification [0.46040036610482665]
We propose a comprehensive approach to uncover the causes of discrimination in Machine Learning applications.
We leverage Wasserstein barycenters to achieve fair predictions and introduce an extension to pinpoint bias-associated regions.
This allows us to derive a cohesive system which uses the enforced fairness to measure each feature's influence on the bias.
arXiv Detail & Related papers (2023-08-22T00:10:23Z) - Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against subgroups described by protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling
Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z) - FairIF: Boosting Fairness in Deep Learning via Influence Functions with
Validation Set Sensitive Attributes [51.02407217197623]
We propose a two-stage training algorithm named FAIRIF.
It minimizes the loss over a reweighted data set, where the sample weights are computed via influence functions using a validation set with sensitive attributes.
We show that FAIRIF yields models with better fairness-utility trade-offs against various types of bias.
arXiv Detail & Related papers (2022-01-15T05:14:48Z) - Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in deep learning-based medical image analysis systems.
Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model (a generic sketch of this adversarial debiasing pattern appears after the related papers).
We evaluate our framework on a large-scale publicly available skin lesion dataset.
arXiv Detail & Related papers (2021-03-07T03:10:32Z) - FAIR: Fair Adversarial Instance Re-weighting [0.7829352305480285]
We propose a Fair Adversarial Instance Re-weighting (FAIR) method, which uses adversarial training to learn an instance weighting function that ensures fair predictions.
To the best of our knowledge, this is the first model that merges reweighting and adversarial approaches by means of a weighting function that can provide interpretable information about fairness of individual instances.
arXiv Detail & Related papers (2020-11-15T10:48:56Z) - FairALM: Augmented Lagrangian Method for Training Fair Models with
Little Regret [42.66567001275493]
It is now accepted that, because of biases in the datasets presented to models, fairness-oblivious training will lead to unfair models.
Here, we study mechanisms that impose fairness concurrently while training the model.
arXiv Detail & Related papers (2020-04-03T03:18:53Z) - Fairness by Explicability and Adversarial SHAP Learning [0.0]
We propose a new definition of fairness that emphasises the role of an external auditor and model explicability.
We develop a framework for mitigating model bias using regularizations constructed from the SHAP values of an adversarial surrogate model.
We demonstrate our approaches using gradient and adaptive boosting on a synthetic dataset, the UCI Adult (Census) dataset, and a real-world credit scoring dataset.
arXiv Detail & Related papers (2020-03-11T14:36:34Z) - Fairness-Aware Learning with Prejudice Free Representations [2.398608007786179]
We propose a novel algorithm that can effectively identify and treat latent discriminating features.
The approach helps to collect discrimination-free features that would improve the model performance.
arXiv Detail & Related papers (2020-02-26T10:06:31Z)
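
Several of the related papers above (e.g., the adversarial multi-task strategy and FAIR), like FAAP itself, rely on the same core pattern: an auxiliary adversary tries to recover the protected attribute, and the main network is trained so that the adversary fails. Below is a minimal, generic sketch of that pattern using a gradient-reversal layer; the class and function names, the single shared optimizer, and the weight `lam` are assumptions for illustration, not the implementation of any specific paper listed here.

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses and scales gradients on backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def debias_step(encoder, task_head, adv_head, optimizer, x, y, protected, lam=1.0):
    """One joint step: the task head fits the labels, the adversary head tries to
    recover the protected attribute from the shared features, and the reversed
    gradient pushes the encoder to discard protected-attribute information."""
    z = encoder(x)
    task_loss = F.cross_entropy(task_head(z), y)
    adv_loss = F.cross_entropy(adv_head(GradReverse.apply(z, lam)), protected)
    loss = task_loss + adv_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return task_loss.item(), adv_loss.item()
```

The key difference from FAAP is where the adversarial pressure is applied: here the model's own feature extractor is retrained, whereas FAAP leaves the deployed model untouched and moves the adversarial game into the input space.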