Revealing Unfair Models by Mining Interpretable Evidence
- URL: http://arxiv.org/abs/2207.05811v1
- Date: Tue, 12 Jul 2022 20:03:08 GMT
- Title: Revealing Unfair Models by Mining Interpretable Evidence
- Authors: Mohit Bajaj, Lingyang Chu, Vittorio Romaniello, Gursimran Singh, Jian Pei, Zirui Zhou, Lanjun Wang, Yong Zhang
- Abstract summary: The popularity of machine learning has increased the risk of unfair models getting deployed in high-stakes applications.
In this paper, we tackle the novel task of revealing unfair models by mining interpretable evidence.
Our method finds highly interpretable and solid evidence to effectively reveal the unfairness of trained models.
- Score: 50.48264727620845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The popularity of machine learning has increased the risk of unfair models
getting deployed in high-stakes applications, such as the justice system,
drug/vaccination design, and medical diagnosis. Although there are effective
methods to train fair models from scratch, how to automatically reveal and
explain the unfairness of a trained model remains a challenging task. Revealing
the unfairness of machine learning models in an interpretable fashion is a critical
step towards fair and trustworthy AI. In this paper, we systematically tackle
the novel task of revealing unfair models by mining interpretable evidence
(RUMIE). The key idea is to find solid evidence in the form of a group of data
instances discriminated most by the model. To make the evidence interpretable,
we also find a set of human-understandable key attributes and decision rules
that characterize the discriminated data instances and distinguish them from
the other non-discriminated data. As demonstrated by extensive experiments on
many real-world data sets, our method finds highly interpretable and solid
evidence to effectively reveal the unfairness of trained models. Moreover, it
is much more scalable than all of the baseline methods.
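The abstract describes a two-step recipe: mine a group of instances that the model discriminates against most, then characterize that group with a few human-understandable attributes and decision rules. The sketch below illustrates that general idea on toy data; it is only an assumed, simplified reading of the approach (the discrimination score, dataset, and hyperparameters are hypothetical stand-ins), not the paper's RUMIE algorithm.

```python
# Hypothetical sketch: mine an evidence group and interpretable rules
# for a trained binary classifier under audit. Not the RUMIE implementation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Toy data: column 0 plays the role of a binary protected attribute.
X, y = make_classification(n_samples=2000, n_features=6, random_state=0)
X[:, 0] = rng.integers(0, 2, size=len(X))            # protected attribute A
model = LogisticRegression().fit(X, y)               # the trained model under audit

# 1) Per-instance discrimination score: change in predicted probability
#    when the protected attribute is flipped (one simple possible score).
X_flip = X.copy()
X_flip[:, 0] = 1 - X_flip[:, 0]
score = np.abs(model.predict_proba(X)[:, 1] - model.predict_proba(X_flip)[:, 1])

# 2) Evidence group: the k instances discriminated most by the model.
k = 200
evidence = np.argsort(score)[-k:]
group = np.zeros(len(X), dtype=int)
group[evidence] = 1

# 3) Interpretable characterization: shallow decision rules (on the
#    non-protected attributes) that distinguish the evidence group
#    from the non-discriminated data.
rules = DecisionTreeClassifier(max_depth=2).fit(X[:, 1:], group)
print(export_text(rules, feature_names=[f"attr_{i}" for i in range(1, 6)]))
```

In the paper the evidence group and the rules are mined jointly with a more principled discrimination criterion; the sketch only shows how a group-plus-rules output can be produced and inspected.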
Related papers
- Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against certain subgroups described by protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z)
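As a back-of-the-envelope illustration of the counterfactual-fairness criterion targeted above, the sketch below naively flips the protected attribute and counts how often predictions change. This is only an assumed toy check; CLAIRE itself works from observational data without a given causal model and accounts for downstream causal effects that simple flipping ignores.

```python
# Naive check: does the prediction for an individual change when the
# protected attribute is counterfactually flipped? (Ignores causal
# downstream effects, which counterfactual-fairness methods model properly.)
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 1000
a = rng.integers(0, 2, size=n)                      # protected attribute
x = rng.normal(size=(n, 3)) + a[:, None] * 0.5      # features influenced by a
y = (x.sum(axis=1) + 0.8 * a > 1.0).astype(int)     # biased labels

model = LogisticRegression().fit(np.column_stack([a, x]), y)

pred_factual = model.predict(np.column_stack([a, x]))
pred_counter = model.predict(np.column_stack([1 - a, x]))
flip_rate = np.mean(pred_factual != pred_counter)
print(f"predictions that change when A is flipped: {flip_rate:.1%}")
```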
- Non-Invasive Fairness in Learning through the Lens of Data Drift [88.37640805363317]
We show how to improve the fairness of Machine Learning models without altering the data or the learning algorithm.
We use a simple but key insight: the divergence of trends between different populations, and, consequently, between a learned model and minority populations, is analogous to data drift.
We explore two strategies (model-splitting and reweighing) to resolve this drift, aiming to improve the overall conformance of models to the underlying data.
arXiv Detail & Related papers (2023-03-30T17:30:42Z)
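The entry above names reweighing as one strategy for resolving the drift. A common generic reweighing recipe (in the spirit of Kamiran and Calders, sketched here as an assumed illustration rather than this paper's method) assigns each (protected group, label) cell a weight that makes group membership and label look statistically independent:

```python
# Generic reweighing sketch: weight each (protected group, label) cell so
# that group membership and label look statistically independent.
import numpy as np

def reweigh(a: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return one sample weight per instance: P(A=a)P(Y=y) / P(A=a, Y=y)."""
    w = np.ones(len(y))
    for av in np.unique(a):
        for yv in np.unique(y):
            mask = (a == av) & (y == yv)
            observed = mask.mean()                   # P(A=a, Y=y)
            if observed > 0:
                w[mask] = (a == av).mean() * (y == yv).mean() / observed
    return w

# Usage: pass the weights to any estimator that accepts sample_weight,
# e.g. LogisticRegression().fit(X, y, sample_weight=reweigh(a, y)).
```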
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data, given access only to a set of expert models and their predictions, alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
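One plausible reading of instance-wise unsupervised ensembling is to weight each expert model, per test point, by how close that point is to the data the expert was trained on. The sketch below is a hypothetical nearest-training-region heuristic with made-up names, not the Synthetic Model Combination method itself:

```python
# Hypothetical instance-wise ensemble: weight each expert's prediction by
# the similarity of the test point to that expert's training-data centroid.
import numpy as np

def instancewise_combine(x, expert_preds, expert_centroids, temperature=1.0):
    """x: (d,) test point; expert_preds: (m,) predictions for x;
    expert_centroids: (m, d) summaries of each expert's training data."""
    dists = np.linalg.norm(expert_centroids - x, axis=1)
    weights = np.exp(-dists / temperature)
    weights /= weights.sum()
    return float(np.dot(weights, expert_preds))

# Example: two experts, the test point sits near the first one's data.
x = np.array([0.1, 0.0])
print(instancewise_combine(x,
                           expert_preds=np.array([0.9, 0.2]),
                           expert_centroids=np.array([[0.0, 0.0], [5.0, 5.0]])))
```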
- fairmodels: A Flexible Tool For Bias Detection, Visualization, And Mitigation [3.548416925804316]
This article introduces an R package fairmodels that helps to validate fairness and eliminate bias in classification models.
The implemented set of functions and fairness metrics enables model fairness validation from different perspectives.
The package includes a series of methods for bias mitigation that aim to diminish the discrimination in the model.
arXiv Detail & Related papers (2021-04-01T15:06:13Z)
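fairmodels is an R package, so rather than guessing its API, here is a language-neutral Python sketch of two standard group-fairness metrics that such tools typically report; the function names are assumptions for illustration only:

```python
# Two common group-fairness metrics, sketched in Python for illustration.
import numpy as np

def statistical_parity_difference(y_pred, a):
    """P(Y_hat=1 | A=1) - P(Y_hat=1 | A=0)."""
    return y_pred[a == 1].mean() - y_pred[a == 0].mean()

def equal_opportunity_difference(y_true, y_pred, a):
    """True-positive-rate difference between the two protected groups."""
    tpr = lambda g: y_pred[(a == g) & (y_true == 1)].mean()
    return tpr(1) - tpr(0)

y_true = np.array([1, 1, 0, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 1, 1, 0, 0])
a      = np.array([1, 1, 1, 1, 0, 0, 0, 0])
print(statistical_parity_difference(y_pred, a))      # demographic parity gap
print(equal_opportunity_difference(y_true, y_pred, a))
```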
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
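To make the mechanism above concrete, the toy sketch below optimizes a latent perturbation until a classifier's prediction flips, with a penalty that keeps two candidate perturbations different from each other. The linear "decoder" and classifier are stand-ins chosen for the sketch, not the paper's architecture or loss:

```python
# Toy latent counterfactual search with a diversity penalty (all modules are stand-ins).
import torch

torch.manual_seed(0)
d_latent, d_input = 4, 8
decoder = torch.nn.Linear(d_latent, d_input)             # stand-in generator/decoder
classifier = torch.nn.Linear(d_input, 1)                 # stand-in binary classifier

x = torch.randn(d_input)                                 # the input to explain
target = 1.0 - (torch.sigmoid(classifier(x)) > 0.5).float()   # opposite class

# Two candidate latent perturbations, initialized slightly apart.
deltas = (0.01 * torch.randn(2, d_latent)).requires_grad_()
opt = torch.optim.Adam([deltas], lr=0.05)

for step in range(300):
    opt.zero_grad()
    x_cf = x + decoder(deltas)                           # counterfactual inputs, shape (2, d_input)
    logits = classifier(x_cf).squeeze(-1)
    flip_loss = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, target.expand_as(logits))                # push predictions to the opposite class
    diversity = torch.relu(1.0 - (deltas[0] - deltas[1]).pow(2).sum())  # penalize near-duplicates
    sparsity = deltas.abs().mean()                       # keep perturbations small
    (flip_loss + 0.5 * diversity + 0.1 * sparsity).backward()
    opt.step()

print("counterfactual class probabilities:",
      torch.sigmoid(classifier(x + decoder(deltas))).detach().squeeze(-1))
```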
- Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination [53.3082498402884]
A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair.
We present a framework of fair semi-supervised learning in the pre-processing phase, including pseudo labeling to predict labels for unlabeled data.
A theoretical decomposition analysis of bias, variance and noise highlights the different sources of discrimination and the impact they have on fairness in semi-supervised learning.
arXiv Detail & Related papers (2020-09-25T05:48:56Z)
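Stripped of the fairness machinery, the pseudo-labeling step the entry above builds on looks roughly like the generic sketch below (an assumed illustration, not the paper's framework); a fairness-aware variant would additionally audit group metrics on the pseudo-labeled pool before retraining:

```python
# Generic pseudo-labeling sketch: label the unlabeled pool with a model
# trained on the labeled data, then retrain on the combined set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, random_state=0)
X_lab, y_lab, X_unlab = X[:100], y[:100], X[100:]

base = LogisticRegression().fit(X_lab, y_lab)
proba = base.predict_proba(X_unlab).max(axis=1)
pseudo = base.predict(X_unlab)

confident = proba > 0.9                                  # keep only confident pseudo-labels
X_all = np.vstack([X_lab, X_unlab[confident]])
y_all = np.concatenate([y_lab, pseudo[confident]])
final = LogisticRegression().fit(X_all, y_all)
print(f"labeled: {len(y_lab)}, pseudo-labeled kept: {confident.sum()}")
```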
- FairALM: Augmented Lagrangian Method for Training Fair Models with Little Regret [42.66567001275493]
It is now accepted that, because of biases in the datasets presented to models, fairness-oblivious training will lead to unfair models.
Here, we study mechanisms that impose fairness concurrently while training the model.
arXiv Detail & Related papers (2020-04-03T03:18:53Z)
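A compact sketch of the augmented Lagrangian idea named in the title above: penalize a fairness gap during training and raise a dual variable whenever the constraint is violated. This is a generic ALM recipe on a toy logistic model with assumed hyperparameters, not FairALM itself:

```python
# Generic augmented Lagrangian sketch for a demographic-parity constraint.
import torch

torch.manual_seed(0)
n, d = 1000, 5
a = torch.randint(0, 2, (n,)).float()                    # protected attribute
X = torch.randn(n, d) + a[:, None]
y = ((X.sum(dim=1) + a) > 2.5).float()                   # biased labels

w = torch.zeros(d, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lam, rho = 0.0, 5.0                                      # dual variable, penalty weight
opt = torch.optim.Adam([w, b], lr=0.05)

for outer in range(10):
    for _ in range(100):                                 # inner minimization of the Lagrangian
        opt.zero_grad()
        p = torch.sigmoid(X @ w + b)
        bce = torch.nn.functional.binary_cross_entropy(p, y)
        gap = (p[a == 1].mean() - p[a == 0].mean()).abs()  # demographic parity gap
        (bce + lam * gap + 0.5 * rho * gap**2).backward()
        opt.step()
    with torch.no_grad():
        p = torch.sigmoid(X @ w + b)
        gap = float((p[a == 1].mean() - p[a == 0].mean()).abs())
    lam += rho * gap                                     # dual ascent step
    print(f"outer {outer}: parity gap = {gap:.3f}, lambda = {lam:.2f}")
```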
- Fairness-Aware Learning with Prejudice Free Representations [2.398608007786179]
We propose a novel algorithm that can effectively identify and treat latent discriminating features.
The approach helps to collect discrimination-free features that would improve the model performance.
arXiv Detail & Related papers (2020-02-26T10:06:31Z)
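One simple way to "treat latent discriminating features", in the spirit of (but not identical to) the entry above, is to residualize each feature against the protected attribute so the resulting representation carries no linear information about it. The sketch below is an assumed illustration:

```python
# Residualize features against the protected attribute (linear version):
# regress each feature on A and keep only the residuals.
import numpy as np

def remove_linear_prejudice(X: np.ndarray, a: np.ndarray) -> np.ndarray:
    A = np.column_stack([np.ones(len(a)), a])            # design matrix [1, A]
    coef, *_ = np.linalg.lstsq(A, X, rcond=None)         # per-feature regression on A
    return X - A @ coef                                  # residuals: linearly unrelated to A

rng = np.random.default_rng(0)
a = rng.integers(0, 2, 500).astype(float)
X = rng.normal(size=(500, 4)) + 2.0 * a[:, None]         # features that leak A strongly
X_clean = remove_linear_prejudice(X, a)
print("max |corr(feature, A)| before:", np.max(np.abs(np.corrcoef(X.T, a)[-1, :-1])))
print("max |corr(feature, A)| after: ", np.max(np.abs(np.corrcoef(X_clean.T, a)[-1, :-1])))
```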
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.