Fairness and Robustness in Invariant Learning: A Case Study in Toxicity Classification
- URL: http://arxiv.org/abs/2011.06485v2
- Date: Wed, 2 Dec 2020 02:21:12 GMT
- Title: Fairness and Robustness in Invariant Learning: A Case Study in Toxicity Classification
- Authors: Robert Adragna, Elliot Creager, David Madras, Richard Zemel
- Abstract summary: Invariant Risk Minimization (IRM) is a domain generalization algorithm that employs a causal discovery inspired method to find robust predictors.
We show that IRM achieves better out-of-distribution accuracy and fairness than Empirical Risk Minimization (ERM) methods.
- Score: 13.456851070400024
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robustness is of central importance in machine learning and has given rise to
the fields of domain generalization and invariant learning, which are concerned
with improving performance on a test distribution distinct from but related to
the training distribution. In light of recent work suggesting an intimate
connection between fairness and robustness, we investigate whether algorithms
from robust ML can be used to improve the fairness of classifiers that are
trained on biased data and tested on unbiased data. We apply Invariant Risk
Minimization (IRM), a domain generalization algorithm that employs a causal
discovery inspired method to find robust predictors, to the task of fairly
predicting the toxicity of internet comments. We show that IRM achieves better
out-of-distribution accuracy and fairness than Empirical Risk Minimization
(ERM) methods, and analyze both the difficulties that arise when applying IRM
in practice and the conditions under which IRM will likely be effective in this
scenario. We hope that this work will inspire further studies of how robust
machine learning methods relate to algorithmic fairness.
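For readers unfamiliar with IRM, the sketch below illustrates the IRMv1 training objective the abstract refers to: the usual ERM risk summed over training environments plus a penalty measuring how far a shared "dummy" classifier scale is from being simultaneously optimal in every environment. This is a minimal illustration assuming a PyTorch binary classifier and pre-split environment batches; it is not the authors' released code.

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, labels):
    # IRMv1 penalty for one environment: squared gradient of the risk with
    # respect to a fixed dummy classifier scale w = 1.0 applied to the logits.
    scale = torch.tensor(1.0, requires_grad=True)
    risk = F.binary_cross_entropy_with_logits(logits * scale, labels)
    grad = torch.autograd.grad(risk, [scale], create_graph=True)[0]
    return grad.pow(2).sum()

def irm_objective(model, env_batches, penalty_weight=1e2):
    # Total objective: per-environment ERM risk plus the invariance penalty.
    # `env_batches` is a list of (features, labels) pairs, one per environment
    # (for the toxicity task, an environment might correspond to comments
    # mentioning a particular demographic group -- an illustrative assumption).
    erm_risk, penalty = 0.0, 0.0
    for x, y in env_batches:
        logits = model(x).squeeze(-1)
        erm_risk = erm_risk + F.binary_cross_entropy_with_logits(logits, y)
        penalty = penalty + irm_penalty(logits, y)
    return erm_risk + penalty_weight * penalty
```

Setting penalty_weight to zero recovers the ERM baseline the abstract compares against; in common IRM implementations the penalty weight is ramped up only after a few warm-up epochs, since a large penalty early in training can destabilize optimization.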
Related papers
- FAIRM: Learning invariant representations for algorithmic fairness and domain generalization with minimax optimality [15.71499916304475]
We propose a training environment-based oracle, FAIRM, which has desirable fairness and domain generalization properties under a diversity-type condition.
We develop efficient algorithms to realize FAIRM in linear models and demonstrate the nonasymptotic performance with minimax optimality.
arXiv Detail & Related papers (2024-04-02T03:06:25Z) - Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation [54.61816424792866]
We introduce a general framework on Risk-Sensitive Distributional Reinforcement Learning (RS-DisRL), with static Lipschitz Risk Measures (LRM) and general function approximation.
We design two innovative meta-algorithms: RS-DisRL-M, a model-based strategy for model-based function approximation, and RS-DisRL-V, a model-free approach for general value function approximation.
arXiv Detail & Related papers (2024-02-28T08:43:18Z) - Learning Optimal Features via Partial Invariance [18.552839725370383]
Invariant Risk Minimization (IRM) is a popular framework that aims to learn robust models from multiple environments.
We show that IRM can over-constrain the predictor; to remedy this, we propose a relaxation via partial invariance.
Several experiments, conducted both in linear settings as well as with deep neural networks on tasks over both language and image data, allow us to verify our conclusions.
arXiv Detail & Related papers (2023-01-28T02:48:14Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence from labeled source data and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold (a brief sketch follows the related-papers list below).
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Distributionally Robust Learning with Stable Adversarial Training [34.74504615726101]
Machine learning algorithms with empirical risk minimization are vulnerable under distributional shifts.
We propose a novel Stable Adversarial Learning (SAL) algorithm that leverages heterogeneous data sources to construct a more practical uncertainty set.
arXiv Detail & Related papers (2021-06-30T03:05:45Z) - Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z) - Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning (AL) are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z) - Meta-Learned Invariant Risk Minimization [12.6484257912092]
Empirical Risk Minimization (ERM) based machine learning algorithms have suffered from weak generalization performance on out-of-distribution (OOD) data.
In this paper, we propose a novel meta-learning based approach for IRM.
We show that our algorithm not only has better OOD generalization performance than IRMv1 and all IRM variants, but also addresses the weakness of IRMv1 with improved stability.
arXiv Detail & Related papers (2021-03-24T02:52:48Z) - Learning Calibrated Uncertainties for Domain Shift: A Distributionally Robust Learning Approach [150.8920602230832]
We propose a framework for learning calibrated uncertainties under domain shifts.
In particular, the density ratio estimation reflects the closeness of a target (test) sample to the source (training) distribution.
We show that our proposed method generates calibrated uncertainties that benefit downstream tasks.
arXiv Detail & Related papers (2020-10-08T02:10:54Z) - Stable Adversarial Learning under Distributional Shifts [46.98655899839784]
Machine learning algorithms with empirical risk minimization are vulnerable under distributional shifts.
We propose Stable Adversarial Learning (SAL) algorithm that leverages heterogeneous data sources to construct a more practical uncertainty set.
arXiv Detail & Related papers (2020-06-08T08:42:34Z)
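The ATC entry above outlines a simple confidence-thresholding recipe; the sketch below illustrates it under the assumption that scalar confidences (e.g., max softmax probability) are available for labeled source data and for unlabeled target data. Function and variable names are illustrative, not taken from the ATC paper's code.

```python
import numpy as np

def atc_estimate(source_conf, source_correct, target_conf):
    # Average Thresholded Confidence (ATC), as summarized above (illustrative sketch):
    #   source_conf    - confidences on labeled source (validation) examples
    #   source_correct - 0/1 correctness of the model on those source examples
    #   target_conf    - confidences on unlabeled target examples
    source_conf = np.asarray(source_conf, dtype=float)
    source_correct = np.asarray(source_correct, dtype=float)
    target_conf = np.asarray(target_conf, dtype=float)

    # Pick a threshold t on source data so that the fraction of source examples
    # with confidence above t matches the observed source accuracy.
    source_acc = source_correct.mean()
    t = np.quantile(source_conf, 1.0 - source_acc)

    # Predicted target accuracy: fraction of unlabeled target examples whose
    # confidence exceeds the learned threshold.
    return float((target_conf > t).mean())
```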