Fair Regression with Wasserstein Barycenters
- URL: http://arxiv.org/abs/2006.07286v2
- Date: Tue, 23 Jun 2020 13:22:01 GMT
- Title: Fair Regression with Wasserstein Barycenters
- Authors: Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto,
Massimiliano Pontil
- Abstract summary: We study the problem of learning a real-valued function that satisfies the Demographic Parity constraint, which requires the distribution of the predicted output to be independent of the sensitive attribute.
We establish a connection between fair regression and optimal transport theory, based on which we derive a closed-form expression for the optimal fair predictor.
- Score: 39.818025466204055
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning a real-valued function that satisfies the
Demographic Parity constraint. It demands the distribution of the predicted
output to be independent of the sensitive attribute. We consider the case that
the sensitive attribute is available for prediction. We establish a connection
between fair regression and optimal transport theory, based on which we derive
a closed-form expression for the optimal fair predictor. Specifically, we show
that the distribution of this optimum is the Wasserstein barycenter of the
distributions induced by the standard regression function on the sensitive
groups. This result offers an intuitive interpretation of the optimal fair
prediction and suggests a simple post-processing algorithm to achieve fairness.
We establish risk and distribution-free fairness guarantees for this procedure.
Numerical experiments indicate that our method is very effective in learning
fair models, with a relative increase in error rate that is smaller than the
relative gain in fairness.
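In one dimension, the Wasserstein-2 barycenter of the per-group prediction distributions has a closed form: its quantile function is the weighted average of the groups' quantile functions. The post-processing step therefore amounts to mapping each prediction through its own group's CDF and then through the barycenter's quantile function. The sketch below is a minimal illustration of that quantile-matching idea using empirical quantiles; `fair_postprocess` is a hypothetical helper name, not code from the paper.

```python
import numpy as np

def fair_postprocess(preds, groups):
    """Map each group's predictions onto the empirical 1D Wasserstein-2
    barycenter of the per-group prediction distributions (illustrative
    sketch, not the authors' implementation)."""
    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()          # group frequencies p_s
    fair = np.empty_like(preds, dtype=float)
    for s in labels:
        mask = groups == s
        # Empirical rank of each prediction within its own group: F_s(f(x, s))
        ranks = (np.argsort(np.argsort(preds[mask])) + 0.5) / mask.sum()
        # Barycenter quantile function: sum over groups of p_s' * F_s'^{-1}(q)
        fair[mask] = sum(
            w * np.quantile(preds[groups == t], ranks)
            for t, w in zip(labels, weights)
        )
    return fair
```

After this transformation, the output distribution is (approximately) the same for every sensitive group, which is exactly the Demographic Parity requirement, while each prediction keeps its within-group rank.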
Related papers
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependence for general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z) - Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Chasing Fairness Under Distribution Shift: A Model Weight Perturbation
Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation.
We then analyze the sufficient conditions to guarantee fairness for the target dataset.
Motivated by these sufficient conditions, we propose robust fairness regularization (RFR).
arXiv Detail & Related papers (2023-03-06T17:19:23Z) - Conformalized Fairness via Quantile Regression [8.180169144038345]
We propose a novel framework to learn a real-valued quantile function under the fairness requirement of Demographic Parity.
We establish theoretical guarantees of distribution-free coverage and exact fairness for the induced prediction interval constructed by fair quantiles.
Our results show the model's ability to uncover the mechanism underlying the fairness-accuracy trade-off in a wide range of societal and medical applications.
arXiv Detail & Related papers (2022-10-05T04:04:15Z) - Variational Refinement for Importance Sampling Using the Forward
Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z) - Costs and Benefits of Wasserstein Fair Regression [11.134279147254361]
In this paper, we characterize the inherent tradeoff between statistical parity and accuracy in the regression setting.
Our lower bound is sharp, algorithm-independent, and admits a simple interpretation.
We develop a practical algorithm for fair regression through the lens of representation learning.
arXiv Detail & Related papers (2021-06-16T14:24:44Z) - Pairwise Fairness for Ordinal Regression [22.838858781036574]
We adapt two fairness notions previously considered in fair ranking and propose a strategy for training a predictor that is approximately fair according to either notion.
Our predictor consists of a threshold model, composed of a scoring function and a set of thresholds.
We show that our strategy allows us to effectively explore the accuracy-vs-fairness trade-off and that it often compares favorably to "unfair" state-of-the-art methods for ordinal regression.
arXiv Detail & Related papers (2021-05-07T10:33:42Z) - Fair Densities via Boosting the Sufficient Statistics of Exponential
Families [72.34223801798422]
We introduce a boosting algorithm to pre-process data for fairness.
Our approach shifts towards better data fitting while still ensuring a minimal fairness guarantee.
Empirical results are presented to demonstrate the quality of the results on real-world data.
arXiv Detail & Related papers (2020-12-01T00:49:17Z) - A Distributionally Robust Approach to Fair Classification [17.759493152879013]
We propose a robust logistic regression model with an unfairness penalty that prevents discrimination with respect to sensitive attributes such as gender or ethnicity.
This model is equivalent to a tractable convex optimization problem if a Wasserstein ball centered at the empirical distribution on the training data is used to model distributional uncertainty.
We demonstrate that the resulting classifier improves fairness at a marginal loss of predictive accuracy on both synthetic and real datasets.
arXiv Detail & Related papers (2020-07-18T22:34:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.