Differentially Private Post-Processing for Fair Regression
- URL: http://arxiv.org/abs/2405.04034v1
- Date: Tue, 7 May 2024 06:09:37 GMT
- Authors: Ruicheng Xian, Qiaobo Li, Gautam Kamath, Han Zhao
- Abstract summary: Our algorithm can be applied to post-process any given regressor to improve fairness by remapping its outputs.
We analyze the sample complexity of our algorithm and provide a fairness guarantee, revealing a trade-off between the statistical bias and variance induced by the choice of the number of bins in the histogram.
- Score: 13.855474876965557
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes a differentially private post-processing algorithm for learning fair regressors satisfying statistical parity, addressing privacy concerns of machine learning models trained on sensitive data, as well as fairness concerns of their potential to propagate historical biases. Our algorithm can be applied to post-process any given regressor to improve fairness by remapping its outputs. It consists of three steps: first, the output distributions are estimated privately via histogram density estimation and the Laplace mechanism; then their Wasserstein barycenter is computed; finally, the optimal transports to the barycenter are used for post-processing to satisfy fairness. We analyze the sample complexity of our algorithm and provide a fairness guarantee, revealing a trade-off between the statistical bias and variance induced by the choice of the number of bins in the histogram, in which using fewer bins always favors fairness at the expense of error.
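The three steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`private_histogram`, `fair_remap`), the add/remove notion of neighboring datasets (so that one record changes one bin count by one), the use of noisy group totals as barycenter weights, and the one-dimensional quantile-averaging form of the Wasserstein barycenter are all assumptions made for illustration.

```python
import numpy as np

def private_histogram(values, edges, epsilon, rng):
    """Step 1 (sketch): histogram counts plus Laplace noise.

    Assumes add/remove adjacency, so one record changes a single bin
    count by 1 and Laplace noise of scale 1/epsilon suffices.
    """
    counts, _ = np.histogram(values, bins=edges)
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)
    noisy = np.clip(noisy, 0.0, None)  # post-processing of the noisy counts
    total = noisy.sum()
    k = len(noisy)
    pmf = noisy / total if total > 0 else np.full(k, 1.0 / k)
    # Mix with a tiny uniform mass so the CDF is strictly increasing.
    pmf = (1.0 - 1e-6) * pmf + 1e-6 / k
    return pmf, total

def _cdf(pmf):
    """CDF values at the bin edges (piecewise-linear between edges)."""
    cdf = np.concatenate([[0.0], np.cumsum(pmf)])
    cdf[-1] = 1.0
    return cdf

def fair_remap(preds, groups, edges, epsilon, seed=0):
    """Steps 2 and 3 (sketch): 1-D barycenter and transport to it."""
    rng = np.random.default_rng(seed)
    labels = np.unique(groups)
    stats = {a: private_histogram(preds[groups == a], edges, epsilon, rng)
             for a in labels}
    cdfs = {a: _cdf(stats[a][0]) for a in labels}
    # Hypothetical choice: weights from the noisy totals, not the raw sizes.
    totals = np.array([stats[a][1] for a in labels])
    if totals.sum() > 0:
        weights = totals / totals.sum()
    else:
        weights = np.full(len(labels), 1.0 / len(labels))
    out = np.empty_like(preds, dtype=float)
    for a in labels:
        mask = groups == a
        # Group-wise CDF value of each prediction: u = F_a(y).
        u = np.interp(preds[mask], edges, cdfs[a])
        # In 1-D, the barycenter's quantile function is the weighted
        # average of the group quantile functions, so the transport map
        # for group a is y -> sum_b w_b * Q_b(F_a(y)).
        out[mask] = sum(w * np.interp(u, cdfs[b], edges)
                        for w, b in zip(weights, labels))
    return out
```

In this sketch the remapping functions are built only from the noisy counts, so everything after the Laplace step is post-processing. The bin edges control the trade-off described in the abstract: fewer bins mean larger counts per bin relative to the noise, at the cost of a coarser approximation of each group's output distribution.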
Related papers
- Optimal Group Fair Classifiers from Linear Post-Processing [10.615965454674901]
We propose a post-processing algorithm for fair classification that mitigates model bias under a unified family of group fairness criteria.
It achieves fairness by re-calibrating the output score of the given base model with a "fairness cost" -- a linear combination of the (predicted) group memberships.
Date: 2024-05-07T05:58:44Z
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
Date: 2024-04-24T09:04:36Z
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
Date: 2024-01-29T02:08:40Z
- Regression with Label Differential Privacy [64.21020761920322]
We derive a label DP randomization mechanism that is optimal under a given regression loss function.
We prove that the optimal mechanism takes the form of a "randomized response on bins".
Date: 2022-12-12T17:41:32Z
- Post-processing for Individual Fairness [23.570995756189266]
Post-processing in algorithmic fairness is a versatile approach for correcting bias in ML systems that are already used in production.
We consider a setting where the learner only has access to the predictions of the original model and a similarity graph between individuals, guiding the desired fairness constraints.
Our algorithms correct individual biases in large-scale NLP models such as BERT, while preserving accuracy.
Date: 2021-10-26T15:51:48Z
- Fair Densities via Boosting the Sufficient Statistics of Exponential Families [72.34223801798422]
We introduce a boosting algorithm to pre-process data for fairness.
Our approach shifts towards better data fitting while still ensuring a minimal fairness guarantee.
Empirical results are presented to show the quality of the results on real-world data.
Date: 2020-12-01T00:49:17Z
- A Distributionally Robust Approach to Fair Classification [17.759493152879013]
We propose a robust logistic regression model with an unfairness penalty that prevents discrimination with respect to sensitive attributes such as gender or ethnicity.
This model is equivalent to a tractable convex optimization problem if a Wasserstein ball centered at the empirical distribution on the training data is used to model distributional uncertainty.
We demonstrate that the resulting classifier improves fairness at a marginal loss of predictive accuracy on both synthetic and real datasets.
Date: 2020-07-18T22:34:48Z
- SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness [50.916483212900275]
We first formulate a version of individual fairness that enforces invariance on certain sensitive sets.
We then design a transport-based regularizer that enforces this version of individual fairness and develop an algorithm to minimize the regularizer efficiently.
Date: 2020-06-25T04:31:57Z
- Distributionally-Robust Machine Learning Using Locally Differentially-Private Data [14.095523601311374]
We consider machine learning, particularly regression, using locally-differentially private datasets.
We show that machine learning with locally-differentially private datasets can be rewritten as a distributionally-robust optimization.
Date: 2020-06-24T05:12:10Z
- Fair Regression with Wasserstein Barycenters [39.818025466204055]
We study the problem of learning a real-valued function that satisfies the Demographic Parity constraint.
It demands the distribution of the predicted output to be independent of the sensitive attribute.
We establish a connection between fair regression and optimal transport theory, based on which we derive a closed-form expression for the optimal fair predictor.
Date: 2020-06-12T16:10:41Z
- Distributed Sketching Methods for Privacy Preserving Regression [54.51566432934556]
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
We derive novel approximation guarantees for classical sketching methods and analyze the accuracy of parameter averaging for distributed sketches.
We illustrate the performance of distributed sketches in a serverless computing platform with large scale experiments.
Date: 2020-02-16T08:35:48Z