"You Can't Fix What You Can't Measure": Privately Measuring Demographic
Performance Disparities in Federated Learning
- URL: http://arxiv.org/abs/2206.12183v2
- Date: Wed, 11 Jan 2023 12:05:43 GMT
- Title: "You Can't Fix What You Can't Measure": Privately Measuring Demographic
Performance Disparities in Federated Learning
- Authors: Marc Juarez and Aleksandra Korolova
- Abstract summary: We propose differentially private mechanisms to measure differences in performance across groups while protecting the privacy of group membership.
Our results show that, contrary to what prior work suggested, protecting privacy is not necessarily in conflict with identifying performance disparities of federated models.
- Score: 78.70083858195906
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As in traditional machine learning models, models trained with federated
learning may exhibit disparate performance across demographic groups. Model
holders must identify these disparities to mitigate undue harm to the groups.
However, measuring a model's performance in a group requires access to
information about group membership which, for privacy reasons, often has
limited availability. We propose novel locally differentially private
mechanisms to measure differences in performance across groups while protecting
the privacy of group membership. To analyze the effectiveness of the
mechanisms, we bound their error in estimating a disparity when optimized for a
given privacy budget. Our results show that the error rapidly decreases for
realistic numbers of participating clients, demonstrating that, contrary to
what prior work suggested, protecting privacy is not necessarily in conflict
with identifying performance disparities of federated models.
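For intuition, the following is a minimal sketch of one way such a locally differentially private estimate could be computed, assuming each client randomizes only its binary group membership via randomized response and reports in the clear whether the federated model classified its local example correctly; this is an illustration of the general approach, not the paper's exact mechanism.

```python
import numpy as np

# Sketch (illustration, not the paper's mechanism): each client randomizes its
# binary group membership with randomized response and reports whether the
# federated model classified its local example correctly. The server debiases
# the noisy group reports to estimate per-group accuracy and the disparity.

def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it."""
    p_truth = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit

def estimate_disparity(correct, noisy_group, epsilon):
    """Debias randomized-response reports and estimate the accuracy gap."""
    correct = np.asarray(correct, dtype=float)        # 1 if the client's example was classified correctly
    noisy_group = np.asarray(noisy_group, dtype=float)
    p = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    # Unbiased estimates of P(group = 1) and P(group = 1, correct = 1)
    pi1 = (noisy_group.mean() - (1 - p)) / (2 * p - 1)
    joint1 = ((noisy_group * correct).mean() - (1 - p) * correct.mean()) / (2 * p - 1)
    acc1 = joint1 / max(pi1, 1e-6)
    acc0 = (correct.mean() - joint1) / max(1 - pi1, 1e-6)
    return acc1 - acc0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 50_000                                         # number of participating clients
    group = rng.integers(0, 2, n)
    correct = (rng.random(n) < np.where(group == 1, 0.80, 0.90)).astype(int)  # true gap: -0.10
    noisy = np.array([randomized_response(g, epsilon=1.0, rng=rng) for g in group])
    print(estimate_disparity(correct, noisy, epsilon=1.0))  # close to -0.10 for large n
```

Consistent with the abstract's observation, the error of this kind of debiased estimate shrinks as the number of participating clients grows.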
Related papers
- Analysing Fairness of Privacy-Utility Mobility Models [11.387235721659378]
This work defines a set of fairness metrics designed explicitly for human mobility.
We examine the fairness of two state-of-the-art privacy-preserving models that rely on GANs and representation learning to reduce the re-identification rate of users for data sharing.
Our results show that while both models guarantee group fairness in terms of demographic parity, they violate individual fairness criteria, indicating that users with highly similar trajectories receive disparate privacy gain.
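As a rough illustration of these two criteria, here is a sketch that checks a demographic-parity-style gap in mean privacy gain across groups and counts individual-fairness violations among highly similar users; the inputs, similarity measure, and thresholds are assumptions for the example, not the paper's metric definitions.

```python
import numpy as np

# Hedged sketch of the two fairness notions above, applied to a per-user
# "privacy gain" (e.g., drop in re-identification rate). Thresholds and the
# similarity measure are illustrative assumptions.

def group_fairness_gap(privacy_gain, group):
    """Demographic-parity-style gap: spread of mean privacy gain across groups."""
    privacy_gain, group = np.asarray(privacy_gain), np.asarray(group)
    means = [privacy_gain[group == g].mean() for g in np.unique(group)]
    return max(means) - min(means)

def individual_fairness_violations(privacy_gain, pairwise_similarity, sim_thr=0.9, gain_gap=0.2):
    """Count pairs of highly similar users whose privacy gains differ substantially."""
    privacy_gain = np.asarray(privacy_gain)
    i, j = np.triu_indices(len(privacy_gain), k=1)
    similar = np.asarray(pairwise_similarity)[i, j] >= sim_thr
    disparate = np.abs(privacy_gain[i] - privacy_gain[j]) >= gain_gap
    return int(np.sum(similar & disparate))
```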
arXiv Detail & Related papers (2023-04-10T11:09:18Z)
- Outlier-Robust Group Inference via Gradient Space Clustering [50.87474101594732]
Existing methods can improve the worst-group performance, but they require group annotations, which are often expensive and sometimes infeasible to obtain.
We address the problem of learning group annotations in the presence of outliers by clustering the data in the space of gradients of the model parameters.
We show that data in the gradient space has a simpler structure while preserving information about minority groups and outliers, making it suitable for standard clustering methods like DBSCAN.
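A minimal sketch of the gradient-space clustering idea, assuming a linear model with logistic loss; it illustrates the technique rather than the authors' exact pipeline.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Sketch: represent each training example by its per-sample gradient and cluster
# in that space; minority groups and outliers tend to separate there. DBSCAN's
# label -1 marks points it considers outliers.

def per_sample_gradients(w, X, y):
    """Per-example gradient of the logistic loss for a linear model."""
    probs = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (probs - y)[:, None] * X                    # shape: (n_samples, n_features)

def infer_group_labels(w, X, y, eps=0.5, min_samples=10):
    grads = per_sample_gradients(w, X, y)
    # Normalize so clustering reflects gradient direction rather than magnitude
    grads /= np.linalg.norm(grads, axis=1, keepdims=True) + 1e-12
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(grads)
```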
arXiv Detail & Related papers (2022-10-13T06:04:43Z)
- SF-PATE: Scalable, Fair, and Private Aggregation of Teacher Ensembles [50.90773979394264]
This paper studies a model that protects the privacy of individuals' sensitive information while still learning non-discriminatory predictors.
A key characteristic of the proposed model is that it enables off-the-shelf, non-private fair models to be turned into a privacy-preserving and fair model.
arXiv Detail & Related papers (2022-04-11T14:42:54Z)
- The Impact of Differential Privacy on Group Disparity Mitigation [28.804933301007644]
We evaluate the impact of differential privacy on fairness across four tasks.
We train $(\varepsilon,\delta)$-differentially private models with empirical risk minimization.
We find that differential privacy increases between-group performance differences in the baseline setting but reduces them in the robust setting.
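For context, a bare-bones sketch of differentially private empirical risk minimization via clipped, noise-perturbed gradient updates (DP-SGD style); the noise scale, number of epochs, and privacy accounting are simplified assumptions rather than the paper's training setup.

```python
import numpy as np

# Sketch of DP empirical risk minimization for logistic regression: clip
# per-example gradients to bound sensitivity, then add Gaussian noise to the
# summed gradient before each update.

def dp_erm_logreg(X, y, epochs=50, lr=0.1, clip=1.0, noise_multiplier=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        probs = 1.0 / (1.0 + np.exp(-(X @ w)))
        grads = (probs - y)[:, None] * X                          # per-example gradients
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads *= np.minimum(1.0, clip / (norms + 1e-12))          # clip to bound sensitivity
        noise = rng.normal(0.0, noise_multiplier * clip, size=d)  # Gaussian mechanism
        w -= lr * (grads.sum(axis=0) + noise) / n
    return w
```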
arXiv Detail & Related papers (2022-03-05T13:55:05Z)
- Measuring Fairness Under Unawareness of Sensitive Attributes: A Quantification-Based Approach [131.20444904674494]
We tackle the problem of measuring group fairness under unawareness of sensitive attributes.
We show that quantification approaches are particularly suited to tackle the fairness-under-unawareness problem.
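To make the quantification idea concrete, here is a hedged sketch of adjusted classify-and-count, a standard quantification method that corrects an attribute classifier's prevalence estimate using its error rates; it illustrates the general technique, not the authors' estimator.

```python
import numpy as np

# Sketch: estimate the prevalence of a sensitive group from the hard predictions
# of an attribute classifier, corrected by its true/false positive rates
# (measured on a small labelled validation set, assumed available).

def adjusted_classify_and_count(pred_group, tpr, fpr):
    """Estimate group prevalence from noisy attribute predictions."""
    observed = float(np.mean(pred_group))
    return float(np.clip((observed - fpr) / (tpr - fpr + 1e-12), 0.0, 1.0))
```

Applying such an estimator separately to, for example, the instances a model accepts and the instances it rejects yields group-level fairness estimates without sensitive labels for each individual.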
arXiv Detail & Related papers (2021-09-17T13:45:46Z)
- Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and impostor sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
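As a small illustration of the single-threshold issue (an assumed verification setup, not the paper's method), one can compute subgroup-specific thresholds that each hit the same target false match rate:

```python
import numpy as np

# Sketch: pick, per demographic subgroup, the verification threshold that
# achieves the same target false match rate on that subgroup's impostor pairs.

def per_subgroup_thresholds(scores, is_genuine, subgroup, target_fmr=1e-3):
    scores = np.asarray(scores, dtype=float)
    is_genuine = np.asarray(is_genuine, dtype=bool)
    subgroup = np.asarray(subgroup)
    thresholds = {}
    for g in np.unique(subgroup):
        impostor_scores = scores[(subgroup == g) & (~is_genuine)]
        # A fraction target_fmr of this subgroup's impostor scores exceed the threshold
        thresholds[g] = np.quantile(impostor_scores, 1.0 - target_fmr)
    return thresholds
```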
arXiv Detail & Related papers (2021-03-16T15:05:49Z)
- Improving Fairness and Privacy in Selection Problems [21.293367386282902]
We study the possibility of using a differentially private exponential mechanism as a post-processing step to improve both fairness and privacy of supervised learning models.
We show that the exponential mechanism can improve both privacy and fairness, with a slight decrease in accuracy compared to the model without post-processing.
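For reference, a compact sketch of the exponential mechanism applied to a selection problem; the utilities and sensitivity are placeholders rather than the paper's post-processing objective.

```python
import numpy as np

# Sketch of the differentially private exponential mechanism: sample a candidate
# with probability proportional to exp(eps * utility / (2 * sensitivity)).

def exponential_mechanism(utilities, epsilon, sensitivity=1.0, rng=None):
    rng = rng or np.random.default_rng()
    u = np.asarray(utilities, dtype=float)
    scores = epsilon * (u - u.max()) / (2.0 * sensitivity)   # subtract max for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum()
    return int(rng.choice(len(u), p=probs))                  # index of the selected candidate
```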
arXiv Detail & Related papers (2020-12-07T15:55:28Z)
- On the Privacy Risks of Algorithmic Fairness [9.429448411561541]
We study the privacy risks of group fairness through the lens of membership inference attacks.
We show that fairness comes at the cost of privacy, and this cost is not distributed equally.
arXiv Detail & Related papers (2020-11-07T09:15:31Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike standard membership adversaries, works under the severe restriction of having no access to the victim model's scores.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and the adversary can recover up to 100% of the attack's performance.
For defense, we employ differential privacy in the form of gradient perturbation during training of the victim model, as well as output perturbation at prediction time.
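A rough sketch in the spirit of such a label-only attack, scoring membership by label stability under repeated perturbed queries; the prediction interface, noise scale, and threshold are illustrative assumptions.

```python
import numpy as np

# Sketch: query the victim model on randomly perturbed copies of a point and use
# label stability as a membership score. predict_fn is assumed to map a 2-D
# array of inputs to an array of hard labels.

def label_stability_score(predict_fn, x, n_queries=50, noise_std=0.05, rng=None):
    rng = rng or np.random.default_rng()
    x = np.asarray(x, dtype=float)
    base_label = predict_fn(x[None, :])[0]                       # label of the unperturbed point
    perturbed = x[None, :] + rng.normal(0.0, noise_std, size=(n_queries, x.shape[0]))
    labels = np.asarray(predict_fn(perturbed))
    return float(np.mean(labels == base_label))                  # higher stability -> more likely a member

def is_member(predict_fn, x, threshold=0.9, **kwargs):
    return label_stability_score(predict_fn, x, **kwargs) >= threshold
```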
arXiv Detail & Related papers (2020-09-01T12:54:54Z)