Do Fairness Interventions Come at the Cost of Privacy: Evaluations for Binary Classifiers
- URL: http://arxiv.org/abs/2503.06150v2
- Date: Tue, 11 Mar 2025 11:28:18 GMT
- Title: Do Fairness Interventions Come at the Cost of Privacy: Evaluations for Binary Classifiers
- Authors: Huan Tian, Guangsheng Zhang, Bo Liu, Tianqing Zhu, Ming Ding, Wanlei Zhou
- Abstract summary: We assess the privacy risks of fairness-enhanced binary classifiers via membership inference attacks (MIAs) and attribute inference attacks (AIAs). We uncover a potential threat mechanism that exploits prediction discrepancies between fair and biased models, leading to advanced attack results for both MIAs and AIAs. Our study exposes the under-explored privacy threats in fairness studies, advocating for thorough evaluations of potential security vulnerabilities before model deployment.
- Score: 17.243744418309593
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While in-processing fairness approaches show promise in mitigating biased predictions, their potential impact on privacy leakage remains under-explored. We aim to address this gap by assessing the privacy risks of fairness-enhanced binary classifiers via membership inference attacks (MIAs) and attribute inference attacks (AIAs). Surprisingly, our results reveal that enhancing fairness does not necessarily lead to privacy compromises. In fact, these fairness interventions exhibit increased resilience against MIAs and AIAs. This is because fairness interventions tend to remove sensitive information from the extracted features and reduce confidence scores for the majority of training data for fairer predictions. However, during these evaluations, we uncover a potential threat mechanism that exploits prediction discrepancies between fair and biased models, leading to advanced attack results for both MIAs and AIAs. This mechanism reveals potent vulnerabilities of fair models and poses significant privacy risks to current fairness methods. Extensive experiments across multiple datasets, attack methods, and representative fairness approaches confirm our findings and demonstrate the efficacy of the uncovered mechanism. Our study exposes the under-explored privacy threats in fairness studies, advocating for thorough evaluations of potential security vulnerabilities before model deployment.
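As an illustration of the discrepancy-based threat described above, the following is a minimal Python sketch of how an attacker might score candidate samples by comparing a fairness-enhanced model's predictions with those of an unconstrained (biased) counterpart. The function names, the L1 discrepancy measure, and the thresholding rule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def discrepancy_scores(fair_probs: np.ndarray, biased_probs: np.ndarray) -> np.ndarray:
    """Per-sample discrepancy between a fairness-enhanced classifier and its
    unconstrained (biased) counterpart, measured as the L1 distance between
    their predicted class-probability vectors."""
    return np.abs(fair_probs - biased_probs).sum(axis=1)

def infer_membership(fair_probs: np.ndarray, biased_probs: np.ndarray, threshold: float) -> np.ndarray:
    """Flag samples whose fair/biased predictions diverge by more than
    `threshold` as suspected training members (illustrative decision rule)."""
    return discrepancy_scores(fair_probs, biased_probs) > threshold

# Stand-in predictions for a binary classifier (replace with real model outputs).
rng = np.random.default_rng(0)
fair_probs = rng.dirichlet(np.ones(2), size=8)
biased_probs = rng.dirichlet(np.ones(2), size=8)
print(infer_membership(fair_probs, biased_probs, threshold=0.5))
```

In this sketch, samples whose predictions shift most between the two models are flagged as likely training members, mirroring the intuition that fairness interventions mainly change model behaviour on the data they were trained on.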
Related papers
- Intermediate Outputs Are More Sensitive Than You Think [3.20091188522012]
This paper introduces a novel approach to measuring privacy risks in deep computer vision models based on the Degrees of Freedom (DoF) and sensitivity of intermediate outputs.
We propose a framework that leverages DoF to evaluate the amount of information retained in each layer and combines this with the rank of the Jacobian matrix to assess sensitivity to input variations.
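A minimal PyTorch sketch of the Jacobian-rank part of this idea follows, using a toy network; the layer choice and the use of numerical rank as a sensitivity proxy are illustrative assumptions, and the DoF estimate described above is not reproduced here.

```python
import torch
from torch import nn

# Toy network; we probe the sensitivity of its first block's output.
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
intermediate = net[:2]

x = torch.randn(8)  # a single input sample
# Jacobian of the intermediate output w.r.t. the input: shape (16, 8).
J = torch.autograd.functional.jacobian(intermediate, x)
rank = torch.linalg.matrix_rank(J)  # numerical rank as a sensitivity proxy
print(f"intermediate output dim: {J.shape[0]}, Jacobian rank: {rank.item()}")
```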
arXiv Detail & Related papers (2024-12-01T06:40:28Z)
- Confidence Aware Learning for Reliable Face Anti-spoofing [52.23271636362843]
We propose a Confidence Aware Face Anti-spoofing model, which is aware of its capability boundary.
We estimate its confidence during the prediction of each sample.
Experiments show that the proposed CA-FAS can effectively recognize samples with low prediction confidence.
arXiv Detail & Related papers (2024-11-02T14:29:02Z)
- Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models [65.30406788716104]
This work investigates the vulnerabilities of security-enhancing diffusion models.
We demonstrate that these models are highly susceptible to DIFF2, a simple yet effective backdoor attack.
Case studies show that DIFF2 can significantly reduce both post-purification and certified accuracy across benchmark datasets and models.
arXiv Detail & Related papers (2024-06-14T02:39:43Z)
- Exploring Privacy and Fairness Risks in Sharing Diffusion Models: An Adversarial Perspective [31.010937126289953]
We take an adversarial perspective to investigate the potential privacy and fairness risks associated with sharing diffusion models.
We demonstrate that the sharer can execute fairness poisoning attacks to undermine the receiver's downstream models.
Our experiments conducted on real-world datasets demonstrate remarkable attack performance on different types of diffusion models.
arXiv Detail & Related papers (2024-02-28T12:21:12Z)
- When Fairness Meets Privacy: Exploring Privacy Threats in Fair Binary Classifiers via Membership Inference Attacks [17.243744418309593]
We propose an efficient MIA method against fairness-enhanced models based on fairness discrepancy results.
We also explore potential strategies for mitigating privacy leakages.
arXiv Detail & Related papers (2023-11-07T10:28:17Z)
- Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against certain subgroups described by protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Free Lunch for Generating Effective Outlier Supervision [46.37464572099351]
We propose an ultra-effective method to generate near-realistic outlier supervision.
Our proposed BayesAug significantly reduces the false positive rate by over 12.50% compared with previous schemes.
arXiv Detail & Related papers (2023-01-17T01:46:45Z)
- When Fairness Meets Privacy: Fair Classification with Semi-Private Sensitive Attributes [18.221858247218726]
We study a novel and practical problem of fair classification in a semi-private setting.
Most of the sensitive attributes are private, and only a small number of clean ones are available.
We propose a novel framework FairSP that can achieve Fair prediction under the Semi-Private setting.
arXiv Detail & Related papers (2022-07-18T01:10:25Z)
- Exploring Robustness of Unsupervised Domain Adaptation in Semantic Segmentation [74.05906222376608]
We propose adversarial self-supervision UDA (or ASSUDA) that maximizes the agreement between clean images and their adversarial examples by a contrastive loss in the output space.
This paper is rooted in two observations: (i) the robustness of UDA methods in semantic segmentation remains unexplored, which poses a security concern in this field; and (ii) although commonly used self-supervision (e.g., rotation and jigsaw) benefits image tasks such as classification and recognition, it fails to provide the critical supervision signals that could learn discriminative representations for segmentation tasks.
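A simplified stand-in for the output-space agreement idea summarized above is sketched below, assuming an FGSM-style perturbation and a KL consistency term rather than the paper's actual contrastive loss; the toy model, epsilon, and surrogate target are illustrative.

```python
import torch
import torch.nn.functional as F
from torch import nn

def output_agreement_loss(model: nn.Module, x: torch.Tensor, eps: float = 2 / 255) -> torch.Tensor:
    """Encourage agreement between a model's outputs on clean inputs and their
    FGSM-style adversarial counterparts (a simplified consistency objective,
    not ASSUDA's contrastive loss)."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    # Self-supervised surrogate target: the model's own hard predictions.
    surrogate = F.cross_entropy(logits, logits.argmax(dim=1))
    grad = torch.autograd.grad(surrogate, x)[0]
    x_adv = (x + eps * grad.sign()).detach()
    # KL divergence between output distributions on adversarial and clean views.
    return F.kl_div(F.log_softmax(model(x_adv), dim=1),
                    F.softmax(model(x.detach()), dim=1),
                    reduction="batchmean")

# Toy usage with a small image classifier.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
print(output_agreement_loss(model, x).item())
```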
arXiv Detail & Related papers (2021-05-23T01:50:44Z)
- Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)
- Systematic Evaluation of Privacy Risks of Machine Learning Models [41.017707772150835]
We show that prior work on membership inference attacks may severely underestimate the privacy risks.
We first propose to benchmark membership inference privacy risks by improving existing non-neural network based inference attacks.
We then introduce a new approach for fine-grained privacy analysis by formulating and deriving a new metric called the privacy risk score.
arXiv Detail & Related papers (2020-03-24T00:53:53Z)
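Building on the last entry above, here is a minimal NumPy sketch of a posterior membership probability estimated from histograms of model confidence on reference member and non-member sets. This is a hedged, simplified reading of the privacy risk score, not the authors' exact formulation, and the beta-distributed confidences are synthetic placeholders.

```python
import numpy as np

def privacy_risk_scores(target_conf, member_conf, nonmember_conf,
                        bins: int = 20, prior_member: float = 0.5) -> np.ndarray:
    """Posterior probability that each target sample is a training member,
    estimated from histograms of model confidence on reference member and
    non-member sets (a simplified, Bayes-rule reading of the metric)."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    p_member, _ = np.histogram(member_conf, bins=edges, density=True)
    p_nonmember, _ = np.histogram(nonmember_conf, bins=edges, density=True)
    idx = np.clip(np.digitize(target_conf, edges) - 1, 0, bins - 1)
    numerator = p_member[idx] * prior_member
    denominator = numerator + p_nonmember[idx] * (1.0 - prior_member)
    return numerator / np.maximum(denominator, 1e-12)

# Illustrative usage with synthetic confidence values.
rng = np.random.default_rng(0)
member_conf = rng.beta(8, 2, size=1000)      # members tend to get higher confidence
nonmember_conf = rng.beta(4, 4, size=1000)
print(privacy_risk_scores(np.array([0.55, 0.95]), member_conf, nonmember_conf))
```

Samples that fall in confidence bins dominated by the member histogram receive scores close to 1, which is the kind of fine-grained, per-sample analysis the entry describes.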