On the Vulnerability of Fairness Constrained Learning to Malicious Noise
- URL: http://arxiv.org/abs/2307.11892v3
- Date: Thu, 22 Aug 2024 17:48:33 GMT
- Title: On the Vulnerability of Fairness Constrained Learning to Malicious Noise
- Authors: Avrim Blum, Princewill Okoroafor, Aadirupa Saha, Kevin Stangl
- Abstract summary: We consider the vulnerability of fairness-constrained learning to small amounts of malicious noise in the training data.
For example, for Demographic Parity we show we can incur only a $\Theta(\alpha)$ loss in accuracy, where $\alpha$ is the malicious noise rate.
For Equal Opportunity, we show we can incur an $O(\sqrt{\alpha})$ loss, and give a matching $\Omega(\sqrt{\alpha})$ lower bound.
- Score: 28.176039923404883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the vulnerability of fairness-constrained learning to small amounts of malicious noise in the training data. Konstantinov and Lampert (2021) initiated the study of this question and presented negative results showing there exist data distributions where for several fairness constraints, any proper learner will exhibit high vulnerability when group sizes are imbalanced. Here, we present a more optimistic view, showing that if we allow randomized classifiers, then the landscape is much more nuanced. For example, for Demographic Parity we show we can incur only a $\Theta(\alpha)$ loss in accuracy, where $\alpha$ is the malicious noise rate, matching the best possible even without fairness constraints. For Equal Opportunity, we show we can incur an $O(\sqrt{\alpha})$ loss, and give a matching $\Omega(\sqrt{\alpha})$ lower bound. In contrast, Konstantinov and Lampert (2021) showed for proper learners the loss in accuracy for both notions is $\Omega(1)$. The key technical novelty of our work is how randomization can bypass simple "tricks" an adversary can use to amplify his power. We also consider additional fairness notions including Equalized Odds and Calibration. For these fairness notions, the excess accuracy clusters into three natural regimes: $O(\alpha)$, $O(\sqrt{\alpha})$, and $O(1)$. These results provide a more fine-grained view of the sensitivity of fairness-constrained learning to adversarial noise in training data.
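To make the Demographic Parity setting concrete, here is a minimal sketch of randomized post-processing in general, not the authors' algorithm: a base classifier's positive-prediction rate is equalized across two groups by randomly flipping just enough positive predictions in the higher-rate group. The group encoding, rates, and the `flip_to_parity` helper are all assumptions for illustration.

```python
import numpy as np

def demographic_parity_gap(preds, groups):
    """Absolute difference in positive-prediction rates between two groups."""
    rates = {g: preds[groups == g].mean() for g in np.unique(groups)}
    vals = list(rates.values())
    return abs(vals[0] - vals[1])

def flip_to_parity(preds, groups, rng):
    """Randomized post-processing: demote positives in the higher-rate group.

    Each positive prediction in the higher-rate group is independently
    flipped to 0 with just enough probability to equalize the two groups'
    positive rates in expectation.
    """
    preds = preds.copy()
    g0, g1 = np.unique(groups)
    r0, r1 = preds[groups == g0].mean(), preds[groups == g1].mean()
    hi = g0 if r0 > r1 else g1
    r_hi, r_lo = max(r0, r1), min(r0, r1)
    if r_hi == 0:
        return preds
    p_flip = (r_hi - r_lo) / r_hi  # flip probability for positives in `hi`
    mask = (groups == hi) & (preds == 1)
    preds[mask] = rng.binomial(1, 1 - p_flip, size=mask.sum())
    return preds

rng = np.random.default_rng(0)
groups = rng.integers(0, 2, size=10_000)
preds = (rng.random(10_000) < np.where(groups == 0, 0.6, 0.3)).astype(int)
print("gap before:", demographic_parity_gap(preds, groups))
print("gap after :", demographic_parity_gap(flip_to_parity(preds, groups, rng), groups))
```

Randomization is essential here: no deterministic relabeling of the higher-rate group can hit the target rate exactly in expectation for every distribution, which echoes why randomized classifiers escape the lower bounds for proper learners.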
Related papers
- Agnostic Learning under Targeted Poisoning: Optimal Rates and the Role of Randomness [13.802167452101909]
Prior work established that the optimal error under such instance-targeted poisoning attacks scales as $\Theta(d\eta)$. We show that the optimal excess error is $\tilde{\Theta}(\sqrt{d\eta})$, answering one of the main open problems left by Hanneke et al.
arXiv Detail & Related papers (2025-06-03T16:53:20Z)
- Towards a Sharp Analysis of Offline Policy Learning for $f$-Divergence-Regularized Contextual Bandits [49.96531901205305]
We analyze $f$-divergence-regularized offline policy learning. For reverse Kullback-Leibler (KL) divergence, we give the first $\tilde{O}(\epsilon^{-1})$ sample complexity under single-policy concentrability. We extend our analysis to dueling bandits, and we believe these results take a significant step toward a comprehensive understanding of $f$-divergence-regularized policy learning.
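For context, this line of work studies objectives of the following generic form (a standard formulation, not necessarily the paper's exact one, with reference policy $\pi_0$ and regularization weight $\lambda > 0$):

```latex
\max_{\pi} \;
\mathbb{E}_{x \sim \mathcal{D},\, a \sim \pi(\cdot \mid x)}\big[ r(x, a) \big]
\;-\; \lambda \, \mathbb{E}_{x \sim \mathcal{D}}
\Big[ D_f\big( \pi(\cdot \mid x) \,\big\|\, \pi_0(\cdot \mid x) \big) \Big],
\qquad
D_f(p \,\|\, q) = \sum_a q(a)\, f\!\Big( \frac{p(a)}{q(a)} \Big).
```

Here reverse KL corresponds to $f(t) = t \log t$, i.e. $D_f(\pi \,\|\, \pi_0) = \mathrm{KL}(\pi \,\|\, \pi_0)$.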
arXiv Detail & Related papers (2025-02-09T22:14:45Z)
- SPLITZ: Certifiable Robustness via Split Lipschitz Randomized Smoothing [8.471466670802817]
SPLITZ is a practical and novel approach to provide certifiable robustness to adversarial examples.
Motivation for SPLITZ comes from the observation that many standard deep networks exhibit heterogeneity in Lipschitz constants.
We show that SPLITZ consistently improves on existing state-of-the-art approaches on the MNIST, CIFAR-10, and ImageNet datasets.
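The title suggests splitting a network in two: keeping the first half Lipschitz-constrained and applying randomized smoothing to the second. Below is a minimal PyTorch sketch of that idea; the architecture, `sigma`, and the use of `spectral_norm` to bound the first half's Lipschitz constant are my assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm

# First half: Lipschitz-constrained via spectral normalization
# (each linear layer has Lipschitz constant <= 1, and ReLU is 1-Lipschitz).
first_half = nn.Sequential(
    spectral_norm(nn.Linear(784, 256)), nn.ReLU(),
    spectral_norm(nn.Linear(256, 128)), nn.ReLU(),
)
# Second half: an ordinary classifier head, to be randomly smoothed.
second_half = nn.Sequential(nn.Linear(128, 10))

def splitz_predict(x, n_samples=100, sigma=0.5):
    """Smooth only the latent representation: average softmax over noisy latents."""
    z = first_half(x)                          # Lipschitz-bounded latent
    noise = sigma * torch.randn(n_samples, *z.shape)
    probs = torch.softmax(second_half(z.unsqueeze(0) + noise), dim=-1)
    return probs.mean(dim=0)                   # smoothed class distribution

x = torch.randn(1, 784)
print(splitz_predict(x).argmax(dim=-1))
```

The appeal of such a split is that smoothing the latent (rather than the raw input) lets the noise scale adapt to the part of the network with a controlled Lipschitz constant.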
arXiv Detail & Related papers (2024-07-03T05:13:28Z)
- Fairness Without Harm: An Influence-Guided Active Sampling Approach [32.173195437797766]
We aim to train models that mitigate group fairness disparity without causing harm to model accuracy.
Current data acquisition methods, such as fair active learning approaches, typically require annotating sensitive attributes.
We propose a tractable active data sampling algorithm that does not rely on training group annotations.
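One common way to score candidates without group labels, which an influence-guided approach may resemble, is to approximate each candidate's influence by the alignment between its gradient and the gradient of a held-out validation objective. The sketch below is a generic version of that idea; all function and variable names are hypothetical.

```python
import numpy as np

def influence_scores(cand_grads, val_grad):
    """Score candidates by gradient alignment with the validation objective.

    cand_grads: (n_candidates, n_params) per-example gradients
    val_grad:   (n_params,) gradient of the validation loss
    A large positive dot product suggests labeling this point would reduce
    the validation loss (a first-order influence approximation).
    """
    return cand_grads @ val_grad

rng = np.random.default_rng(0)
cand_grads = rng.normal(size=(1000, 50))
val_grad = rng.normal(size=50)
scores = influence_scores(cand_grads, val_grad)
to_label = np.argsort(scores)[-10:]   # acquire the 10 most influential points
print(to_label)
```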
arXiv Detail & Related papers (2024-02-20T07:57:38Z)
- Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning [59.44422468242455]
We propose a novel method dubbed ShrinkMatch to learn from uncertain samples.
For each uncertain sample, it adaptively seeks a shrunk class space, which merely contains the original top-1 class.
We then impose a consistency regularization between a pair of strongly and weakly augmented samples in the shrunk space to strive for discriminative representations.
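A rough sketch of the idea, based on my reading of the abstract rather than the official implementation: for a low-confidence sample, drop the strongest competitor classes until the top-1 class is confident within the remaining space, then enforce weak-to-strong consistency over that shrunk space.

```python
import torch
import torch.nn.functional as F

def shrunk_consistency_loss(logits_weak, logits_strong, tau=0.95):
    """Weak-to-strong consistency over an adaptively shrunk class space.

    For each sample, remove the current runner-up class until the top-1
    class's renormalized probability exceeds tau, then apply cross-entropy
    to the strongly augmented view restricted to the kept classes.
    """
    probs = logits_weak.softmax(dim=-1).detach()
    top1 = probs.argmax(dim=-1)
    losses = []
    for i in range(logits_weak.size(0)):
        keep = probs[i].argsort(descending=True).tolist()  # classes by likelihood
        while len(keep) > 1 and probs[i][top1[i]] / probs[i][keep].sum() < tau:
            keep.pop(1)                # drop the strongest remaining confuser
        keep = torch.tensor(keep)
        log_p = F.log_softmax(logits_strong[i][keep], dim=-1)
        target = torch.tensor([0])     # the top-1 class is always first in `keep`
        losses.append(F.nll_loss(log_p.unsqueeze(0), target))
    return torch.stack(losses).mean()

logits_weak, logits_strong = torch.randn(4, 10), torch.randn(4, 10)
print(shrunk_consistency_loss(logits_weak, logits_strong))
```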
arXiv Detail & Related papers (2023-08-13T14:05:24Z)
- Releasing Inequality Phenomenon in $\ell_{\infty}$-norm Adversarial Training via Input Gradient Distillation [66.5912840038179]
A recent study revealed that $\ell_{\infty}$-norm adversarial training ($\ell_{\infty}$-AT) will induce unevenly distributed input gradients. This phenomenon makes the $\ell_{\infty}$-norm adversarially trained model more vulnerable than the standard-trained model. We propose a simple yet effective method called Input Gradient Distillation (IGD) to release the inequality phenomenon in $\ell_{\infty}$-AT.
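The abstract suggests matching the adversarially trained model's input gradients to those of a standard-trained model. Here is a minimal sketch of such a distillation term; the teacher/student pairing and the cosine-similarity loss are my assumptions about the mechanism.

```python
import torch
import torch.nn.functional as F

def input_gradient_distillation_loss(student, teacher, x, y):
    """Encourage the student's input gradient to align with the teacher's.

    Both gradients are taken w.r.t. the input of a cross-entropy loss;
    1 - cosine similarity penalizes mismatched gradient *shapes*, which is
    one way to even out concentrated (unequal) input gradients.
    """
    x_s = x.clone().requires_grad_(True)
    g_s = torch.autograd.grad(F.cross_entropy(student(x_s), y), x_s,
                              create_graph=True)[0]
    x_t = x.clone().requires_grad_(True)
    g_t = torch.autograd.grad(F.cross_entropy(teacher(x_t), y), x_t)[0]
    return 1 - F.cosine_similarity(g_s.flatten(1), g_t.flatten(1).detach()).mean()

student, teacher = torch.nn.Linear(784, 10), torch.nn.Linear(784, 10)
x, y = torch.randn(8, 784), torch.randint(0, 10, (8,))
print(input_gradient_distillation_loss(student, teacher, x, y))
```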
arXiv Detail & Related papers (2023-05-16T09:23:42Z)
- Certified Adversarial Robustness Within Multiple Perturbation Bounds [38.3813286696956]
Randomized smoothing (RS) is a well known certified defense against adversarial attacks.
In this work, we aim to improve the certified adversarial robustness against multiple perturbation bounds simultaneously.
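As background, a Gaussian-smoothed classifier's $\ell_2$ certificate can be translated into certificates for other norms via standard norm inequalities; the sketch below does exactly that. This conversion is generic textbook reasoning, not necessarily the paper's improved method.

```python
import math
from scipy.stats import norm

def multi_norm_radii(p_top, sigma, dim):
    """Certified radii implied by a single Gaussian smoothing certificate.

    p_top: lower bound on the smoothed classifier's top-class probability
    sigma: smoothing noise scale; dim: input dimension
    The l2 radius follows Cohen et al. (2019); the l1 and linf radii follow
    from ||d||_2 <= ||d||_1 and ||d||_2 <= sqrt(dim) * ||d||_inf.
    """
    r2 = sigma * norm.ppf(p_top)        # l2 certified radius
    return {"l2": r2, "l1": r2, "linf": r2 / math.sqrt(dim)}

print(multi_norm_radii(p_top=0.99, sigma=0.5, dim=3 * 32 * 32))
```

The weakness of this naive conversion (e.g. the $\sqrt{\text{dim}}$ penalty for $\ell_\infty$) is precisely what motivates methods that target multiple perturbation bounds simultaneously.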
arXiv Detail & Related papers (2023-04-20T16:42:44Z)
- Linear Contextual Bandits with Adversarial Corruptions [91.38793800392108]
We study the linear contextual bandit problem in the presence of adversarial corruption.
We present a variance-aware algorithm that is adaptive to the level of adversarial contamination $C$.
arXiv Detail & Related papers (2021-10-25T02:53:24Z)
- Fair Classification with Adversarial Perturbations [35.030329189029246]
We study fair classification in the presence of an omniscient adversary that, given an $\eta$, is allowed to choose an arbitrary $\eta$-fraction of the training samples and arbitrarily perturb their protected attributes.
Our main contribution is an optimization framework to learn fair classifiers in this adversarial setting that comes with provable guarantees on accuracy and fairness.
We prove near-tightness of our framework's guarantees for natural hypothesis classes: no algorithm can have significantly better accuracy and any algorithm with better fairness must have lower accuracy.
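Schematically, such a framework solves a fairness-constrained empirical risk minimization whose constraint is slackened to absorb the $\eta$-fraction of perturbed protected attributes (my paraphrase; $\Gamma$ is a fairness-violation measure, $\tau$ the target level, and $g(\eta)$ an adversary-dependent slack with $g(\eta) \to 0$ as $\eta \to 0$):

```latex
\min_{h \in \mathcal{H}} \; \widehat{\mathrm{err}}(h)
\qquad \text{subject to} \qquad
\widehat{\Gamma}(h) \;\le\; \tau + g(\eta),
```

where both $\widehat{\mathrm{err}}$ and $\widehat{\Gamma}$ are estimated from the adversarially perturbed sample.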
arXiv Detail & Related papers (2021-06-10T17:56:59Z)
- On the robustness of randomized classifiers to adversarial examples [11.359085303200981]
We introduce a new notion of robustness for randomized classifiers, enforcing local Lipschitzness using probability metrics.
We show that our results are applicable to a wide range of machine learning models under mild hypotheses.
All the robust models we trained can simultaneously achieve state-of-the-art accuracy.
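To make "local Lipschitzness using probability metrics" concrete: a randomized classifier maps an input $x$ to a distribution $m(x)$ over outputs, and the robustness notion (in spirit; the paper's exact metric and constants may differ) requires

```latex
D\big( m(x),\, m(x') \big) \;\le\; L \, \| x - x' \|
\qquad \text{whenever } \| x - x' \| \le \rho,
```

where $D$ is a probability metric (e.g., total variation) and $(L, \rho)$ quantify local Lipschitzness; small input perturbations then provably move the output distribution only slightly.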
arXiv Detail & Related papers (2021-02-22T10:16:58Z)
- Robustness, Privacy, and Generalization of Adversarial Training [84.38148845727446]
This paper establishes and quantifies the privacy-robustness trade-off and generalization-robustness trade-off in adversarial training.
We show that adversarial training is $(\varepsilon, \delta)$-differentially private, where the magnitude of the differential privacy has a positive correlation with the robustified intensity.
Our generalization bounds do not explicitly rely on the parameter size which would be large in deep learning.
arXiv Detail & Related papers (2020-12-25T13:35:02Z)
- Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations [78.23408201652984]
Top-k predictions are used in many real-world applications such as machine learning as a service, recommender systems, and web searches.
Our work is based on randomized smoothing, which builds a provably robust classifier via randomizing an input.
For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.
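For $\ell_0$ perturbations, the randomization typically decides which pixels are kept: if an attacker changes at most a few pixels, most random ablations never see them. Below is a minimal Monte Carlo sketch of this voting idea; the scheme and names are illustrative, not the paper's exact certificate.

```python
import numpy as np

def topk_ablation_votes(x, classify, keep=45, n_samples=1000, k=3, rng=None):
    """Estimate which classes survive in the top-k under random ablation.

    Each round keeps only `keep` random pixels (others zeroed out) and
    records the base classifier's prediction. A label robust to small l0
    perturbations should retain a large vote share, since a perturbed
    pixel is rarely among the kept ones.
    """
    rng = rng or np.random.default_rng(0)
    votes = np.zeros(10, dtype=int)
    for _ in range(n_samples):
        mask = np.zeros(x.size, dtype=bool)
        mask[rng.choice(x.size, size=keep, replace=False)] = True
        votes[classify(np.where(mask, x.ravel(), 0.0).reshape(x.shape))] += 1
    return np.argsort(votes)[-k:]           # the k most-voted classes

# Toy base classifier: predicts from the mean intensity of the ablated image.
classify = lambda img: int(img.mean() * 10) % 10
x = np.random.default_rng(1).random((28, 28))
print(topk_ablation_votes(x, classify))
```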
arXiv Detail & Related papers (2020-11-15T21:34:44Z)
- Consistency Regularization for Certified Robustness of Smoothed Classifiers [89.72878906950208]
A recent technique of randomized smoothing has shown that the worst-case $\ell_2$-robustness can be transformed into the average-case robustness.
We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise.
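A compact way to implement "regularizing the prediction consistency over noise" is to penalize the divergence of each noisy prediction from the mean prediction across noise draws; the sketch below follows that recipe (a simplified rendering, with hyperparameters chosen arbitrarily).

```python
import torch
import torch.nn.functional as F

def consistency_regularized_loss(model, x, y, sigma=0.25, m=2, lam=5.0):
    """Cross-entropy on noisy inputs plus a consistency penalty.

    Draw m Gaussian-noised copies of each input; the penalty is the mean
    KL divergence from each noisy prediction to the average prediction,
    which pushes the smoothed classifier toward consistent (hence more
    certifiable) outputs.
    """
    noisy = [x + sigma * torch.randn_like(x) for _ in range(m)]
    logits = [model(z) for z in noisy]
    ce = sum(F.cross_entropy(l, y) for l in logits) / m
    probs = [l.softmax(dim=-1) for l in logits]
    mean_p = torch.stack(probs).mean(dim=0)
    consistency = sum(F.kl_div(mean_p.log(), p, reduction="batchmean")
                      for p in probs) / m
    return ce + lam * consistency

model = torch.nn.Linear(784, 10)
x, y = torch.randn(8, 784), torch.randint(0, 10, (8,))
print(consistency_regularized_loss(model, x, y))
```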
arXiv Detail & Related papers (2020-06-07T06:57:43Z)
- Towards Deep Learning Models Resistant to Large Perturbations [0.0]
Adversarial robustness has proven to be a required property of machine learning algorithms.
We show that the well-established algorithm called "adversarial training" fails to train a deep neural network given a large, but reasonable, perturbation magnitude.
arXiv Detail & Related papers (2020-03-30T12:03:09Z)
- Toward Adversarial Robustness via Semi-supervised Robust Training [93.36310070269643]
Adversarial examples have been shown to be a severe threat to deep neural networks (DNNs).
We propose a novel defense method, robust training (RT), which jointly minimizes two separate risks ($R_{\mathrm{stand}}$ and $R_{\mathrm{rob}}$).
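One plausible reading of this two-risk objective (my sketch, not the paper's exact losses): a standard classification risk on clean labeled data plus a robustness risk that penalizes prediction changes under adversarial perturbation, the latter needing no labels and hence extending to unlabeled data.

```python
import torch
import torch.nn.functional as F

def robust_training_loss(model, x, y, eps=0.1, lam=1.0):
    """R_stand + lam * R_rob (illustrative decomposition).

    R_stand: cross-entropy on clean inputs.
    R_rob:   KL between clean and adversarial predictions, where the
             adversarial example is built by a single FGSM step on the KL
             objective; this term uses no labels, so it also applies to
             unlabeled data in a semi-supervised setting.
    """
    logits = model(x)
    r_stand = F.cross_entropy(logits, y)

    x_adv = x.clone().requires_grad_(True)
    adv_obj = F.kl_div(model(x_adv).log_softmax(-1),
                       logits.softmax(-1).detach(), reduction="batchmean")
    grad = torch.autograd.grad(adv_obj, x_adv)[0]
    x_adv = (x + eps * grad.sign()).detach()   # FGSM step on the KL objective

    r_rob = F.kl_div(model(x_adv).log_softmax(-1),
                     logits.softmax(-1).detach(), reduction="batchmean")
    return r_stand + lam * r_rob

model = torch.nn.Linear(784, 10)
x, y = torch.randn(8, 784), torch.randint(0, 10, (8,))
print(robust_training_loss(model, x, y))
```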
arXiv Detail & Related papers (2020-03-16T02:14:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.