DAFA: Distance-Aware Fair Adversarial Training
- URL: http://arxiv.org/abs/2401.12532v1
- Date: Tue, 23 Jan 2024 07:15:47 GMT
- Title: DAFA: Distance-Aware Fair Adversarial Training
- Authors: Hyungyu Lee, Saehyung Lee, Hyemi Jang, Junsung Park, Ho Bae, Sungroh Yoon
- Abstract summary: Under adversarial attacks, the majority of the model's predictions for samples from the worst class are biased towards classes similar to the worst class.
We introduce the Distance-Aware Fair Adversarial training (DAFA) methodology, which addresses robust fairness by taking into account the similarities between classes.
- Score: 34.94780532071229
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The disparity in accuracy between classes in standard training is amplified
during adversarial training, a phenomenon termed the robust fairness problem.
Existing methodologies aimed to enhance robust fairness by sacrificing the
model's performance on easier classes in order to improve its performance on
harder ones. However, we observe that under adversarial attacks, the majority
of the model's predictions for samples from the worst class are biased towards
classes similar to the worst class, rather than towards the easy classes.
Through theoretical and empirical analysis, we demonstrate that robust fairness
deteriorates as the distance between classes decreases. Motivated by these
insights, we introduce the Distance-Aware Fair Adversarial training (DAFA)
methodology, which addresses robust fairness by taking into account the
similarities between classes. Specifically, our method assigns distinct loss
weights and adversarial margins to each class and adjusts them to encourage a
trade-off in robustness among similar classes. Experimental results across
various datasets demonstrate that our method not only maintains average robust
accuracy but also significantly improves the worst robust accuracy, indicating
a marked improvement in robust fairness compared to existing methods.
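As a concrete illustration of the mechanism described in the abstract, the sketch below shows class-dependent loss weights and class-dependent adversarial margins plugged into a standard PGD adversarial-training step. It is a minimal sketch under assumed names (`dafa_style_step`, `class_weights`, `class_eps`) and simplified, fixed values; it is not the authors' exact DAFA algorithm.

```python
# Minimal sketch (assumption-laden): class-weighted adversarial training with
# per-class adversarial margins, in the spirit of the description above.
# Helper names and the example weights/margins are illustrative only.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, alpha=2/255, steps=10):
    """PGD with a per-sample L-inf budget `eps` of shape [B, 1, 1, 1]."""
    x_adv = (x + torch.empty_like(x).uniform_(-1.0, 1.0) * eps).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def dafa_style_step(model, optimizer, x, y, class_weights, class_eps):
    """One training step in which each class has its own adversarial margin
    (attack budget) and its own loss weight, so robustness can be traded
    between similar classes rather than uniformly across all classes."""
    eps = class_eps[y].view(-1, 1, 1, 1)           # per-sample budget from its class
    x_adv = pgd_attack(model, x, y, eps)
    per_sample = F.cross_entropy(model(x_adv), y, reduction="none")
    loss = (class_weights[y] * per_sample).mean()  # class-weighted robust loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example (assumed values): give a hypothetical "worst" class a larger weight
# and a larger adversarial margin than the others.
num_classes = 10
class_weights = torch.ones(num_classes)
class_weights[3] = 2.0          # hypothetical worst class
class_eps = torch.full((num_classes,), 8 / 255)
class_eps[3] = 10 / 255
```

In DAFA, the per-class weights and margins would be adjusted based on inter-class distances (for example, how strongly the worst class is confused with its nearby classes), shifting robustness from classes close to the worst class toward the worst class itself; the fixed tensors above merely stand in for that adjustment.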
Related papers
- FAIR-TAT: Improving Model Fairness Using Targeted Adversarial Training [16.10247754923311]
We introduce a novel approach called Fair Targeted Adversarial Training (FAIR-TAT)
We show that using targeted adversarial attacks for adversarial training (instead of untargeted attacks) can allow for more favorable trade-offs with respect to adversarial fairness (the targeted-vs-untargeted distinction is sketched after this list).
arXiv Detail & Related papers (2024-10-30T15:58:03Z)
- Learning Confidence Bounds for Classification with Imbalanced Data [42.690254618937196]
We propose a novel framework that leverages learning theory and concentration inequalities to overcome the shortcomings of traditional solutions.
Our method can effectively adapt to the varying degrees of imbalance across different classes, resulting in more robust and reliable classification outcomes.
arXiv Detail & Related papers (2024-07-16T16:02:27Z)
- Towards Fairness-Aware Adversarial Learning [13.932705960012846]
We propose a novel learning paradigm, named Fairness-Aware Adversarial Learning (FAAL)
Our method aims to find the worst distribution among different categories, and the solution is guaranteed to obtain the upper bound performance with high probability.
In particular, FAAL can fine-tune an unfair robust model to be fair within only two epochs, without compromising the overall clean and robust accuracies.
arXiv Detail & Related papers (2024-02-27T18:01:59Z)
- Improving Robust Fairness via Balance Adversarial Training [51.67643171193376]
Adversarial training (AT) methods are effective against adversarial attacks, yet they introduce a severe disparity in accuracy and robustness between different classes.
We propose Balance Adversarial Training (BAT) to address the robust fairness problem.
arXiv Detail & Related papers (2022-09-15T14:44:48Z)
- Towards Equal Opportunity Fairness through Adversarial Learning [64.45845091719002]
Adversarial training is a common approach for bias mitigation in natural language processing.
We propose an augmented discriminator for adversarial training, which takes the target class as input to create richer features.
arXiv Detail & Related papers (2022-03-12T02:22:58Z)
- Fairness-aware Class Imbalanced Learning [57.45784950421179]
We evaluate long-tail learning methods for tweet sentiment and occupation classification.
We extend a margin-loss based approach with methods to enforce fairness.
arXiv Detail & Related papers (2021-09-21T22:16:30Z)
- Analysis and Applications of Class-wise Robustness in Adversarial Training [92.08430396614273]
Adversarial training is one of the most effective approaches to improve model robustness against adversarial examples.
Previous works mainly focus on the overall robustness of the model, and the in-depth analysis on the role of each class involved in adversarial training is still missing.
We provide a detailed diagnosis of adversarial training on six benchmark datasets, i.e., MNIST, CIFAR-10, CIFAR-100, SVHN, STL-10 and ImageNet.
We observe that stronger attack methods in adversarial learning gain their performance improvement mainly by attacking the vulnerable classes more successfully.
arXiv Detail & Related papers (2021-05-29T07:28:35Z)
- Robustness May Be at Odds with Fairness: An Empirical Study on Class-wise Accuracy [85.20742045853738]
CNNs are widely known to be vulnerable to adversarial attacks.
We conduct an empirical study of the class-wise accuracy and robustness of adversarially trained models.
We find that there exists inter-class discrepancy for accuracy and robustness even when the training dataset has an equal number of samples for each class.
arXiv Detail & Related papers (2020-10-26T06:32:32Z)
- Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
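For the FAIR-TAT entry above, the difference between untargeted and targeted attacks during adversarial training comes down to which loss the perturbation follows. Below is a minimal, hedged sketch of a single attack step; the function name and arguments are illustrative assumptions, not FAIR-TAT's implementation.

```python
# Hedged sketch: untargeted vs. targeted single-step L-inf attack.
import torch
import torch.nn.functional as F

def fgsm_step(model, x, y, eps, target=None):
    """Untargeted (target=None): ascend the loss of the true label y.
    Targeted: descend the loss of a chosen target class instead."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), target if target is not None else y)
    grad = torch.autograd.grad(loss, x)[0]
    sign = -1.0 if target is not None else 1.0   # toward target / away from y
    return (x + sign * eps * grad.sign()).clamp(0, 1).detach()
```

In targeted adversarial training the target class can be chosen per sample (for instance, a class the model tends to confuse with the true one), which is presumably the lever such approaches use to steer the fairness trade-off described above.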
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.