Improving Robust Fairness via Balance Adversarial Training
- URL: http://arxiv.org/abs/2209.07534v1
- Date: Thu, 15 Sep 2022 14:44:48 GMT
- Title: Improving Robust Fairness via Balance Adversarial Training
- Authors: Chunyu Sun, Chenye Xu, Chengyuan Yao, Siyuan Liang, Yichao Wu, Ding
Liang, Xianglong Liu, Aishan Liu
- Abstract summary: Adversarial training (AT) methods are effective against adversarial attacks, yet they introduce a severe disparity in accuracy and robustness between different classes, known as the robust fairness problem.
We propose Balance Adversarial Training (BAT) to address the robust fairness problem.
- Score: 51.67643171193376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training (AT) methods are effective against adversarial attacks,
yet they introduce a severe disparity in accuracy and robustness between
different classes, known as the robust fairness problem. Previously proposed
Fair Robust Learning (FRL) adaptively reweights different classes to improve
fairness. However, the performance of the better-performing classes decreases,
leading to a strong drop in overall performance. In this paper, we observe two unfair
phenomena during adversarial training: different difficulties in generating
adversarial examples from each class (source-class fairness) and disparate
target class tendencies when generating adversarial examples (target-class
fairness). From the observations, we propose Balance Adversarial Training (BAT)
to address the robust fairness problem. Regarding source-class fairness, we
adjust the attack strength and difficulty for each class to generate samples
near the decision boundary for easier and fairer model learning; considering
target-class fairness, by introducing a uniform distribution constraint, we
encourage the adversarial example generation process to target each class with a fair
tendency. Extensive experiments conducted on multiple datasets (CIFAR-10,
CIFAR-100, and ImageNette) demonstrate that our method can significantly
outperform other baselines in mitigating the robust fairness problem (+5-10%
on the worst-class accuracy).
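The two components can be pictured in a short PyTorch sketch. This is a minimal illustration of our reading of the abstract, not the paper's released code: the per-class budget tensor `class_eps`, the uniformity weight `lam`, and the batch-level KL penalty are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def bat_style_loss(model, x, y, class_eps, num_classes=10,
                   n_steps=10, step_size=2 / 255, lam=1.0):
    """Illustrative sketch of the two BAT ideas (not the paper's code):
    (1) source-class fairness: each class gets its own attack budget;
    (2) target-class fairness: a KL penalty pushes the batch-level
        distribution over attacked (non-true) classes towards uniform.
    class_eps is a fixed tensor of shape (num_classes,)."""
    eps = class_eps[y].view(-1, 1, 1, 1)        # per-class attack strength
    delta = (torch.rand_like(x) * 2 - 1) * eps  # random start inside the budget
    for _ in range(n_steps):                    # PGD with a class-wise radius
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = delta.detach() + step_size * grad.sign()
        delta = torch.clamp(x + torch.max(torch.min(delta, eps), -eps), 0, 1) - x
    logits_adv = model(x + delta)
    ce = F.cross_entropy(logits_adv, y)
    # Batch-level tendency over non-true classes, renormalized.
    probs = F.softmax(logits_adv, dim=1).scatter(1, y.view(-1, 1), 0.0)
    probs = probs / probs.sum(dim=1, keepdim=True)
    tendency = probs.mean(dim=0)
    uniform = torch.full_like(tendency, 1.0 / (num_classes - 1))
    kl = (tendency * (tendency / uniform).log()).sum()  # KL(tendency || uniform)
    return ce + lam * kl
```

In a training loop, `class_eps` might start at a shared budget (e.g. 8/255 for every class) and then be lowered for classes whose adversarial examples are already hard to learn from; the abstract only states that attack strength is adjusted per class, so any concrete schedule is an assumption.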
Related papers
- Towards Fairness-Aware Adversarial Learning [13.932705960012846]
We propose a novel learning paradigm, named Fairness-Aware Adversarial Learning (FAAL).
Our method aims to find the worst-case distribution among different categories, and the solution is guaranteed to achieve the upper-bound performance with high probability.
In particular, FAAL can fine-tune an unfair robust model to be fair within only two epochs, without compromising the overall clean and robust accuracies.
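The "worst distribution among different categories" idea can be illustrated with a standard distributionally robust reweighting of per-class losses. The sketch below is a generic recipe under our own assumptions (the KL-ball temperature `eta` and the closed-form softmax weights are not taken from the paper):

```python
import torch
import torch.nn.functional as F

def worst_case_class_loss(logits, y, num_classes, eta=1.0):
    """Generic DRO-style illustration: with a KL penalty of strength eta
    around the uniform class distribution, the inner maximization over
    class weights has the closed-form solution w ~ exp(class_loss / eta)."""
    per_sample = F.cross_entropy(logits, y, reduction="none")
    class_loss = torch.zeros(num_classes, device=logits.device)
    counts = torch.zeros(num_classes, device=logits.device)
    class_loss.scatter_add_(0, y, per_sample)
    counts.scatter_add_(0, y, torch.ones_like(per_sample))
    class_loss = class_loss / counts.clamp(min=1)      # mean loss per class
    w = F.softmax(class_loss.detach() / eta, dim=0)    # worst-case weights
    return (w * class_loss).sum()                      # reweighted objective
```

Classes absent from the batch simply receive zero loss here; a real implementation would track per-class statistics across batches.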
arXiv Detail & Related papers (2024-02-27T18:01:59Z) - DAFA: Distance-Aware Fair Adversarial Training [34.94780532071229]
Under adversarial attacks, the majority of the model's predictions for samples from the worst class are biased towards classes similar to the worst class.
We introduce the Distance-Aware Fair Adversarial training (DAFA) methodology, which addresses robust fairness by taking into account the similarities between classes.
arXiv Detail & Related papers (2024-01-23T07:15:47Z) - Improving Adversarial Robust Fairness via Anti-Bias Soft Label Distillation [14.163463596459064]
Adversarial Training (AT) has been widely shown to be an effective method for improving the adversarial robustness of Deep Neural Networks (DNNs) against adversarial examples.
As a variant of AT, Adversarial Robustness Distillation (ARD) has demonstrated its superior performance in improving the robustness of small student models.
We propose an Anti-Bias Soft Label Distillation (ABSLD) method to mitigate the adversarial robust fairness problem within the framework of Knowledge Distillation (KD).
arXiv Detail & Related papers (2023-12-09T09:08:03Z) - CFA: Class-wise Calibrated Fair Adversarial Training [31.812287233814295]
We propose a Class-wise calibrated Fair Adversarial training framework, named CFA, which customizes specific training configurations for each class automatically.
Our proposed CFA can improve both overall robustness and fairness notably over other state-of-the-art methods.
arXiv Detail & Related papers (2023-03-25T13:05:16Z) - Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce a new concept, the adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
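The intra-/inter-class objective can be pictured with a generic feature-separability penalty on penultimate-layer embeddings. This is an illustrative stand-in only; it does not reproduce the ATG construction:

```python
import torch
import torch.nn.functional as F

def separability_loss(feats, y):
    """Generic stand-in for the ATFS objective: raise cosine similarity
    between features of the same class, lower it across classes."""
    f = F.normalize(feats, dim=1)                 # unit-norm features
    sim = f @ f.t()                               # pairwise cosine similarity
    same = (y.view(-1, 1) == y.view(1, -1)).float()
    eye = torch.eye(len(y), device=y.device)
    intra_mask = same - eye                       # same class, no self-pairs
    inter_mask = 1.0 - same                       # different classes
    intra = (sim * intra_mask).sum() / intra_mask.sum().clamp(min=1)
    inter = (sim * inter_mask).sum() / inter_mask.sum().clamp(min=1)
    return inter - intra   # minimizing pulls classes together, pushes them apart
```

The penalty would typically be added to the usual adversarial training loss with a small weighting coefficient.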
arXiv Detail & Related papers (2022-05-02T04:04:23Z) - Towards Equal Opportunity Fairness through Adversarial Learning [64.45845091719002]
Adversarial training is a common approach for bias mitigation in natural language processing.
We propose an augmented discriminator for adversarial training, which takes the target class as input to create richer features.
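A class-conditioned discriminator of this kind could look like the following minimal sketch; the layer sizes, embedding dimension, and protected-attribute head are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class AugmentedDiscriminator(nn.Module):
    """Discriminator that sees both the encoder's hidden representation
    and an embedding of the target class, so the adversarial debiasing
    signal can differ per class (illustrative sizes)."""
    def __init__(self, hidden_dim, num_classes, num_protected, class_dim=32):
        super().__init__()
        self.class_emb = nn.Embedding(num_classes, class_dim)
        self.net = nn.Sequential(
            nn.Linear(hidden_dim + class_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_protected),  # predicts the protected attribute
        )

    def forward(self, hidden, target_class):
        z = torch.cat([hidden, self.class_emb(target_class)], dim=1)
        return self.net(z)
```

As in standard adversarial debiasing, the encoder would be trained to maximize this discriminator's loss while the discriminator is trained to minimize it.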
arXiv Detail & Related papers (2022-03-12T02:22:58Z) - Analysis and Applications of Class-wise Robustness in Adversarial Training [92.08430396614273]
Adversarial training is one of the most effective approaches to improve model robustness against adversarial examples.
Previous works mainly focus on the overall robustness of the model, while an in-depth analysis of the role each class plays in adversarial training is still missing.
We provide a detailed diagnosis of adversarial training on six benchmark datasets, i.e., MNIST, CIFAR-10, CIFAR-100, SVHN, STL-10 and ImageNet.
We observe that stronger attack methods in adversarial learning achieve their performance improvement mainly through more successful attacks on the vulnerable classes.
arXiv Detail & Related papers (2021-05-29T07:28:35Z) - Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z) - To be Robust or to be Fair: Towards Fairness in Adversarial Training [83.42241071662897]
We find that adversarial training algorithms tend to introduce a severe disparity in accuracy and robustness between different groups of data.
We propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem when performing adversarial defense; a sketch of the reweighting idea follows this entry.
arXiv Detail & Related papers (2020-10-13T02:21:54Z)
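FRL-style adaptive reweighting can be illustrated with a multiplicative-weights update on per-class loss weights. The tolerance `tau` and step size `lr` below are illustrative choices, and the sketch covers only the reweighting side of the framework:

```python
import torch

def frl_style_reweight(class_weights, class_robust_acc, tau=0.05, lr=0.05):
    """Raise the loss weight of any class whose robust accuracy falls
    more than tau below the class average, then renormalize so the
    mean weight stays at 1 (tau and lr are illustrative)."""
    violation = (class_robust_acc.mean() - class_robust_acc) - tau
    w = class_weights * torch.exp(lr * violation.clamp(min=0))
    return w * (len(w) / w.sum())
```

`class_robust_acc` would be measured on a held-out set after each epoch, and the returned weights would multiply the per-class terms of the training loss.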