Towards Understanding Dual BN In Hybrid Adversarial Training
- URL: http://arxiv.org/abs/2403.19150v1
- Date: Thu, 28 Mar 2024 05:08:25 GMT
- Title: Towards Understanding Dual BN In Hybrid Adversarial Training
- Authors: Chenshuang Zhang, Chaoning Zhang, Kang Zhang, Axi Niu, Junmo Kim, In So Kweon
- Abstract summary: We show that disentangling statistics plays a lesser role than disentangling affine parameters in model training.
We propose a two-task hypothesis which serves as the empirical foundation and a unified framework for Hybrid-AT improvement.
- Score: 79.92394747290905
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a growing concern about applying batch normalization (BN) in adversarial training (AT), especially when the model is trained on both adversarial samples and clean samples (termed Hybrid-AT). With the assumption that adversarial and clean samples are from two different domains, a common practice in prior works is to adopt Dual BN, where BN_adv and BN_clean are used for adversarial and clean branches, respectively. A popular belief for motivating Dual BN is that estimating normalization statistics of this mixture distribution is challenging and thus disentangling it for normalization achieves stronger robustness. In contrast to this belief, we reveal that disentangling statistics plays a lesser role than disentangling affine parameters in model training. This finding aligns with prior work (Rebuffi et al., 2023), and we build upon their research for further investigations. We demonstrate that the domain gap between adversarial and clean samples is not very large, which is counter-intuitive considering the significant influence of adversarial perturbation on the model accuracy. We further propose a two-task hypothesis which serves as the empirical foundation and a unified framework for Hybrid-AT improvement. We also investigate Dual BN at test time and reveal that affine parameters characterize the robustness during inference. Overall, our work sheds new light on understanding the mechanism of Dual BN in Hybrid-AT and its underlying justification.
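The Dual BN construction is simple to state in code. Below is a minimal PyTorch-style sketch, assuming a module that holds two BN layers and a routing flag (the names DualBatchNorm2d and route are illustrative, not from the paper); keeping two full BN layers disentangles both the running statistics and the affine parameters between the branches.

```python
import torch
import torch.nn as nn

class DualBatchNorm2d(nn.Module):
    """Minimal sketch of Dual BN for Hybrid-AT (illustrative, not the
    authors' reference implementation). Two full BN layers disentangle
    both running statistics and affine parameters per branch."""

    def __init__(self, num_features: int):
        super().__init__()
        self.bn_adv = nn.BatchNorm2d(num_features)    # BN_adv: adversarial branch
        self.bn_clean = nn.BatchNorm2d(num_features)  # BN_clean: clean branch
        self.route = "clean"  # set externally before each forward pass

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Route the batch through the BN that matches its domain.
        return self.bn_adv(x) if self.route == "adv" else self.bn_clean(x)
```

In Hybrid-AT one would set route = "adv" before the adversarial forward pass and route = "clean" before the clean one. The paper's finding that affine parameters matter more than statistics suggests the informative ablation is a variant that shares normalization statistics across branches while keeping per-branch affine parameters.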
Related papers
- Towards Distribution-Agnostic Generalized Category Discovery [51.52673017664908]
Data imbalance and open-ended distribution are intrinsic characteristics of the real visual world.
We propose a Self-Balanced Co-Advice contrastive framework (BaCon).
BaCon consists of a contrastive-learning branch and a pseudo-labeling branch, working collaboratively to provide interactive supervision to resolve the DA-GCD task.
arXiv Detail & Related papers (2023-10-02T17:39:58Z)
- Explicit Tradeoffs between Adversarial and Natural Distributional Robustness [48.44639585732391]
In practice, models need to enjoy both types of robustness to ensure reliability.
In this work, we show that in fact, explicit tradeoffs exist between adversarial and natural distributional robustness.
arXiv Detail & Related papers (2022-09-15T19:58:01Z)
- Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction [11.878012909876713]
We propose a joint classification and rationale extraction model named AT-BMC.
It includes two key mechanisms: mixed Adversarial Training (AT) is designed to use various perturbations in discrete and embedding space to improve the model's robustness, and Boundary Match Constraint (BMC) helps to locate rationales more precisely with the guidance of boundary information.
Performances on benchmark datasets demonstrate that the proposed AT-BMC outperforms baselines on both classification and rationale extraction by a large margin.
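As a rough illustration of the embedding-space half of the mixed AT mechanism, the sketch below applies a one-step FGSM-style perturbation to input embeddings. Everything here (the function name, the inputs_embeds interface, the single-step attack) is an assumption for illustration, not the AT-BMC implementation; the discrete-space perturbations and the BMC loss are omitted.

```python
import torch

def embedding_fgsm(model, embeds, labels, loss_fn, eps=1e-2):
    """One-step adversarial perturbation in embedding space (FGSM-style).
    Assumes `model` accepts input embeddings directly via `inputs_embeds`."""
    embeds = embeds.detach().requires_grad_(True)
    loss = loss_fn(model(inputs_embeds=embeds), labels)
    grad, = torch.autograd.grad(loss, embeds)
    # Step in the direction that increases the loss, then detach.
    return (embeds + eps * grad.sign()).detach()
```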
arXiv Detail & Related papers (2021-12-20T09:48:32Z)
- A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
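For context on the binary case that this framework strictly generalizes: the standard classifier-based binary DRE (a textbook construction, not the paper's multi-distribution method) recovers p(x)/q(x) from a probabilistic classifier trained to separate the two sample sets.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def binary_density_ratio(x_p, x_q):
    """Estimate r(x) = p(x)/q(x) with a probabilistic classifier.
    With equal sample sizes, Bayes' rule gives r(x) = c(x) / (1 - c(x)),
    where c(x) = P(sample drawn from p | x)."""
    X = np.vstack([x_p, x_q])
    y = np.concatenate([np.ones(len(x_p)), np.zeros(len(x_q))])
    clf = LogisticRegression().fit(X, y)

    def ratio(x):
        c = clf.predict_proba(x)[:, 1]
        return c / np.clip(1.0 - c, 1e-12, None)

    return ratio
```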
arXiv Detail & Related papers (2021-12-07T01:23:20Z)
- Bridged Adversarial Training [6.925055322530057]
We show that adversarially trained models might have significantly different characteristics in terms of margin and smoothness, even when they show similar robustness.
Inspired by the observation, we investigate the effect of different regularizers and discover the negative effect of the smoothness regularizer on maximizing the margin.
We propose a new method called bridged adversarial training that mitigates the negative effect by bridging the gap between clean and adversarial examples.
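A minimal sketch of the bridging idea, assuming a KL-consistency penalty evaluated at points interpolated between each clean example and its adversarial counterpart (the paper's exact objective may differ):

```python
import torch
import torch.nn.functional as F

def bridge_loss(model, x_clean, x_adv, num_steps=4):
    """Consistency penalty along the segment between clean examples and
    their adversarial counterparts (a sketch of the bridging idea)."""
    with torch.no_grad():
        p_clean = F.softmax(model(x_clean), dim=1)  # reference prediction
    loss = 0.0
    for k in range(1, num_steps + 1):
        lam = k / num_steps
        x_mid = (1 - lam) * x_clean + lam * x_adv   # point on the bridge
        log_p_mid = F.log_softmax(model(x_mid), dim=1)
        # Pull predictions at intermediate points toward the clean prediction.
        loss = loss + F.kl_div(log_p_mid, p_clean, reduction="batchmean")
    return loss / num_steps
```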
arXiv Detail & Related papers (2021-08-25T09:11:59Z)
- Double Forward Propagation for Memorized Batch Normalization [68.34268180871416]
Batch Normalization (BN) has been a standard component in designing deep neural networks (DNNs).
We propose a memorized batch normalization (MBN) which considers multiple recent batches to obtain more accurate and robust statistics.
Compared to related methods, the proposed MBN exhibits consistent behaviors in both training and inference.
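A rough sketch of the idea, assuming plain averaging of the moments from the k most recent batches (the paper's algorithm may additionally compensate for the fact that older batches were computed under slightly different network parameters):

```python
import collections
import torch
import torch.nn as nn

class MemorizedBatchNorm2d(nn.Module):
    """Sketch of memorized BN: normalize with statistics pooled over the
    k most recent batches rather than the current batch alone. Statistics
    are detached here for simplicity; standard BN backpropagates through
    the current batch's moments."""

    def __init__(self, num_features, memory=5, eps=1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        self.eps = eps
        self.means = collections.deque(maxlen=memory)
        self.vars = collections.deque(maxlen=memory)

    def forward(self, x):
        # Record the current batch's moments, then pool over the memory.
        self.means.append(x.mean(dim=(0, 2, 3)).detach())
        self.vars.append(x.var(dim=(0, 2, 3), unbiased=False).detach())
        mean = torch.stack(tuple(self.means)).mean(dim=0)[None, :, None, None]
        var = torch.stack(tuple(self.vars)).mean(dim=0)[None, :, None, None]
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return x_hat * self.weight[None, :, None, None] + self.bias[None, :, None, None]
```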
arXiv Detail & Related papers (2020-10-10T08:48:41Z)
- Blind Adversarial Pruning: Balance Accuracy, Efficiency and Robustness [3.039568795810294]
This paper first investigates the robustness of pruned models with different compression ratios under the gradual pruning process.
We then test the performance of mixing the clean data and adversarial examples into the gradual pruning process, called adversarial pruning.
To better balance accuracy, efficiency, and robustness (AER), we propose an approach called blind adversarial pruning (BAP), which introduces the idea of blind adversarial training into the gradual pruning process.
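A sketch of the adversarial gradual-pruning loop described above; the magnitude-pruning schedule, the attack callable, and all names are assumptions for illustration, and BAP's "blind" adaptation of the perturbation strength is not shown.

```python
import torch
import torch.nn.utils.prune as prune

def adversarial_gradual_pruning(model, loader, attack, optimizer, loss_fn,
                                target_sparsity=0.8, prune_steps=10):
    """Interleave gradual L1 magnitude pruning with training on a mix of
    clean and adversarial examples (illustrative schedule)."""
    # Fraction of *remaining* weights to prune per step, chosen so that
    # (1 - per_step) ** prune_steps == 1 - target_sparsity.
    per_step = 1 - (1 - target_sparsity) ** (1 / prune_steps)
    for _ in range(prune_steps):
        for module in model.modules():
            if isinstance(module, torch.nn.Conv2d):
                prune.l1_unstructured(module, name="weight", amount=per_step)
        for x, y in loader:
            x_adv = attack(model, x, y)        # adversarial counterparts
            x_mix = torch.cat([x, x_adv])      # mix clean and adversarial data
            y_mix = torch.cat([y, y])
            optimizer.zero_grad()
            loss_fn(model(x_mix), y_mix).backward()
            optimizer.step()
```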
arXiv Detail & Related papers (2020-04-10T02:27:48Z)
- An Investigation into the Stochasticity of Batch Whitening [95.54842420166862]
This paper investigates the more general Batch Whitening (BW) operation.
We show that while various whitening transformations equivalently improve the conditioning, they show significantly different behaviors in discriminative scenarios and in training Generative Adversarial Networks (GANs).
Our proposed BW algorithm improves residual networks by a significant margin on ImageNet.
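For reference, one member of the batch whitening family is ZCA whitening, sketched below (a generic construction, not the paper's stochasticity analysis; PCA whitening differs from it only by a rotation):

```python
import torch

def zca_batch_whitening(x, eps=1e-5):
    """ZCA-whiten a batch of features x with shape (N, D): decorrelate the
    dimensions and scale each to unit variance."""
    xc = x - x.mean(dim=0, keepdim=True)
    cov = xc.t() @ xc / x.shape[0]
    cov = cov + eps * torch.eye(x.shape[1], device=x.device, dtype=x.dtype)
    # Eigendecomposition of the symmetric covariance matrix.
    evals, evecs = torch.linalg.eigh(cov)
    w = evecs @ torch.diag(evals.rsqrt()) @ evecs.t()  # ZCA whitening matrix
    return xc @ w
```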
arXiv Detail & Related papers (2020-03-27T11:06:32Z)