Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks
- URL: http://arxiv.org/abs/2404.19640v1
- Date: Sat, 27 Apr 2024 01:34:46 GMT
- Title: Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks
- Authors: Yunzhen Feng, Tim G. J. Rudner, Nikolaos Tsilivis, Julia Kempe
- Abstract summary: We investigate whether it is possible to successfully break state-of-the-art BNN inference methods and prediction pipelines.
We find that BNNs trained with state-of-the-art approximate inference methods, and even BNNs trained with Hamiltonian Monte Carlo, are highly susceptible to adversarial attacks.
- Score: 10.317475068017961
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial examples have been shown to cause neural networks to fail on a wide range of vision and language tasks, but recent work has claimed that Bayesian neural networks (BNNs) are inherently robust to adversarial perturbations. In this work, we examine this claim. To study the adversarial robustness of BNNs, we investigate whether it is possible to successfully break state-of-the-art BNN inference methods and prediction pipelines using even relatively unsophisticated attacks for three tasks: (1) label prediction under the posterior predictive mean, (2) adversarial example detection with Bayesian predictive uncertainty, and (3) semantic shift detection. We find that BNNs trained with state-of-the-art approximate inference methods, and even BNNs trained with Hamiltonian Monte Carlo, are highly susceptible to adversarial attacks. We also identify various conceptual and experimental errors in previous works that claimed inherent adversarial robustness of BNNs and conclusively demonstrate that BNNs and uncertainty-aware Bayesian prediction pipelines are not inherently robust against adversarial attacks.
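The first task in the abstract, label prediction under the posterior predictive mean, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the "BNN" is a set of S posterior samples of a linear softmax classifier (hypothetical arrays `Ws`, `bs`), and the attack is a single FGSM step, used here as a stand-in for the "relatively unsophisticated attacks" the abstract mentions without naming. The predictive mean is the average of the per-sample softmax outputs, and the attack follows the gradient of the negative log of that averaged probability, not the gradient of any single sample.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predictive_mean(x, Ws, bs):
    # Ws: (S, K, D) weight samples, bs: (S, K) bias samples, x: (D,) input.
    # Returns the (K,) average of the S per-sample class distributions.
    return softmax(Ws @ x + bs).mean(axis=0)

def nll_grad_x(x, y, Ws, bs):
    # Gradient w.r.t. x of -log p_bar(y | x), where p_bar is the predictive mean.
    probs = softmax(Ws @ x + bs)                           # (S, K)
    # For each sample s: d p_s[y] / dx = p_s[y] * (W_s[y] - sum_k p_s[k] W_s[k])
    expected_w = np.einsum("sk,skd->sd", probs, Ws)        # (S, D)
    dp_y = probs[:, y, None] * (Ws[:, y, :] - expected_w)  # (S, D)
    p_bar_y = probs[:, y].mean()
    return -dp_y.mean(axis=0) / p_bar_y

def fgsm_on_predictive_mean(x, y, Ws, bs, eps):
    # One signed-gradient step inside an L-infinity ball of radius eps.
    return x + eps * np.sign(nll_grad_x(x, y, Ws, bs))
```

For a real BNN the analytic gradient would be replaced by automatic differentiation through the sampled networks; the point of the sketch is only that the attacker differentiates the averaged prediction.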
Related papers
- ARBiBench: Benchmarking Adversarial Robustness of Binarized Neural Networks [22.497327185841232]
Network binarization exhibits great potential for deployment on resource-constrained devices due to its low computational cost.
Despite the critical importance, the security of binarized neural networks (BNNs) is rarely investigated.
We present ARBiBench, a comprehensive benchmark to evaluate the robustness of BNNs against adversarial perturbations.
arXiv Detail & Related papers (2023-12-21T04:48:34Z)
- On the Robustness of Bayesian Neural Networks to Adversarial Attacks [11.277163381331137]
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications.
We show that vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution.
We prove that the expected gradient of the loss with respect to the BNN posterior distribution is vanishing, even when each neural network sampled from the posterior is vulnerable to gradient-based attacks.
arXiv Detail & Related papers (2022-07-13T12:27:38Z)
- Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training, which injects adversarial examples into model training, has proven to be the most effective strategy.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
- Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer [77.78479877473899]
We design a spatial-temporal-fusion BNN for efficiently scaling BNNs to large models.
Compared to vanilla BNNs, our approach can greatly reduce the training time and the number of parameters, which helps scale BNNs efficiently.
arXiv Detail & Related papers (2021-12-12T17:13:14Z)
- Robustness of Bayesian Neural Networks to White-Box Adversarial Attacks [55.531896312724555]
Bayesian Neural Networks (BNNs) are robust and adept at handling adversarial attacks by incorporating randomness.
We create our BNN model, called BNN-DenseNet, by fusing Bayesian inference (i.e., variational Bayes) to the DenseNet architecture.
An adversarially-trained BNN outperforms its non-Bayesian, adversarially-trained counterpart in most experiments.
arXiv Detail & Related papers (2021-11-16T16:14:44Z)
- Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks [98.21130211336964]
Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks.
In this paper, we investigate the impact of network width and depth on the robustness of adversarially trained DNNs.
arXiv Detail & Related papers (2021-10-07T23:13:33Z)
- Resilience of Bayesian Layer-Wise Explanations under Adversarial Attacks [3.222802562733787]
We show that for deterministic Neural Networks, saliency interpretations are remarkably brittle even when the attacks fail.
We suggest and demonstrate empirically that saliency explanations provided by Bayesian Neural Networks are considerably more stable under adversarial perturbations.
arXiv Detail & Related papers (2021-02-22T14:07:24Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued networks on the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- Proper Network Interpretability Helps Adversarial Robustness in Classification [91.39031895064223]
We show that with a proper measurement of interpretation, it is difficult to prevent prediction-evasion adversarial attacks from causing interpretation discrepancy.
We develop an interpretability-aware defensive scheme built only on promoting robust interpretation.
We show that our defense achieves both robust classification and robust interpretation, outperforming state-of-the-art adversarial training methods against attacks of large perturbation.
arXiv Detail & Related papers (2020-06-26T01:31:31Z)
- Robustness of Bayesian Neural Networks to Gradient-Based Attacks [9.966113038850946]
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications.
We show that vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution.
We demonstrate that in the limit BNN posteriors are robust to gradient-based adversarial attacks.
arXiv Detail & Related papers (2020-02-11T13:03:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.