On the Robustness of Bayesian Neural Networks to Adversarial Attacks
- URL: http://arxiv.org/abs/2207.06154v3
- Date: Wed, 28 Feb 2024 08:20:03 GMT
- Title: On the Robustness of Bayesian Neural Networks to Adversarial Attacks
- Authors: Luca Bortolussi, Ginevra Carbone, Luca Laurenti, Andrea Patane, Guido Sanguinetti, Matthew Wicker
- Abstract summary: Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications.
We show that vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution.
We prove that the expected gradient of the loss with respect to the BNN posterior distribution is vanishing, even when each neural network sampled from the posterior is vulnerable to gradient-based attacks.
- Score: 11.277163381331137
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vulnerability to adversarial attacks is one of the principal hurdles to the
adoption of deep learning in safety-critical applications. Despite significant
efforts, both practical and theoretical, training deep learning models robust
to adversarial attacks is still an open problem. In this paper, we analyse the
geometry of adversarial attacks in the large-data, overparameterized limit for
Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to
gradient-based attacks arises as a result of degeneracy in the data
distribution, i.e., when the data lies on a lower-dimensional submanifold of
the ambient space. As a direct consequence, we demonstrate that in this limit
BNN posteriors are robust to gradient-based adversarial attacks. Crucially, we
prove that the expected gradient of the loss with respect to the BNN posterior
distribution is vanishing, even when each neural network sampled from the
posterior is vulnerable to gradient-based attacks. Experimental results on the
MNIST, Fashion MNIST, and half moons datasets, representing the finite data
regime, with BNNs trained with Hamiltonian Monte Carlo and Variational
Inference, support this line of argument, showing that BNNs can display both
high accuracy on clean data and robustness to both gradient-based and
gradient-free adversarial attacks.
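The central claim of the abstract, that the input gradient averaged over the posterior can be small even when each sampled network has a large gradient, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's experimental setup: the model is a logistic classifier rather than a deep network, the "posterior samples" are drawn from a zero-mean Gaussian rather than obtained with Hamiltonian Monte Carlo or Variational Inference, and the function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def input_gradient(w, b, x, y):
    """Gradient of the binary cross-entropy loss w.r.t. the input x
    for a logistic model p = sigmoid(w . x + b)."""
    p = sigmoid(w @ x + b)
    return (p - y) * w


def expected_input_gradient(weight_samples, bias_samples, x, y):
    """Monte Carlo estimate of the posterior-averaged input gradient.
    Returns the averaged gradient and the per-sample gradients."""
    grads = np.stack([
        input_gradient(w, b, x, y)
        for w, b in zip(weight_samples, bias_samples)
    ])
    return grads.mean(axis=0), grads


def fgsm(x, grad, eps):
    """One FGSM step: move each input coordinate by eps along the gradient sign."""
    return x + eps * np.sign(grad)


rng = np.random.default_rng(0)
dim, n_samples = 20, 500

# ASSUMPTION: a zero-mean Gaussian stands in for posterior samples.
# A real BNN posterior would come from HMC or VI on training data.
weight_samples = rng.normal(size=(n_samples, dim))
bias_samples = rng.normal(size=n_samples)

x = rng.normal(size=dim)  # hypothetical test input
y = 1.0                   # hypothetical label

mean_grad, grads = expected_input_gradient(weight_samples, bias_samples, x, y)

# Norm of the averaged gradient vs. the average norm of individual gradients.
ensemble_norm = np.linalg.norm(mean_grad)
avg_individual_norm = np.linalg.norm(grads, axis=1).mean()

# FGSM attack driven by the posterior-averaged gradient.
x_adv = fgsm(x, mean_grad, eps=0.1)

print("norm of averaged gradient:   ", ensemble_norm)
print("mean norm of single gradients:", avg_individual_norm)
```

By the triangle inequality the norm of the averaged gradient can never exceed the mean of the individual norms; how much smaller it is depends on how strongly the per-sample gradients cancel, which is the effect the paper analyses in the large-data, overparameterized limit.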
Related papers
- Provable Robustness of (Graph) Neural Networks Against Data Poisoning and Backdoor Attacks [50.87615167799367]
We certify Graph Neural Networks (GNNs) against poisoning attacks, including backdoors, targeting the node features of a given graph.
Our framework provides fundamental insights into the role of graph structure and its connectivity on the worst-case behavior of convolution-based and PageRank-based GNNs.
arXiv Detail & Related papers (2024-07-15T16:12:51Z) - DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment [57.62885438406724]
Graph neural networks are recognized for their strong performance across various applications.
Backpropagation (BP) has limitations that challenge its biological plausibility and affect the efficiency, scalability and parallelism of training neural networks for graph-based tasks.
We propose DFA-GNN, a novel forward learning framework tailored for GNNs with a case study of semi-supervised learning.
arXiv Detail & Related papers (2024-06-04T07:24:51Z) - Not So Robust After All: Evaluating the Robustness of Deep Neural Networks to Unseen Adversarial Attacks [5.024667090792856]
Deep neural networks (DNNs) have gained prominence in various applications, such as classification, recognition, and prediction.
A fundamental attribute of traditional DNNs is their vulnerability to modifications in input data, which has resulted in the investigation of adversarial attacks.
This study aims to challenge the efficacy and generalization of contemporary defense mechanisms against adversarial attacks.
arXiv Detail & Related papers (2023-08-12T05:21:34Z) - What Does the Gradient Tell When Attacking the Graph Structure [44.44204591087092]
We present a theoretical demonstration revealing that attackers tend to increase inter-class edges due to the message passing mechanism of GNNs.
By connecting dissimilar nodes, attackers can more effectively corrupt node features, making such attacks more advantageous.
We propose an innovative attack loss that balances attack effectiveness and imperceptibility, sacrificing some attack effectiveness to attain greater imperceptibility.
arXiv Detail & Related papers (2022-08-26T15:45:20Z) - Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training, which injects adversarial examples into model training, has proved to be the most effective defense strategy.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z) - Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z) - Mitigating the Impact of Adversarial Attacks in Very Deep Networks [10.555822166916705]
Deep Neural Network (DNN) models have vulnerabilities related to security concerns.
Data poisoning-enabled perturbation attacks are complex adversarial ones that inject false data into models.
We propose an attack-agnostic defense method for mitigating their influence.
arXiv Detail & Related papers (2020-12-08T21:25:44Z) - Graph Backdoor [53.70971502299977]
We present GTA, the first backdoor attack on graph neural networks (GNNs).
GTA departs in significant ways: it defines triggers as specific subgraphs, including both topological structures and descriptive features.
It can be instantiated for both transductive (e.g., node classification) and inductive (e.g., graph classification) tasks.
arXiv Detail & Related papers (2020-06-21T19:45:30Z) - Towards More Practical Adversarial Attacks on Graph Neural Networks [14.78539966828287]
We study the black-box attacks on graph neural networks (GNNs) under a novel and realistic constraint.
We show that the structural inductive biases of GNN models can be an effective source for this type of attack.
arXiv Detail & Related papers (2020-06-09T05:27:39Z) - Adversarial Attacks and Defenses on Graphs: A Review, A Tool and Empirical Studies [73.39668293190019]
Deep neural networks can be easily fooled by small perturbations on the input, known as adversarial attacks.
Graph Neural Networks (GNNs) have been demonstrated to inherit this vulnerability.
In this survey, we categorize existing attacks and defenses, and review the corresponding state-of-the-art methods.
arXiv Detail & Related papers (2020-03-02T04:32:38Z) - Robustness of Bayesian Neural Networks to Gradient-Based Attacks [9.966113038850946]
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications.
We show that vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution.
We demonstrate that in the limit BNN posteriors are robust to gradient-based adversarial attacks.
arXiv Detail & Related papers (2020-02-11T13:03:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.