Gradient-Free Adversarial Attacks for Bayesian Neural Networks
- URL: http://arxiv.org/abs/2012.12640v1
- Date: Wed, 23 Dec 2020 13:19:11 GMT
- Title: Gradient-Free Adversarial Attacks for Bayesian Neural Networks
- Authors: Matthew Yuan, Matthew Wicker, Luca Laurenti
- Abstract summary: Adversarial examples underscore the importance of understanding the robustness of machine learning models.
In this work, we employ gradient-free optimization methods in order to find adversarial examples for BNNs.
- Score: 9.797319790710713
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The existence of adversarial examples underscores the importance of
understanding the robustness of machine learning models. Bayesian neural
networks (BNNs), due to their calibrated uncertainty, have been shown to possess
favorable adversarial robustness properties. However, when approximate Bayesian
inference methods are employed, the adversarial robustness of BNNs is still not
well understood. In this work, we employ gradient-free optimization methods in
order to find adversarial examples for BNNs. In particular, we consider genetic
algorithms, surrogate models, as well as zeroth order optimization methods and
adapt them to the goal of finding adversarial examples for BNNs. In an
empirical evaluation on the MNIST and Fashion MNIST datasets, we show that for
various approximate Bayesian inference methods the usage of gradient-free
algorithms can greatly improve the rate of finding adversarial examples
compared to state-of-the-art gradient-based methods.
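To make the genetic-algorithm family named in the abstract concrete, here is a minimal sketch of a gradient-free attack on a toy Bayesian classifier. It is not the authors' implementation: the linear softmax model, the representation of the approximate posterior as a stack of sampled weight matrices, the function names `bnn_predict` and `genetic_attack`, and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np


def bnn_predict(x, weight_samples):
    """Posterior-predictive probabilities of a toy linear 'BNN'.

    weight_samples has shape (S, K, D): S posterior samples of a K-class
    linear classifier over D features (an illustrative assumption).
    """
    logits = np.einsum("skd,d->sk", weight_samples, x)  # (S, K)
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.mean(axis=0)                            # average over samples


def genetic_attack(x, label, weight_samples, eps=0.3, pop_size=40,
                   generations=50, mutation_scale=0.1, seed=0):
    """Search the L-infinity ball of radius eps for a misclassified input.

    Uses only forward passes (no gradients). Returns the adversarial input,
    or None if no misclassification is found within the budget.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # Initial population of perturbations, sampled uniformly in the ball.
    pop = rng.uniform(-eps, eps, size=(pop_size, d))
    for _ in range(generations):
        # Fitness: the lower the true-label probability, the fitter.
        true_prob = np.array(
            [bnn_predict(x + delta, weight_samples)[label] for delta in pop]
        )
        pop = pop[np.argsort(true_prob)]  # fittest first
        best = pop[0]
        if bnn_predict(x + best, weight_samples).argmax() != label:
            return x + best
        # Selection: keep the fitter half as parents.
        parents = pop[: pop_size // 2]
        n_children = pop_size - 1
        idx_a = rng.integers(0, len(parents), size=n_children)
        idx_b = rng.integers(0, len(parents), size=n_children)
        # Uniform crossover followed by Gaussian mutation.
        mask = rng.random((n_children, d)) < 0.5
        children = np.where(mask, parents[idx_a], parents[idx_b])
        children = children + rng.normal(0.0, mutation_scale * eps,
                                         size=children.shape)
        children = np.clip(children, -eps, eps)  # stay inside the ball
        # Elitism: carry the best individual over unchanged.
        pop = np.vstack([best[None, :], children])
    return None
```

Because the search only queries the posterior-predictive output, it does not depend on input gradients of the model. A real evaluation would replace the toy linear model with a trained BNN and would also clip the perturbed input to the valid pixel range.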
Related papers
- Evaluating the Robustness of Deep-Learning Algorithm-Selection Models by Evolving Adversarial Instances [0.16874375111244325]
Deep neural networks (DNNs) are increasingly being used to perform algorithm selection in neural domains.
Adversarial samples are successfully generated from up to 56% of the original instances, depending on the dataset.
We use an evolutionary algorithm (EA) to find perturbations of instances from two existing benchmarks for online bin packing that cause trained DNNs to misclassify.
arXiv Detail & Related papers (2024-06-24T12:48:44Z)
- Verifying Properties of Binary Neural Networks Using Sparse Polynomial Optimization [8.323690755070123]
This paper explores methods for verifying the properties of Binary Neural Networks (BNNs).
BNNs, like their full-precision counterparts, are also sensitive to input perturbations.
We introduce an alternative approach using Semidefinite Programming relaxations derived from sparse Polynomial Optimization.
arXiv Detail & Related papers (2024-05-27T11:03:48Z)
- Collapsed Inference for Bayesian Deep Learning [36.1725075097107]
We introduce a novel collapsed inference scheme that performs Bayesian model averaging using collapsed samples.
A collapsed sample represents uncountably many models drawn from the approximate posterior.
Our proposed use of collapsed samples achieves a balance between scalability and accuracy.
arXiv Detail & Related papers (2023-06-16T08:34:42Z)
- Improved and Interpretable Defense to Transferred Adversarial Examples by Jacobian Norm with Selective Input Gradient Regularization [31.516568778193157]
Adversarial training (AT) is often adopted to improve the robustness of deep neural networks (DNNs).
In this work, we propose an approach based on Jacobian norm and Selective Input Gradient Regularization (J-SIGR).
Experiments demonstrate that the proposed J-SIGR confers improved robustness against transferred adversarial attacks, and we also show that the predictions from the neural network are easy to interpret.
arXiv Detail & Related papers (2022-07-09T01:06:41Z)
- Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training has proven to be the most effective strategy that injects adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
By utilizing the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- Towards Trustworthy Predictions from Deep Neural Networks with Fast Adversarial Calibration [2.8935588665357077]
We propose an efficient yet general modelling approach for obtaining well-calibrated, trustworthy probabilities for samples obtained after a domain shift.
We introduce a new training strategy combining an entropy-encouraging loss term with an adversarial calibration loss term and demonstrate that this results in well-calibrated and technically trustworthy predictions.
arXiv Detail & Related papers (2020-12-20T13:39:29Z)
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.