A Primer on Multi-Neuron Relaxation-based Adversarial Robustness
Certification
- URL: http://arxiv.org/abs/2106.03099v1
- Date: Sun, 6 Jun 2021 11:59:27 GMT
- Title: A Primer on Multi-Neuron Relaxation-based Adversarial Robustness
Certification
- Authors: Kevin Roth
- Abstract summary: Adversarial examples pose a real danger when deep neural networks are deployed in the real world.
We develop a unified mathematical framework to describe relaxation-based robustness certification methods.
- Score: 6.71471794387473
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The existence of adversarial examples poses a real danger when deep neural
networks are deployed in the real world. The go-to strategy to quantify this
vulnerability is to evaluate the model against specific attack algorithms. This
approach is however inherently limited, as it says little about the robustness
of the model against more powerful attacks not included in the evaluation. We
develop a unified mathematical framework to describe relaxation-based
robustness certification methods, which go beyond adversary-specific robustness
evaluation and instead provide provable robustness guarantees against attacks
by any adversary. We discuss the fundamental limitations posed by single-neuron
relaxations and show how the recent "k-ReLU" multi-neuron relaxation
framework of Singh et al. (2019) obtains tighter correlation-aware activation
bounds by leveraging additional relational constraints among groups of neurons.
Specifically, we show how additional pre-activation bounds can be mapped to
corresponding post-activation bounds and how they can in turn be used to obtain
tighter robustness certificates. We also present an intuitive way to visualize
different relaxation-based certification methods. By approximating multiple
non-linearities jointly instead of separately, the k-ReLU method is able to
bypass the convex barrier imposed by single-neuron relaxations.
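To make the bound-mapping step concrete, the following is a minimal, illustrative sketch, not the paper's implementation and not the k-ReLU framework of Singh et al. (2019): it propagates interval input bounds through one affine layer, maps the resulting pre-activation bounds to post-activation bounds via the ReLU, and computes the standard single-neuron "triangle" relaxation that multi-neuron methods tighten with joint constraints over groups of neurons. The helper names (affine_bounds, relu_bounds, triangle_relaxation) and the toy weights are assumptions introduced purely for illustration.

```python
# Minimal illustrative sketch (not the paper's implementation): interval
# bound propagation through an affine layer followed by a ReLU, plus the
# standard single-neuron "triangle" relaxation. Multi-neuron (k-ReLU-style)
# methods tighten these bounds by adding linear constraints that couple
# groups of unstable neurons instead of relaxing each one separately.
import numpy as np

def affine_bounds(W, b, l, u):
    """Map input box bounds [l, u] to pre-activation bounds of W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    lower = W_pos @ l + W_neg @ u + b  # minimize each output over the box
    upper = W_pos @ u + W_neg @ l + b  # maximize each output over the box
    return lower, upper

def relu_bounds(l, u):
    """Map pre-activation bounds to post-activation bounds of y = max(x, 0)."""
    return np.maximum(l, 0.0), np.maximum(u, 0.0)

def triangle_relaxation(l, u):
    """Per-neuron upper constraint y <= lam * x + off from the single-neuron
    convex relaxation of y = ReLU(x) on [l, u] (together with y >= 0, y >= x).
    For unstable neurons (l < 0 < u) the slope is lam = u / (u - l)."""
    unstable = (l < 0.0) & (u > 0.0)
    lam = np.where(unstable, u / np.maximum(u - l, 1e-12), (u > 0.0).astype(float))
    off = np.where(unstable, -lam * l, 0.0)
    return lam, off

# Toy example: a 2x2 layer whose two neurons are both unstable, so a joint
# (multi-neuron) relaxation could be strictly tighter than two triangles.
W = np.array([[1.0, -1.0], [0.5, 1.0]])
b = np.zeros(2)
l_in, u_in = np.array([-1.0, -1.0]), np.array([1.0, 1.0])

l_pre, u_pre = affine_bounds(W, b, l_in, u_in)
l_post, u_post = relu_bounds(l_pre, u_pre)
lam, off = triangle_relaxation(l_pre, u_pre)
print("pre-activation bounds :", l_pre, u_pre)
print("post-activation bounds:", l_post, u_post)
print("triangle upper bounds : y <=", lam, "* x +", off)
```

The point of the multi-neuron construction discussed above is that, when several unstable neurons share inputs, the product of their per-neuron triangles contains joint configurations that can never actually occur; adding relational constraints over groups of neurons removes such configurations and yields tighter post-activation bounds, and hence tighter certificates.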
Related papers
- Mitigating Feature Gap for Adversarial Robustness by Feature
Disentanglement [61.048842737581865]
Adversarial fine-tuning methods aim to enhance adversarial robustness by fine-tuning a naturally pre-trained model in an adversarial training manner.
We propose a disentanglement-based approach to explicitly model and remove the latent features that cause the feature gap.
Empirical evaluations on three benchmark datasets demonstrate that our approach surpasses existing adversarial fine-tuning methods and adversarial training baselines.
arXiv Detail & Related papers (2024-01-26T08:38:57Z) - It begins with a boundary: A geometric view on probabilistically robust learning [6.877576704011329]
We take a fresh and geometric view of one such method, Probabilistically Robust Learning (PRL).
We prove existence of solutions to the original and modified problems using novel relaxation methods.
We also clarify, through a suitable $\Gamma$-convergence analysis, the way in which the original and modified PRL models interpolate between risk minimization and adversarial training.
arXiv Detail & Related papers (2023-05-30T06:24:30Z) - Feature Separation and Recalibration for Adversarial Robustness [18.975320671203132]
We propose a novel, easy-to-verify approach named Feature Separation and Recalibration.
It separates out and recalibrates the malicious, non-robust activations to produce more robust feature maps.
It improves the robustness of existing adversarial training methods by up to 8.57% with small computational overhead.
arXiv Detail & Related papers (2023-03-24T07:43:57Z) - On the Minimal Adversarial Perturbation for Deep Neural Networks with
Provable Estimation Error [65.51757376525798]
The existence of adversarial perturbations has opened an interesting research line on provable robustness.
However, no provable results have been presented to estimate and bound the error committed.
This paper proposes two lightweight strategies to find the minimal adversarial perturbation.
The obtained results show that the proposed strategies approximate the theoretical distance and robustness for samples close to the classification boundary, leading to provable guarantees against any adversarial attack.
arXiv Detail & Related papers (2022-01-04T16:40:03Z) - Pruning in the Face of Adversaries [0.0]
We evaluate the impact of neural network pruning on the adversarial robustness against L-0, L-2 and L-infinity attacks.
Our results confirm that neural network pruning and adversarial robustness are not mutually exclusive.
We extend our analysis to situations that incorporate additional assumptions on the adversarial scenario and show that depending on the situation, different strategies are optimal.
arXiv Detail & Related papers (2021-08-19T09:06:16Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z) - Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the widespread adoption of deep learning has been its fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z) - LiBRe: A Practical Bayesian Approach to Adversarial Detection [36.541671795530625]
LiBRe can endow a variety of pre-trained task-dependent DNNs with the ability to defend against heterogeneous adversarial attacks at a low cost.
We build a few-layer deep ensemble variational approximation and adopt a pre-training & fine-tuning workflow to boost the effectiveness and efficiency of LiBRe.
arXiv Detail & Related papers (2021-03-27T07:48:58Z) - And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z) - Improving the Tightness of Convex Relaxation Bounds for Training
Certifiably Robust Classifiers [72.56180590447835]
Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical robustness.
We propose two regularizers that can be used to train neural networks that achieve higher certified accuracy than non-regularized baselines.
arXiv Detail & Related papers (2020-02-22T20:19:53Z)