RoMA: a Method for Neural Network Robustness Measurement and Assessment
- URL: http://arxiv.org/abs/2110.11088v2
- Date: Fri, 22 Oct 2021 08:34:55 GMT
- Title: RoMA: a Method for Neural Network Robustness Measurement and Assessment
- Authors: Natan Levy and Guy Katz
- Abstract summary: We present a new statistical method, called Robustness Measurement and Assessment (RoMA).
RoMA determines the probability that a random input perturbation might cause misclassification.
One interesting insight obtained through this work is that, in a classification network, different output labels can exhibit very different robustness levels.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural network models have become the leading solution for a large variety of
tasks, such as classification, language processing, protein folding, and
others. However, their reliability is heavily plagued by adversarial inputs:
small input perturbations that cause the model to produce erroneous outputs.
Adversarial inputs can occur naturally when the system's environment behaves
randomly, even in the absence of a malicious adversary, and are a severe cause
for concern when attempting to deploy neural networks within critical systems.
In this paper, we present a new statistical method, called Robustness
Measurement and Assessment (RoMA), which can measure the expected robustness of
a neural network model. Specifically, RoMA determines the probability that a
random input perturbation might cause misclassification. The method allows us
to provide formal guarantees regarding the expected frequency of errors that a
trained model will encounter after deployment. Our approach can be applied to
large-scale, black-box neural networks, which is a significant advantage
compared to recently proposed verification methods. We apply our approach in
two ways: comparing the robustness of different models, and measuring how a
model's robustness is affected by the magnitude of input perturbation. One
interesting insight obtained through this work is that, in a classification
network, different output labels can exhibit very different robustness levels.
We term this phenomenon categorial robustness. Our ability to perform risk and
robustness assessments on a categorial basis opens the door to risk mitigation,
which may prove to be a significant step towards neural network certification
in safety-critical applications.
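
As a rough illustration of the kind of measurement RoMA performs, the sketch below estimates the probability that a random, bounded input perturbation causes a misclassification, and aggregates those estimates per ground-truth label to expose categorial robustness. This is a minimal Monte-Carlo sketch, not the authors' implementation: the function names, the NumPy interface, the uniform perturbation model, and the plain empirical estimator are illustrative assumptions, and the paper's statistical machinery for deriving formal guarantees is more involved.

```python
import numpy as np
from collections import defaultdict

def misclassification_rate(model, x, true_label, epsilon, n_samples=1000, rng=None):
    """Monte-Carlo estimate of the probability that a random perturbation
    delta, drawn uniformly from [-epsilon, epsilon]^d, flips the prediction.
    `model` is assumed to map a batch of inputs to an array of class scores
    of shape (batch_size, num_classes)."""
    rng = rng or np.random.default_rng(0)
    deltas = rng.uniform(-epsilon, epsilon, size=(n_samples,) + x.shape)
    preds = np.argmax(model(x[None, ...] + deltas), axis=1)
    return float(np.mean(preds != true_label))

def categorial_robustness(model, inputs, labels, epsilon, n_samples=1000):
    """Group per-input misclassification estimates by ground-truth label to
    expose how robustness differs between output categories."""
    per_label = defaultdict(list)
    for x, y in zip(inputs, labels):
        per_label[int(y)].append(
            misclassification_rate(model, np.asarray(x), y, epsilon, n_samples))
    # Robustness of a label = 1 - its average misclassification probability.
    return {label: 1.0 - float(np.mean(rates)) for label, rates in per_label.items()}
```

Under these assumptions, the two applications mentioned above, comparing the robustness of different models and measuring how robustness varies with perturbation magnitude, reduce to calling these helpers with different model and epsilon arguments.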
Related papers
- How adversarial attacks can disrupt seemingly stable accurate classifiers [76.95145661711514]
Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data.
Here, we show that this may be seen as a fundamental feature of classifiers working with high dimensional input data.
We introduce a simple, generic, and generalisable framework for which key behaviours observed in practical systems arise with high probability.
arXiv Detail & Related papers (2023-09-07T12:02:00Z)
- Towards Certified Probabilistic Robustness with High Accuracy [3.957941698534126]
Adversarial examples pose a security threat to many critical systems built on neural networks.
How to build certifiably robust yet accurate neural network models remains an open problem.
We propose a novel approach that aims to achieve both high accuracy and certified probabilistic robustness.
arXiv Detail & Related papers (2023-09-02T09:39:47Z)
- CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks [58.29502185344086]
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks.
It is important to provide provable guarantees for deep learning models against semantically meaningful input transformations.
We propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds.
arXiv Detail & Related papers (2021-09-22T12:46:04Z)
- Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the widespread adoption of deep learning has been the fragility of deep neural networks to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
- Generating Probabilistic Safety Guarantees for Neural Network Controllers [30.34898838361206]
We use a dynamics model to determine the output properties that must hold for a neural network controller to operate safely.
We develop an adaptive verification approach to efficiently generate an overapproximation of the neural network policy.
We show that our method is able to generate meaningful probabilistic safety guarantees for aircraft collision avoidance neural networks.
arXiv Detail & Related papers (2021-03-01T18:48:21Z)
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- Ramifications of Approximate Posterior Inference for Bayesian Deep Learning in Adversarial and Out-of-Distribution Settings [7.476901945542385]
We show that Bayesian deep learning models, on certain occasions, marginally outperform conventional neural networks.
Preliminary investigations indicate the potential inherent role of bias due to choices of initialisation, architecture or activation functions.
arXiv Detail & Related papers (2020-09-03T16:58:15Z)
- Gradients as a Measure of Uncertainty in Neural Networks [16.80077149399317]
We propose to utilize backpropagated gradients to quantify the uncertainty of trained models.
We show that our gradient-based method outperforms state-of-the-art methods by up to 4.8% in AUROC score on out-of-distribution detection.
arXiv Detail & Related papers (2020-08-18T16:58:46Z)
- Hidden Cost of Randomized Smoothing [72.93630656906599]
In this paper, we point out the side effects of current randomized smoothing.
Specifically, we articulate and prove two major points: 1) the decision boundaries of smoothed classifiers will shrink, resulting in disparity in class-wise accuracy; 2) applying noise augmentation in the training process does not necessarily resolve the shrinking issue due to the inconsistent learning objectives.
arXiv Detail & Related papers (2020-03-02T23:37:42Z)