On the robustness of randomized classifiers to adversarial examples
- URL: http://arxiv.org/abs/2102.10875v1
- Date: Mon, 22 Feb 2021 10:16:58 GMT
- Title: On the robustness of randomized classifiers to adversarial examples
- Authors: Rafael Pinot and Laurent Meunier and Florian Yger and Cédric Gouy-Pailler and Yann Chevaleyre and Jamal Atif
- Abstract summary: We introduce a new notion of robustness for randomized classifiers, enforcing local Lipschitzness using probability metrics.
We show that our results are applicable to a wide range of machine learning models under mild hypotheses.
All robust models we trained can simultaneously achieve state-of-the-art accuracy.
- Score: 11.359085303200981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates the theory of robustness against adversarial attacks.
We focus on randomized classifiers (\emph{i.e.} classifiers that output random
variables) and provide a thorough analysis of their behavior through the lens
of statistical learning theory and information theory. To this aim, we
introduce a new notion of robustness for randomized classifiers, enforcing
local Lipschitzness using probability metrics. Equipped with this definition,
we make two new contributions. The first one consists in devising a new upper
bound on the adversarial generalization gap of randomized classifiers. More
precisely, we devise bounds on the generalization gap and the adversarial gap
(\emph{i.e.} the gap between the risk and the worst-case risk under attack) of
randomized classifiers. The second contribution presents a simple yet
efficient noise injection method to design robust randomized classifiers. We
show that our results are applicable to a wide range of machine learning models
under mild hypotheses. We further corroborate our findings with experimental
results using deep neural networks on standard image datasets, namely CIFAR-10
and CIFAR-100. All robust models we trained can simultaneously achieve
state-of-the-art accuracy (over $0.82$ clean accuracy on CIFAR-10) and enjoy
\emph{guaranteed} robust accuracy bounds ($0.45$ against $\ell_2$ adversaries
with magnitude $0.5$ on CIFAR-10).
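To make the second contribution more concrete, here is a minimal sketch of the noise injection idea described in the abstract: a trained deterministic network is turned into a randomized classifier by adding Gaussian noise to its input at inference time and averaging the class probabilities over a few Monte Carlo samples. The class name, `sigma`, and `n_samples` are illustrative assumptions; the exact noise distribution and training procedure used in the paper may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseInjectionClassifier(nn.Module):
    """Randomized classifier obtained by Gaussian noise injection: returns a
    Monte Carlo estimate of E_{z ~ N(0, sigma^2 I)}[softmax(f(x + z))]."""

    def __init__(self, base_model: nn.Module, sigma: float = 0.25, n_samples: int = 100):
        super().__init__()
        self.base_model = base_model    # any trained classifier (assumed given)
        self.sigma = sigma              # noise standard deviation (illustrative)
        self.n_samples = n_samples      # Monte Carlo samples per input

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self.base_model.eval()
        probs = 0.0
        for _ in range(self.n_samples):
            noisy = x + self.sigma * torch.randn_like(x)          # inject noise
            probs = probs + F.softmax(self.base_model(noisy), dim=1)
        return probs / self.n_samples                             # averaged output distribution
```

For instance, `NoiseInjectionClassifier(resnet, sigma=0.5)(images)` returns an estimate of the randomized classifier's output distribution, and the predicted label is its `argmax`.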
Related papers
- The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing [85.85160896547698]
Real-life applications of deep neural networks are hindered by their unstable predictions when faced with noisy inputs and adversarial attacks.
We show how to design an efficient classifier with a certified radius by relying on noise injection into the inputs.
Our novel certification procedure allows us to use pre-trained models with randomized smoothing, effectively improving the current certification radius in a zero-shot manner.
arXiv Detail & Related papers (2023-09-28T22:41:47Z)
- Wasserstein distributional robustness of neural networks [9.79503506460041]
Deep neural networks are known to be vulnerable to adversarial attacks (AA).
For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified.
We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions.
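For context, a Wasserstein distributionally robust training objective is usually written as a min-max problem over distributions within a Wasserstein ball around the data distribution; the standard form below is only background and may differ in detail from the formulation used in the cited paper:
$$\min_{\theta}\; \sup_{Q:\, W_p(Q, P) \le \delta}\; \mathbb{E}_{(x,y)\sim Q}\big[\ell(f_\theta(x), y)\big],$$
where $P$ is the data distribution, $W_p$ a Wasserstein distance, $\delta$ the perturbation budget, and $\ell$ the loss of the network $f_\theta$.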
arXiv Detail & Related papers (2023-06-16T13:41:24Z)
- Adversarially Robust Classifier with Covariate Shift Adaptation [25.39995678746662]
Existing adversarially trained models typically perform inference on test examples independently from each other.
We show that a simple adaptive batch normalization (BN) technique can significantly improve the robustness of these models against random perturbations.
We further demonstrate that the adaptive BN technique significantly improves robustness against common corruptions, while often enhancing performance against adversarial attacks.
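As a rough illustration, one common realization of adaptive BN is to re-estimate the batch-normalization statistics on the (possibly corrupted) test batch instead of reusing the running statistics saved during training; the function name below is hypothetical and the cited paper's exact procedure may differ.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def predict_with_adaptive_bn(model: nn.Module, test_batch: torch.Tensor) -> torch.Tensor:
    """Classify a test batch while normalizing with statistics computed on
    that batch itself rather than the running statistics saved at training time."""
    model.eval()
    # Switch only the BatchNorm layers to training mode so that they normalize
    # with the statistics of the incoming (possibly perturbed) batch.
    for module in model.modules():
        if isinstance(module, nn.modules.batchnorm._BatchNorm):
            module.train()
    logits = model(test_batch)
    model.eval()                 # restore standard inference behaviour
    return logits.argmax(dim=1)
```

Only the normalization statistics change; the learned weights are untouched, which is why this adaptation is cheap enough to run at test time.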
arXiv Detail & Related papers (2021-02-09T19:51:56Z)
- Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations [78.23408201652984]
Top-k predictions are used in many real-world applications such as machine learning as a service, recommender systems, and web searches.
Our work is based on randomized smoothing, which builds a provably robust classifier via randomizing an input.
For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.
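The sketch below shows the generic randomized-smoothing prediction step that such certificates build on: many randomized copies of the input are classified and the most frequently predicted labels are returned. Gaussian noise is used here purely for illustration; the cited paper randomizes pixels differently in order to certify against $\ell_0$ perturbations, and all names are assumptions.

```python
import torch

@torch.no_grad()
def smoothed_topk_prediction(model, x, k=3, n_samples=1000, sigma=0.5):
    """Classify n_samples randomized copies of a single input x (shape
    (1, C, H, W)) and return the k labels that receive the most votes."""
    counts = None
    for _ in range(n_samples):
        noisy = x + sigma * torch.randn_like(x)      # one randomized copy
        logits = model(noisy)
        if counts is None:
            counts = torch.zeros(logits.size(1), dtype=torch.long)
        counts[logits.argmax(dim=1).item()] += 1     # tally this copy's vote
    return counts.topk(k).indices                    # k most voted labels
```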
arXiv Detail & Related papers (2020-11-15T21:34:44Z)
- Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification [54.22421582955454]
We provide the first result on optimal minimax guarantees for the excess risk in adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z)
- Consistency Regularization for Certified Robustness of Smoothed Classifiers [89.72878906950208]
A recent technique of randomized smoothing has shown that the worst-case $\ell_2$-robustness can be transformed into the average-case robustness.
We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise.
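A minimal sketch of such a consistency regularizer, assuming Gaussian noise and a KL-based penalty (the hyper-parameter names and the exact form of the regularizer in the cited paper may differ):

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x, y, sigma=0.25, n_noise=2, lam=1.0):
    """Cross-entropy on Gaussian-noised copies of x, plus a KL penalty that
    pulls each copy's prediction towards the average prediction."""
    log_probs = [F.log_softmax(model(x + sigma * torch.randn_like(x)), dim=1)
                 for _ in range(n_noise)]
    avg_probs = torch.stack([lp.exp() for lp in log_probs]).mean(dim=0)
    # Classification term on the noisy copies keeps accuracy under noise.
    ce = sum(F.nll_loss(lp, y) for lp in log_probs) / n_noise
    # Consistency term penalizes disagreement between the noisy predictions.
    kl = sum(F.kl_div(lp, avg_probs, reduction="batchmean") for lp in log_probs) / n_noise
    return ce + lam * kl
```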
arXiv Detail & Related papers (2020-06-07T06:57:43Z)
- Hidden Cost of Randomized Smoothing [72.93630656906599]
In this paper, we point out the side effects of current randomized smoothing.
Specifically, we articulate and prove two major points: 1) the decision boundaries of smoothed classifiers will shrink, resulting in disparity in class-wise accuracy; 2) applying noise augmentation in the training process does not necessarily resolve the shrinking issue due to the inconsistent learning objectives.
arXiv Detail & Related papers (2020-03-02T23:37:42Z)
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.