Almost Tight L0-norm Certified Robustness of Top-k Predictions against
Adversarial Perturbations
- URL: http://arxiv.org/abs/2011.07633v2
- Date: Fri, 3 Jun 2022 20:02:37 GMT
- Title: Almost Tight L0-norm Certified Robustness of Top-k Predictions against
Adversarial Perturbations
- Authors: Jinyuan Jia, Binghui Wang, Xiaoyu Cao, Hongbin Liu, Neil Zhenqiang
Gong
- Abstract summary: Top-k predictions are used in many real-world applications such as machine learning as a service, recommender systems, and web searches.
Our work is based on randomized smoothing, which builds a provably robust classifier via randomizing an input.
For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.
- Score: 78.23408201652984
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Top-k predictions are used in many real-world applications such as machine
learning as a service, recommender systems, and web searches. $\ell_0$-norm
adversarial perturbation characterizes an attack that arbitrarily modifies some
features of an input such that a classifier makes an incorrect prediction for
the perturbed input. $\ell_0$-norm adversarial perturbation is easy to
interpret and can be implemented in the physical world. Therefore, certifying
robustness of top-$k$ predictions against $\ell_0$-norm adversarial
perturbation is important. However, existing studies either focused on
certifying $\ell_0$-norm robustness of top-$1$ predictions or $\ell_2$-norm
robustness of top-$k$ predictions. In this work, we aim to bridge the gap. Our
approach is based on randomized smoothing, which builds a provably robust
classifier from an arbitrary classifier via randomizing an input. Our major
theoretical contribution is an almost tight $\ell_0$-norm certified robustness
guarantee for top-$k$ predictions. We empirically evaluate our method on
CIFAR10 and ImageNet. For instance, our method can build a classifier that
achieves a certified top-3 accuracy of 69.2\% on ImageNet when an attacker can
arbitrarily perturb 5 pixels of a testing image.
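To make the mechanism concrete, below is a minimal sketch of the smoothing-and-voting step that randomized smoothing relies on, using random pixel ablation as the input randomization (a common choice for $\ell_0$-norm certification, not necessarily this paper's exact smoothing distribution). The names `base_classifier`, `keep_prob`, and `n_samples` are illustrative placeholders; the paper's main contribution, the almost tight certified $\ell_0$ radius derived from these vote estimates, is not reproduced here.

```python
# Minimal sketch (not the paper's exact construction): top-k prediction of a
# smoothed classifier built by randomly ablating pixels, estimated by Monte
# Carlo voting. `base_classifier` is a hypothetical stand-in for any model
# that maps an image to a single class label.
import numpy as np

def ablate(image, keep_prob, rng):
    """Keep each pixel with probability keep_prob; mark the rest as ablated
    (here encoded with a reserved value outside the normal pixel range)."""
    mask = rng.random(image.shape[:2]) < keep_prob
    out = image.astype(np.float32)
    out[~mask] = -1.0  # reserved "ablated" marker
    return out

def smoothed_topk(base_classifier, image, k=3, num_classes=1000,
                  n_samples=1000, keep_prob=0.5, seed=0):
    """Return the k classes the base classifier predicts most often under
    random ablation, together with their empirical vote frequencies."""
    rng = np.random.default_rng(seed)
    votes = np.zeros(num_classes, dtype=np.int64)
    for _ in range(n_samples):
        votes[base_classifier(ablate(image, keep_prob, rng))] += 1
    top = np.argsort(votes)[::-1][:k]
    return top, votes[top] / n_samples
```

In a full certification pipeline, the empirical vote frequencies would typically be replaced by statistical lower/upper confidence bounds before any certified radius is computed; the sketch stops at the voting step.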
Related papers
- Verifiably Robust Conformal Prediction [1.391198481393699]
This paper introduces VRCP (Verifiably Robust Conformal Prediction), a new framework that leverages neural network verification methods to recover coverage guarantees under adversarial attacks.
Our method is the first to support perturbations bounded by arbitrary norms including $\ell_1$, $\ell_2$, and $\ell_\infty$, as well as regression tasks.
In every case, VRCP achieves above nominal coverage and yields significantly more efficient and informative prediction regions than the SotA.
arXiv Detail & Related papers (2024-05-29T09:50:43Z) - Robust width: A lightweight and certifiable adversarial defense [0.0]
Adversarial examples are intentionally constructed to cause the model to make incorrect predictions or classifications.
In this work, we study an adversarial defense based on the robust width property (RWP), which was recently introduced for compressed sensing.
We show that a specific input purification scheme based on the RWP gives theoretical robustness guarantees for images that are approximately sparse.
arXiv Detail & Related papers (2024-05-24T22:50:50Z) - Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making [58.06306331390586]
We introduce the notion of a margin complement, which measures how much a prediction score $S$ changes due to a thresholding operation.
We show that under suitable causal assumptions, the influences of $X$ on the prediction score $S$ are equal to the influences of $X$ on the true outcome $Y$.
arXiv Detail & Related papers (2024-05-24T11:22:19Z) - Getting a-Round Guarantees: Floating-Point Attacks on Certified Robustness [19.380453459873298]
Adversarial examples pose a security risk as they can alter decisions of a machine learning classifier through slight input perturbations.
We show that these guarantees can be invalidated due to limitations of floating-point representation that cause rounding errors.
We show that the attack can be carried out against linear classifiers that have exact certifiable guarantees and against neural networks that have conservative certifications.
arXiv Detail & Related papers (2022-05-20T13:07:36Z) - On the robustness of randomized classifiers to adversarial examples [11.359085303200981]
We introduce a new notion of robustness for randomized classifiers, enforcing local Lipschitzness using probability metrics.
We show that our results are applicable to a wide range of machine learning models under mild hypotheses.
All robust models we trained can simultaneously achieve state-of-the-art accuracy.
arXiv Detail & Related papers (2021-02-22T10:16:58Z) - Uncertainty Sets for Image Classifiers using Conformal Prediction [112.54626392838163]
We present an algorithm that modifies any classifier to output a predictive set containing the true label with a user-specified probability, such as 90%.
The algorithm is simple and fast like Platt scaling, but provides a formal finite-sample coverage guarantee for every model and dataset.
Our method modifies an existing conformal prediction algorithm to give more stable predictive sets by regularizing the small scores of unlikely classes after Platt scaling; a minimal sketch of the underlying split-conformal step appears after this list.
arXiv Detail & Related papers (2020-09-29T17:58:04Z) - Certifying Confidence via Randomized Smoothing [151.67113334248464]
Randomized smoothing has been shown to provide good certified-robustness guarantees for high-dimensional classification problems.
Most smoothing methods do not give us any information about the confidence with which the underlying classifier makes a prediction.
We propose a method to generate certified radii for the prediction confidence of the smoothed classifier.
arXiv Detail & Related papers (2020-09-17T04:37:26Z) - Consistency Regularization for Certified Robustness of Smoothed
Classifiers [89.72878906950208]
A recent technique of randomized smoothing has shown that the worst-case $\ell_2$-robustness can be transformed into the average-case robustness.
We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise.
arXiv Detail & Related papers (2020-06-07T06:57:43Z) - Black-Box Certification with Randomized Smoothing: A Functional
Optimization Based Framework [60.981406394238434]
We propose a general framework of adversarial certification with non-Gaussian noise and for more general types of attacks.
Our proposed methods achieve better certification results than previous works and provide a new perspective on randomized smoothing certification.
arXiv Detail & Related papers (2020-02-21T07:52:47Z)
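As referenced in the entry "Uncertainty Sets for Image Classifiers using Conformal Prediction" above, the basic recipe behind such predictive sets is split conformal prediction: calibrate a score threshold on held-out data so that the set of classes passing the threshold contains the true label with a user-specified probability (e.g., 90%). The sketch below shows only this generic split-conformal step under the assumption that softmax probabilities are available; the cited paper's additional regularization of unlikely-class scores is omitted.

```python
# A minimal sketch of split conformal prediction for classification, assuming
# held-out softmax scores from any trained classifier; the cited paper adds a
# regularization of unlikely-class scores on top of a recipe like this.
import numpy as np

def calibrate_threshold(cal_scores, cal_labels, alpha=0.1):
    """cal_scores: (n, num_classes) softmax probabilities on calibration data,
    cal_labels: (n,) true labels. Returns the conformal score threshold."""
    n = len(cal_labels)
    # Nonconformity score: 1 minus the probability assigned to the true label.
    nonconformity = 1.0 - cal_scores[np.arange(n), cal_labels]
    # Finite-sample-corrected (1 - alpha) quantile of the calibration scores.
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(nonconformity, level, method="higher")

def prediction_set(test_scores, threshold):
    """All classes whose nonconformity score is within the threshold; the set
    contains the true label with probability at least 1 - alpha on average."""
    return np.where(1.0 - test_scores <= threshold)[0]
```

With alpha = 0.1, the returned sets cover the true label roughly 90% of the time on exchangeable data, which is the kind of finite-sample coverage guarantee the entry above refers to.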