Related papers: Certifying Confidence via Randomized Smoothing

Certifying Confidence via Randomized Smoothing

URL: http://arxiv.org/abs/2009.08061v2
Date: Thu, 22 Oct 2020 21:15:50 GMT
Title: Certifying Confidence via Randomized Smoothing
Authors: Aounon Kumar, Alexander Levine, Soheil Feizi, Tom Goldstein
Abstract summary: Randomized smoothing has been shown to provide good certified-robustness guarantees for high-dimensional classification problems. Most smoothing methods do not give us any information about the confidence with which the underlying classifier makes a prediction. We propose a method to generate certified radii for the prediction confidence of the smoothed classifier.
Score: 151.67113334248464
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Randomized smoothing has been shown to provide good certified-robustness guarantees for high-dimensional classification problems. It uses the probabilities of predicting the top two most-likely classes around an input point under a smoothing distribution to generate a certified radius for a classifier's prediction. However, most smoothing methods do not give us any information about the confidence with which the underlying classifier (e.g., deep neural network) makes a prediction. In this work, we propose a method to generate certified radii for the prediction confidence of the smoothed classifier. We consider two notions for quantifying confidence: average prediction score of a class and the margin by which the average prediction score of one class exceeds that of another. We modify the Neyman-Pearson lemma (a key theorem in randomized smoothing) to design a procedure for computing the certified radius where the confidence is guaranteed to stay above a certain threshold. Our experimental results on CIFAR-10 and ImageNet datasets show that using information about the distribution of the confidence scores allows us to achieve a significantly better certified radius than ignoring it. Thus, we demonstrate that extra information about the base classifier at the input point can help improve certified guarantees for the smoothed classifier. Code for the experiments is available at https://github.com/aounon/cdf-smoothing.

Related papers

Pixel-level Certified Explanations via Randomized Smoothing [87.48628403354351]
Post-hoc attribution methods aim to explain deep learning predictions by highlighting influential input pixels.<n>Small, imperceptible input perturbations can drastically alter the attribution map while maintaining the same prediction.<n>We introduce the first certification framework that guarantees pixel-level robustness for any black-box attribution method.
arXiv Detail & Related papers (2025-06-18T14:41:24Z)
The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing [85.85160896547698]
Real-life applications of deep neural networks are hindered by their unsteady predictions when faced with noisy inputs and adversarial attacks. We show how to design an efficient classifier with a certified radius by relying on noise injection into the inputs. Our novel certification procedure allows us to use pre-trained models with randomized smoothing, effectively improving the current certification radius in a zero-shot manner.
arXiv Detail & Related papers (2023-09-28T22:41:47Z)
How to Fix a Broken Confidence Estimator: Evaluating Post-hoc Methods for Selective Classification with Deep Neural Networks [1.4502611532302039]
We show that a simple $p$-norm normalization of the logits, followed by taking the maximum logit as the confidence estimator, can lead to considerable gains in selective classification performance. Our results are shown to be consistent under distribution shift.
arXiv Detail & Related papers (2023-05-24T18:56:55Z)
Confidence-aware Training of Smoothed Classifiers for Certified Robustness [75.95332266383417]
We use "accuracy under Gaussian noise" as an easy-to-compute proxy of adversarial robustness for an input. Our experiments show that the proposed method consistently exhibits improved certified robustness upon state-of-the-art training methods.
arXiv Detail & Related papers (2022-12-18T03:57:12Z)
Getting a-Round Guarantees: Floating-Point Attacks on Certified Robustness [19.380453459873298]
Adversarial examples pose a security risk as they can alter decisions of a machine learning classifier through slight input perturbations. We show that these guarantees can be invalidated due to limitations of floating-point representation that cause rounding errors. We show that the attack can be carried out against linear classifiers that have exact certifiable guarantees and against neural networks that have conservative certifications.
arXiv Detail & Related papers (2022-05-20T13:07:36Z)
Smooth-Reduce: Leveraging Patches for Improved Certified Robustness [100.28947222215463]
We propose a training-free, modified smoothing approach, Smooth-Reduce. Our algorithm classifies overlapping patches extracted from an input image, and aggregates the predicted logits to certify a larger radius around the input. We provide theoretical guarantees for such certificates, and empirically show significant improvements over other randomized smoothing methods.
arXiv Detail & Related papers (2022-05-12T15:26:20Z)
Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations [78.23408201652984]
Top-k predictions are used in many real-world applications such as machine learning as a service, recommender systems, and web searches. Our work is based on randomized smoothing, which builds a provably robust classifier via randomizing an input. For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.
arXiv Detail & Related papers (2020-11-15T21:34:44Z)
Knowing what you know: valid and validated confidence sets in multiclass and multilabel prediction [0.8594140167290097]
We develop conformal prediction methods for constructing valid confidence sets in multiclass and multilabel problems. By leveraging ideas from quantile regression, we build methods that always guarantee correct coverage but additionally provide conditional coverage for both multiclass and multilabel prediction problems.
arXiv Detail & Related papers (2020-04-21T17:45:38Z)
Binary Classification from Positive Data with Skewed Confidence [85.18941440826309]
Positive-confidence (Pconf) classification is a promising weakly-supervised learning method. In practice, the confidence may be skewed by bias arising in an annotation process. We introduce the parameterized model of the skewed confidence, and propose the method for selecting the hyper parameter.
arXiv Detail & Related papers (2020-01-29T00:04:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.