Certified Robustness to Label-Flipping Attacks via Randomized Smoothing
- URL: http://arxiv.org/abs/2002.03018v4
- Date: Tue, 11 Aug 2020 13:17:30 GMT
- Title: Certified Robustness to Label-Flipping Attacks via Randomized Smoothing
- Authors: Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, J. Zico Kolter
- Abstract summary: Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
- Score: 105.91827623768724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning algorithms are known to be susceptible to data poisoning
attacks, where an adversary manipulates the training data to degrade
performance of the resulting classifier. In this work, we present a unifying
view of randomized smoothing over arbitrary functions, and we leverage this
novel characterization to propose a new strategy for building classifiers that
are pointwise-certifiably robust to general data poisoning attacks. As a
specific instantiation, we utilize our framework to build linear classifiers
that are robust to a strong variant of label flipping, where each test example
is targeted independently. In other words, for each test point, our classifier
includes a certification that its prediction would be the same had some number
of training labels been changed adversarially. Randomized smoothing has
previously been used to guarantee, with high probability, test-time
robustness to adversarial manipulation of the input to a classifier; we derive
a variant which provides a deterministic, analytical bound, sidestepping the
probabilistic certificates that traditionally result from the sampling
subprocedure. Further, we obtain these certified bounds with minimal additional
runtime complexity over standard classification and no assumptions on the train
or test distributions. We generalize our results to the multi-class case,
providing the first multi-class classification algorithm that is certifiably
robust to label-flipping attacks.
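To make the smoothing-over-labels view concrete, here is a minimal Monte Carlo sketch. It is not the authors' code: the paper derives a deterministic, analytical certificate for linear classifiers instead of sampling, and `train_linear` here is a placeholder base learner.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_linear(X, y):
    # Placeholder base learner; the paper works with linear classifiers
    # whose training admits a closed-form, deterministic certificate.
    return LogisticRegression().fit(X, y)

def smoothed_predict(X_train, y_train, x_test, flip_prob=0.1,
                     n_samples=200, seed=0):
    """Monte Carlo randomized smoothing over binary training labels.

    Each round independently flips every label with probability flip_prob,
    retrains the base classifier, and records its vote at x_test. The
    majority vote is the smoothed prediction; its margin translates into
    how many adversarial label flips can be certified. (The paper replaces
    this sampling with a deterministic analytical bound.)
    """
    rng = np.random.default_rng(seed)
    votes = np.zeros(2)
    for _ in range(n_samples):
        flips = rng.random(len(y_train)) < flip_prob
        y_noisy = np.where(flips, 1 - y_train, y_train)
        votes[train_linear(X_train, y_noisy).predict(x_test[None])[0]] += 1
    top = int(votes.argmax())
    return top, (votes[top] - votes[1 - top]) / n_samples
```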
Related papers
- Probabilistic Safety Regions Via Finite Families of Scalable Classifiers [2.431537995108158]
Supervised classification recognizes patterns in the data to separate classes of behaviours.
Canonical solutions contain misclassification errors that are intrinsic to the approximate, numerical nature of machine learning.
We introduce the concept of probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled.
arXiv Detail & Related papers (2023-09-08T22:40:19Z)
- Understanding Noise-Augmented Training for Randomized Smoothing [14.061680807550722]
Randomized smoothing is a technique for providing provable robustness guarantees against adversarial attacks.
We show that, without making stronger distributional assumptions, no benefit can be expected from predictors trained with noise-augmentation.
Our analysis has direct implications to the practical deployment of randomized smoothing.
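For context, noise-augmented training in its standard form simply perturbs every input with Gaussian noise before each update. A minimal logistic-regression sketch of that procedure (my own illustration, not the paper's code):

```python
import numpy as np

def noise_augmented_train(X, y, sigma=0.25, lr=0.1, epochs=100, seed=0):
    """Binary logistic regression trained on Gaussian-perturbed inputs.

    Each epoch perturbs the inputs with N(0, sigma^2 I) noise, so the base
    classifier is trained on the same corruption that randomized smoothing
    averages over at prediction time.
    """
    rng = np.random.default_rng(seed)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        X_noisy = X + sigma * rng.standard_normal(X.shape)
        p = 1.0 / (1.0 + np.exp(-(X_noisy @ w + b)))
        grad = p - y                        # dL/dlogit for cross-entropy
        w -= lr * X_noisy.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b
```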
arXiv Detail & Related papers (2023-05-08T14:46:34Z)
- Characterizing the Optimal 0-1 Loss for Multi-class Classification with a Test-time Attacker [57.49330031751386]
We find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset.
We provide a general framework for finding the optimal 0-1 loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints.
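A toy illustration of that construction (my own sketch, restricted to pairwise edges and assuming an l_inf-bounded attacker):

```python
import numpy as np
from itertools import combinations

def conflict_edges(X, y, eps):
    """Degree-2 conflict hyperedges for an l_inf attacker with budget eps.

    Two differently-labelled examples conflict when their perturbation
    balls overlap (||x_i - x_j||_inf <= 2 * eps): any classifier must then
    misclassify at least one of the pair, so M disjoint conflicting pairs
    force at least M errors and hence a 0-1 loss of at least M / len(X).
    The paper extends this to hyperedges over larger groups of examples.
    """
    return [(i, j)
            for i, j in combinations(range(len(X)), 2)
            if y[i] != y[j] and np.max(np.abs(X[i] - X[j])) <= 2 * eps]
```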
arXiv Detail & Related papers (2023-02-21T15:17:13Z)
- Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within the PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
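For reference, the standard PAC-Bayes-kl certificate that this line of work builds on can be evaluated numerically; a sketch of the generic bound, not the paper's exact statement:

```python
import numpy as np

def kl_bernoulli(q, p, eps=1e-12):
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    q = min(max(q, eps), 1 - eps)
    p = min(max(p, eps), 1 - eps)
    return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

def pac_bayes_kl_bound(emp_risk, kl_post_prior, n, delta=0.05):
    """Risk certificate from the PAC-Bayes-kl inequality, via bisection.

    With probability 1 - delta over an n-sample training set,
    kl(emp_risk || true_risk) <= (KL(Q || P) + ln(2 * sqrt(n) / delta)) / n,
    so the certified risk is the largest p consistent with the inequality.
    """
    rhs = (kl_post_prior + np.log(2 * np.sqrt(n) / delta)) / n
    lo, hi = emp_risk, 1.0
    for _ in range(60):                 # kl(q||p) is increasing in p >= q
        mid = (lo + hi) / 2
        if kl_bernoulli(emp_risk, mid) > rhs:
            hi = mid
        else:
            lo = mid
    return hi
```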
arXiv Detail & Related papers (2022-01-26T19:59:14Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, a classifier that requires no additional parameters to be fit given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements over the state of the art.
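A rough sketch of the nearest-prototype decision rule on top of a fixed embedding network (illustrative only; the paper's training and calibration details are omitted):

```python
import numpy as np

def class_prototypes(Z, y, n_classes):
    """Prototype of each class: the mean of its training embeddings."""
    return np.stack([Z[y == c].mean(axis=0) for c in range(n_classes)])

def predict(Z_query, protos):
    """Assign each query embedding to the nearest class prototype.

    Every class is summarized by a single mean embedding regardless of how
    many examples it has, which is why the scores stay comparable across
    head and tail classes under class imbalance.
    """
    dists = ((Z_query[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)
```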
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- When in Doubt: Improving Classification Performance with Alternating Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
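The re-adjustment is a Sinkhorn-style alternation between row and column normalization. A simplified sketch (parameter names and the handling of the reference set are my own simplification of the paper's procedure):

```python
import numpy as np

def alternating_normalization(P_ref, p_query, prior, n_iter=3, alpha=1.0):
    """Re-calibrate an uncertain prediction against confident references.

    Stacks the predicted distributions of confident reference examples with
    the (optionally sharpened) query distribution, then alternates a column
    step that matches the class prior with a row step that keeps every row
    a probability distribution. The query's final row is its re-adjusted
    class distribution.
    """
    A = np.vstack([P_ref, p_query[None, :] ** alpha])
    A[-1] /= A[-1].sum()
    for _ in range(n_iter):
        A = A / A.sum(axis=0, keepdims=True) * (prior * len(A))  # columns
        A = A / A.sum(axis=1, keepdims=True)                     # rows
    return A[-1]
```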
arXiv Detail & Related papers (2021-09-28T02:55:42Z)
- Beyond cross-entropy: learning highly separable feature distributions for robust and accurate classification [22.806324361016863]
We propose a novel approach for training deep multiclass classifiers that provides adversarial robustness.
We show that the regularization of the latent space based on our approach yields excellent classification accuracy.
arXiv Detail & Related papers (2020-10-29T11:15:17Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- Robustness Verification for Classifier Ensembles [3.5884936187733394]
The robustness-checking problem consists of assessing, given a set of classifiers and a labelled data set, whether there exists a randomized attack.
We show the NP-hardness of the problem and provide an upper bound on the number of attacks that is sufficient to form an optimal randomized attack.
Our prototype implementation verifies multiple neural-network ensembles trained for image-classification tasks.
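Once a finite set of candidate attacks has been enumerated (the computationally hard part, per the NP-hardness result), checking for an optimal randomized attack reduces to a zero-sum matrix game solvable as a small linear program. An illustrative sketch, assuming `success` is a 0/1 numpy array:

```python
import numpy as np
from scipy.optimize import linprog

def randomized_attack_value(success):
    """Game value for an attacker mixing over a finite set of attacks.

    success[i, j] = 1 if candidate attack j fools classifier i, else 0.
    The attacker seeks a distribution q over attacks maximizing the
    worst-case fooling probability v: max v s.t. success @ q >= v,
    sum(q) = 1, q >= 0. A randomized attack at level t exists iff v >= t.
    """
    m, n = success.shape
    c = np.concatenate([np.zeros(n), [-1.0]])        # minimize -v
    A_ub = np.hstack([-success, np.ones((m, 1))])    # v - success[i] @ q <= 0
    b_ub = np.zeros(m)
    A_eq = np.concatenate([np.ones(n), [0.0]])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * n + [(None, None)])
    return -res.fun
```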
arXiv Detail & Related papers (2020-05-12T07:38:43Z)