Robustness Verification for Classifier Ensembles
- URL: http://arxiv.org/abs/2005.05587v2
- Date: Thu, 9 Jul 2020 07:43:16 GMT
- Title: Robustness Verification for Classifier Ensembles
- Authors: Dennis Gross, Nils Jansen, Guillermo A. Pérez, Stephan Raaijmakers
- Abstract summary: The robustness-checking problem consists of assessing, given a set of classifiers and a labelled data set, whether there exists a randomized attack that induces a certain expected loss against all classifiers.
We show the NP-hardness of the problem and provide an upper bound on the number of attacks that is sufficient to form an optimal randomized attack.
Our prototype implementation verifies multiple neural-network ensembles trained for image-classification tasks.
- Score: 3.5884936187733394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We give a formal verification procedure that decides whether a classifier
ensemble is robust against arbitrary randomized attacks. Such attacks consist
of a set of deterministic attacks and a distribution over this set. The
robustness-checking problem consists of assessing, given a set of classifiers
and a labelled data set, whether there exists a randomized attack that induces
a certain expected loss against all classifiers. We show the NP-hardness of the
problem and provide an upper bound on the number of attacks that is sufficient
to form an optimal randomized attack. These results provide an effective way to
reason about the robustness of a classifier ensemble. We provide SMT and MILP
encodings to compute optimal randomized attacks or prove that there is no
attack inducing a certain expected loss. In the latter case, the classifier
ensemble is provably robust. Our prototype implementation verifies multiple
neural-network ensembles trained for image-classification tasks. The
experimental results using the MILP encoding are promising both in terms of
scalability and the general applicability of our verification procedure.
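As a concrete illustration of the quantity being decided, the following is a minimal sketch, not the paper's SMT/MILP encoding (which reasons directly over the neural networks): it assumes a finite set of candidate deterministic attacks whose expected losses against each classifier have already been precomputed into a matrix, and solves the resulting linear program for the optimal randomized attack, i.e. the distribution over attacks that maximizes the worst-case expected loss across the ensemble. The loss values and threshold are hypothetical.

```python
# Hypothetical sketch: given candidate deterministic attacks and a precomputed
# loss matrix, find the randomized attack (a distribution over the attacks)
# maximizing the worst-case expected loss over the ensemble.  If the optimum
# stays below the loss threshold, no randomized attack built from these
# candidates reaches it.
import numpy as np
from scipy.optimize import linprog

# loss[a, c]: expected loss attack a induces on classifier c over the labelled
# data set (illustrative numbers only).
loss = np.array([
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.3],
    [0.1, 0.2, 0.7],
])
n_attacks, n_classifiers = loss.shape

# Variables: p (distribution over attacks) and t (worst-case expected loss).
# Maximize t  subject to  loss.T @ p >= t,  sum(p) = 1,  p >= 0.
obj = np.zeros(n_attacks + 1)
obj[-1] = -1.0                                              # linprog minimizes, so minimize -t
A_ub = np.hstack([-loss.T, np.ones((n_classifiers, 1))])    # t - loss[:, c] @ p <= 0 for each c
b_ub = np.zeros(n_classifiers)
A_eq = np.hstack([np.ones((1, n_attacks)), np.zeros((1, 1))])
b_eq = np.array([1.0])
bounds = [(0, None)] * n_attacks + [(None, None)]

res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
p_star, t_star = res.x[:n_attacks], res.x[-1]

threshold = 0.5                                             # hypothetical target expected loss
print("optimal attack distribution:", np.round(p_star, 3))
print("worst-case expected loss   :", round(float(t_star), 3))
print("robust against this set    :", bool(t_star < threshold))
```

The abstract's upper bound on the number of attacks sufficient to form an optimal randomized attack is what keeps such finite formulations meaningful; the paper's actual SMT/MILP encodings additionally search for the deterministic attacks themselves rather than taking them as given.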
Related papers
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - Characterizing the Optimal 0-1 Loss for Multi-class Classification with
a Test-time Attacker [57.49330031751386]
We find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset.
We provide a general framework for finding the optimal 0-1 loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints.
arXiv Detail & Related papers (2023-02-21T15:17:13Z) - On the Role of Randomization in Adversarially Robust Classification [13.39932522722395]
We show that a randomized ensemble outperforms the hypothesis set in terms of adversarial risk.
We also give an explicit description of the deterministic hypothesis set that contains such a deterministic classifier.
arXiv Detail & Related papers (2023-02-14T17:51:00Z) - RS-Del: Edit Distance Robustness Certificates for Sequence Classifiers
via Randomized Deletion [23.309600117618025]
We adapt randomized smoothing for discrete sequence classifiers to provide certified robustness against edit distance-bounded adversaries.
Our proof of certification deviates from the established Neyman-Pearson approach, which is intractable in our setting, and is instead organized around longest common subsequences.
When applied to the popular MalConv malware detection model, our smoothing mechanism RS-Del achieves a certified accuracy of 91% at an edit distance radius of 128 bytes. (A toy deletion-smoothing sketch appears after this list.)
arXiv Detail & Related papers (2023-01-31T01:40:26Z) - Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within the PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
arXiv Detail & Related papers (2022-01-26T19:59:14Z) - PARL: Enhancing Diversity of Ensemble Networks to Resist Adversarial
Attacks via Pairwise Adversarially Robust Loss Function [13.417003144007156]
Adversarial attacks tend to rely on the principle of transferability.
Ensemble methods against adversarial attacks demonstrate that an adversarial example is less likely to mislead multiple classifiers.
Recent ensemble methods have either been shown to be vulnerable to stronger adversaries or shown to lack an end-to-end evaluation.
arXiv Detail & Related papers (2021-12-09T14:26:13Z) - CC-Cert: A Probabilistic Approach to Certify General Robustness of
Neural Networks [58.29502185344086]
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks.
It is important to provide provable guarantees for deep learning models against semantically meaningful input transformations.
We propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds.
arXiv Detail & Related papers (2021-09-22T12:46:04Z) - PopSkipJump: Decision-Based Attack for Probabilistic Classifiers [43.62922682676909]
P(robabilisticH)opSkipJump adapts its number of queries to maintain HopSkipJump's original output quality across various noise levels.
We show that off-the-shelf randomized defenses offer almost no extra robustness to decision-based attacks.
arXiv Detail & Related papers (2021-06-14T14:13:12Z) - Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z) - Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
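For the RS-Del entry above, here is a toy illustration of deletion-based smoothing at prediction time, not the paper's actual certification procedure: each byte of the input is independently deleted with some probability, the base classifier votes on many such subsequences, and the majority vote is returned. The deletion rate, sample count, decision rule, and stand-in detector below are all hypothetical; the certified edit-distance radius requires the additional analysis from the paper.

```python
# Toy deletion-smoothing sketch (assumptions only, not RS-Del's certification):
# delete each byte independently, classify many such subsequences with the base
# model, and return the majority vote with its empirical vote share.
import random
from collections import Counter

def smoothed_predict(x: bytes, base_classifier, p_del: float = 0.97,
                     n_samples: int = 1000, seed: int = 0):
    """Majority vote of base_classifier over randomly deleted copies of x."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_samples):
        kept = bytes(b for b in x if rng.random() > p_del)  # keep each byte w.p. 1 - p_del
        votes[base_classifier(kept)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / n_samples

# Hypothetical stand-in for a byte-level detector such as MalConv:
toy_detector = lambda seq: "malicious" if seq.count(0x90) > 3 else "benign"
print(smoothed_predict(bytes([0x90] * 200 + [0x00] * 50), toy_detector))
```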