RS-Del: Edit Distance Robustness Certificates for Sequence Classifiers
via Randomized Deletion
- URL: http://arxiv.org/abs/2302.01757v3
- Date: Wed, 24 Jan 2024 23:58:13 GMT
- Title: RS-Del: Edit Distance Robustness Certificates for Sequence Classifiers
via Randomized Deletion
- Authors: Zhuoqun Huang, Neil G. Marchant, Keane Lucas, Lujo Bauer, Olga
Ohrimenko and Benjamin I. P. Rubinstein
- Abstract summary: We adapt randomized smoothing for discrete sequence classifiers to provide certified robustness against edit distance-bounded adversaries.
Our proof of certification deviates from the established Neyman-Pearson approach, which is intractable in our setting, and is instead organized around longest common subsequences.
When applied to the popular MalConv malware detection model, our smoothing mechanism RS-Del achieves a certified accuracy of 91% at an edit distance radius of 128 bytes.
- Score: 23.309600117618025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Randomized smoothing is a leading approach for constructing classifiers that
are certifiably robust against adversarial examples. Existing work on
randomized smoothing has focused on classifiers with continuous inputs, such as
images, where $\ell_p$-norm bounded adversaries are commonly studied. However,
there has been limited work for classifiers with discrete or variable-size
inputs, such as for source code, which require different threat models and
smoothing mechanisms. In this work, we adapt randomized smoothing for discrete
sequence classifiers to provide certified robustness against edit
distance-bounded adversaries. Our proposed smoothing mechanism randomized
deletion (RS-Del) applies random deletion edits, which are (perhaps
surprisingly) sufficient to confer robustness against adversarial deletion,
insertion and substitution edits. Our proof of certification deviates from the
established Neyman-Pearson approach, which is intractable in our setting, and
is instead organized around longest common subsequences. We present a case
study on malware detection--a binary classification problem on byte sequences
where classifier evasion is a well-established threat model. When applied to
the popular MalConv malware detection model, our smoothing mechanism RS-Del
achieves a certified accuracy of 91% at an edit distance radius of 128 bytes.
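To make the mechanism concrete, here is a minimal Python sketch of the randomized-deletion prediction step: each sample independently deletes bytes with probability `p_del`, and the smoothed prediction is a majority vote over base-classifier outputs. The names `base_classifier`, `p_del`, and `n_samples` are illustrative, not the paper's API.

```python
import random
from collections import Counter

def rs_del_predict(base_classifier, x: bytes, p_del: float = 0.97,
                   n_samples: int = 1000, seed: int = 0):
    """Smoothed prediction by majority vote over random deletions.

    Each sample keeps every byte of x independently with probability
    1 - p_del, then queries the base classifier on the shorter sequence.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_samples):
        kept = bytes(b for b in x if rng.random() > p_del)
        votes[base_classifier(kept)] += 1
    return votes.most_common(1)[0][0]
```

This covers only the prediction step; an actual certificate would additionally bound the vote probabilities (e.g., with confidence intervals) and convert them into an edit distance radius via the paper's longest-common-subsequence argument.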
Related papers
- A Robust Defense against Adversarial Attacks on Deep Learning-based
Malware Detectors via (De)Randomized Smoothing [4.97719149179179]
We propose a practical defense against adversarial malware examples inspired by (de)randomized smoothing.
In this work, we reduce the chances of sampling adversarial content injected by malware authors by selecting correlated subsets of bytes.
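One illustrative reading of the byte-subset idea is deterministic ablation over contiguous chunks, in the spirit of (de)randomized smoothing; this sketch is not the paper's exact scheme, and `chunk_size` is an assumed parameter:

```python
from collections import Counter

def chunk_vote_predict(base_classifier, x: bytes, chunk_size: int = 512):
    """Classify fixed contiguous byte chunks and take a majority vote,
    so injected adversarial bytes can only flip the votes of the
    chunks they actually overlap."""
    votes = Counter(
        base_classifier(x[i:i + chunk_size])
        for i in range(0, len(x), chunk_size)
    )
    return votes.most_common(1)[0][0]
```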
arXiv Detail & Related papers (2024-02-23T11:30:12Z) - The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing [85.85160896547698]
Real-life applications of deep neural networks are hindered by their unstable predictions when faced with noisy inputs and adversarial attacks.
We show how to design an efficient classifier with a certified radius by relying on noise injection into the inputs.
Our novel certification procedure allows us to use pre-trained models with randomized smoothing, effectively improving the current certification radius in a zero-shot manner.
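For context, the certified radius in Gaussian randomized smoothing is usually computed with the standard bound of Cohen et al. (2019); the minimal sketch below shows that baseline bound, not this paper's tighter Lipschitz-variance-margin analysis.

```python
from statistics import NormalDist

def certified_radius(p_a: float, p_b: float, sigma: float) -> float:
    """R = (sigma / 2) * (Phi^{-1}(p_a) - Phi^{-1}(p_b)), where p_a
    lower-bounds the top-class probability under N(0, sigma^2 I)
    noise and p_b upper-bounds the runner-up class."""
    phi_inv = NormalDist().inv_cdf
    return 0.5 * sigma * (phi_inv(p_a) - phi_inv(p_b))
```

For example, `certified_radius(0.8, 0.2, 0.25)` gives a radius of roughly 0.21.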
arXiv Detail & Related papers (2023-09-28T22:41:47Z) - Confidence-aware Training of Smoothed Classifiers for Certified
Robustness [75.95332266383417]
We use "accuracy under Gaussian noise" as an easy-to-compute proxy of adversarial robustness for an input.
Our experiments show that the proposed method consistently exhibits improved certified robustness upon state-of-the-art training methods.
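The proxy itself is cheap to compute; a hedged PyTorch sketch, assuming `model` returns logits for a batch:

```python
import torch

def accuracy_under_gaussian_noise(model, x, label, sigma=0.25, n=64):
    """Fraction of n Gaussian-noised copies of x that the model
    classifies as `label` -- the easy-to-compute robustness proxy."""
    noisy = x.unsqueeze(0) + sigma * torch.randn(n, *x.shape)
    with torch.no_grad():
        preds = model(noisy).argmax(dim=1)
    return (preds == label).float().mean().item()
```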
arXiv Detail & Related papers (2022-12-18T03:57:12Z) - SmoothMix: Training Confidence-calibrated Smoothed Classifiers for
Certified Robustness [61.212486108346695]
We propose a training scheme, coined SmoothMix, to control the robustness of smoothed classifiers via self-mixup.
The proposed procedure effectively identifies over-confident, near off-class samples as a cause of limited robustness.
Our experimental results demonstrate that the proposed method can significantly improve the certified $\ell_2$-robustness of smoothed classifiers.
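A loose sketch of the self-mixup step, with illustrative names and a fixed mixing coefficient (the paper samples the coefficient and selects the adversarial endpoint and label target more carefully):

```python
import torch

def smoothmix_batch(x_clean, x_adv, y_onehot, lam=0.5):
    """Interpolate a clean input with an off-class (adversarial) point
    and soften the label accordingly, pushing the classifier toward
    calibrated confidence along the interpolation path."""
    num_classes = y_onehot.shape[-1]
    x_mix = lam * x_clean + (1.0 - lam) * x_adv
    uniform = torch.full_like(y_onehot, 1.0 / num_classes)
    y_mix = lam * y_onehot + (1.0 - lam) * uniform
    return x_mix, y_mix
```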
arXiv Detail & Related papers (2021-11-17T18:20:59Z) - Improved, Deterministic Smoothing for L1 Certified Robustness [119.86676998327864]
We propose a non-additive and deterministic smoothing method, Deterministic Smoothing with Splitting Noise (DSSN).
In contrast to uniform additive smoothing, SSN certification does not require the noise components to be independent.
This is the first work to provide deterministic "randomized smoothing" for a norm-based adversarial threat model.
arXiv Detail & Related papers (2021-03-17T21:49:53Z) - Adversarially Robust Classifier with Covariate Shift Adaptation [25.39995678746662]
Existing adversarially trained models typically perform inference on test examples independently of each other.
We show that a simple adaptive batch normalization (BN) technique can significantly improve the robustness of these models against random perturbations.
We further demonstrate that adaptive BN significantly improves robustness against common corruptions, while often also enhancing performance against adversarial attacks.
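A minimal sketch of the adaptive-BN idea, re-estimating BatchNorm statistics from an unlabelled test batch (not necessarily the paper's exact procedure):

```python
import torch

def adapt_batchnorm(model, test_batch):
    """Replace training-time BatchNorm running statistics with
    statistics estimated from the test batch itself."""
    for m in model.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()  # forget training statistics
            m.momentum = None        # use a cumulative moving average
    model.train()                    # BN updates running stats in train mode
    with torch.no_grad():
        model(test_batch)
    model.eval()
    return model
```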
arXiv Detail & Related papers (2021-02-09T19:51:56Z) - Consistency Regularization for Certified Robustness of Smoothed
Classifiers [89.72878906950208]
The recent technique of randomized smoothing has shown that worst-case $\ell_2$-robustness can be transformed into average-case robustness.
We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise.
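One common form of such a regularizer penalizes the divergence between each noisy prediction and their mean; a hedged sketch (the exact form varies across papers):

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x, sigma=0.25, m=2):
    """KL divergence between the average prediction over m
    Gaussian-noised copies of x and each individual noisy prediction."""
    noisy = x.unsqueeze(0) + sigma * torch.randn(m, *x.shape)
    log_probs = F.log_softmax(model(noisy), dim=1)          # (m, C)
    avg_probs = log_probs.exp().mean(dim=0, keepdim=True)   # (1, C)
    return F.kl_div(log_probs, avg_probs.expand_as(log_probs),
                    reduction="batchmean")
```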
arXiv Detail & Related papers (2020-06-07T06:57:43Z) - Robustness Verification for Classifier Ensembles [3.5884936187733394]
The robustness-checking problem consists of assessing, given a set of classifiers and a labelled data set, whether there exists a randomized attack.
We show the NP-hardness of the problem and provide an upper bound on the number of attacks that is sufficient to form an optimal randomized attack.
Our prototype implementation verifies multiple neural-network ensembles trained for image-classification tasks.
arXiv Detail & Related papers (2020-05-12T07:38:43Z) - Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)