Towards Certifiable Adversarial Sample Detection
- URL: http://arxiv.org/abs/2002.08740v1
- Date: Thu, 20 Feb 2020 14:10:00 GMT
- Title: Towards Certifiable Adversarial Sample Detection
- Authors: Ilia Shumailov, Yiren Zhao, Robert Mullins, Ross Anderson
- Abstract summary: Convolutional Neural Networks (CNNs) are deployed in more and more classification systems, but adversarial samples can be maliciously crafted to trick them and are becoming a real threat.
We provide a new approach in the form of a certifiable adversarial detection scheme, the Certifiable Taboo Trap (CTT).
We show that CTT has small false positive rates on clean test data, minimal compute overheads when deployed, and can support complex security policies.
- Score: 9.29621938988162
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Convolutional Neural Networks (CNNs) are deployed in more and more
classification systems, but adversarial samples can be maliciously crafted to
trick them, and are becoming a real threat. There have been various proposals
to improve CNNs' adversarial robustness but these all suffer performance
penalties or other limitations. In this paper, we provide a new approach in the
form of a certifiable adversarial detection scheme, the Certifiable Taboo Trap
(CTT). The system can provide certifiable guarantees of detection of
adversarial inputs up to certain $l_{\infty}$ perturbation sizes, under a
reasonable assumption, namely that the training data have the same distribution as the test data. We
develop and evaluate several versions of CTT with a range of defense
capabilities, training overheads and certifiability on adversarial samples.
Against adversaries with various $l_p$ norms, CTT outperforms existing defense
methods that focus purely on improving network robustness. We show that CTT has
small false positive rates on clean test data, minimal compute overheads when
deployed, and can support complex security policies.
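The abstract does not spell out the mechanism, but the general taboo-trap idea behind this line of work is to train a network so that certain hidden activations stay within known bounds on clean data, and to flag any input that drives an activation past its bound. A minimal sketch of that detection pattern, using synthetic activations in place of a real network (all names and the toy data are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def fit_thresholds(clean_activations, margin=0.0):
    """Record the maximum per-unit activation observed on clean data.

    clean_activations: array of shape (n_samples, n_units), taken from a
    hidden layer of an already-trained network (synthetic here).
    """
    return clean_activations.max(axis=0) + margin

def is_adversarial(activations, thresholds):
    """Flag an input whose hidden activations exceed any learned threshold."""
    return bool(np.any(activations > thresholds))

# Toy illustration with synthetic "activations" in place of a real layer.
rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(1000, 16))   # clean-data activations
thresholds = fit_thresholds(clean)

benign = np.full(16, 0.5)                        # well inside the clean range
suspicious = benign.copy()
suspicious[3] = 2.5                              # one unit driven out of range

print(is_adversarial(benign, thresholds))        # False
print(is_adversarial(suspicious, thresholds))    # True
```

The certifiable aspect of CTT goes beyond this sketch: it ties the thresholds to a guaranteed detection radius in $l_{\infty}$ norm, which the toy code above does not attempt.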
Related papers
- DAD++: Improved Data-free Test Time Adversarial Defense [12.606555446261668]
We propose a test time Data-free Adversarial Defense (DAD) containing detection and correction frameworks.
We conduct a wide range of experiments and ablations on several datasets and network architectures to show the efficacy of our proposed approach.
Our DAD++ gives an impressive performance against various adversarial attacks with a minimal drop in clean accuracy.
arXiv Detail & Related papers (2023-09-10T20:39:53Z)
- Uncovering Adversarial Risks of Test-Time Adaptation [41.19226800089764]
Test-time adaptation (TTA) has been proposed as a promising solution for addressing distribution shifts.
We uncover a novel security vulnerability of TTA based on the insight that predictions on benign samples can be impacted by malicious samples in the same batch.
We propose Distribution Invading Attack (DIA), which injects a small fraction of malicious data into the test batch.
arXiv Detail & Related papers (2023-01-29T22:58:05Z)
- DE-CROP: Data-efficient Certified Robustness for Pretrained Classifiers [21.741026088202126]
We propose a novel way to certify the robustness of pretrained models using only a few training samples.
Our proposed approach generates class-boundary and interpolated samples corresponding to each training sample.
We obtain significant improvements over the baseline on multiple benchmark datasets and also report similar performance under the challenging black box setup.
arXiv Detail & Related papers (2022-10-17T10:41:18Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve robustness, but it is costly at scale.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks [49.15885037760725]
We focus on certifying the robustness of offline reinforcement learning (RL) in the presence of poisoning attacks.
We propose the first certification framework, COPA, to certify the number of poisoning trajectories that can be tolerated.
We prove that some of the proposed certification methods are theoretically tight, while computing others is NP-complete.
arXiv Detail & Related papers (2022-03-16T05:02:47Z)
- Sparsity Winning Twice: Better Robust Generalization from More Efficient Training [94.92954973680914]
We introduce two alternatives for sparse adversarial training: (i) static sparsity and (ii) dynamic sparsity.
We find that both methods yield a win-win: substantially shrinking the robust generalization gap and alleviating robust overfitting.
Our approaches can be combined with existing regularizers, establishing new state-of-the-art results in adversarial training.
arXiv Detail & Related papers (2022-02-20T15:52:08Z)
- Adversarial Attacks and Defense for Non-Parametric Two-Sample Tests [73.32304304788838]
This paper systematically uncovers the failure mode of non-parametric TSTs through adversarial attacks.
To enable TST-agnostic attacks, we propose an ensemble attack framework that jointly minimizes the different types of test criteria.
To robustify TSTs, we propose a max-min optimization that iteratively generates adversarial pairs to train the deep kernels.
arXiv Detail & Related papers (2022-02-07T11:18:04Z)
- Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
- Improving the Certified Robustness of Neural Networks via Consistency Regularization [25.42238710803711]
A range of defense methods have been proposed to improve the robustness of neural networks on adversarial examples.
Most of these provable defense methods treat all examples equally during the training process.
In this paper, we explore the inconsistency caused by misclassified examples and add a novel consistency regularization term to make better use of them.
arXiv Detail & Related papers (2020-12-24T05:00:50Z)
- Certified Distributional Robustness on Smoothed Classifiers [27.006844966157317]
We propose the worst-case adversarial loss over input distributions as a robustness certificate.
By exploiting duality and the smoothness property, we provide an easy-to-compute upper bound as a surrogate for the certificate.
arXiv Detail & Related papers (2020-10-21T13:22:25Z)
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
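Several of the related papers above (on smoothed classifiers and label-flipping robustness) build on randomized smoothing, whose core prediction step is simple: classify many Gaussian-noised copies of the input and take a majority vote. A minimal sketch of that step, using a hypothetical toy classifier rather than any of the cited methods:

```python
import numpy as np

def smoothed_predict(classifier, x, sigma=0.25, n_samples=1000, rng=None):
    """Randomized-smoothing prediction: classify n_samples Gaussian-noised
    copies of x and return the majority-vote class label."""
    rng = rng or np.random.default_rng(0)
    noisy = x + sigma * rng.standard_normal((n_samples,) + x.shape)
    votes = np.array([classifier(z) for z in noisy])
    return int(np.bincount(votes).argmax())

# Hypothetical toy "classifier": class 1 iff the mean feature is positive.
toy_classifier = lambda z: int(z.mean() > 0.0)

x = np.full(8, 0.3)                          # clearly class 1
print(smoothed_predict(toy_classifier, x))   # 1
```

The cited papers go further and convert the vote margin into a certified robustness radius; this sketch shows only the smoothed prediction itself.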
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.