Et Tu Certifications: Robustness Certificates Yield Better Adversarial Examples
- URL: http://arxiv.org/abs/2302.04379v4
- Date: Wed, 12 Jun 2024 00:52:09 GMT
- Title: Et Tu Certifications: Robustness Certificates Yield Better Adversarial Examples
- Authors: Andrew C. Cullen, Shijie Liu, Paul Montague, Sarah M. Erfani, Benjamin I. P. Rubinstein
- Abstract summary: Our new *Certification Aware Attack* exploits certifications to produce computationally efficient norm-minimising adversarial examples.
While these attacks can be used to assess the tightness of certification bounds, they also highlight that releasing certifications can paradoxically reduce security.
- Score: 30.42301446202426
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In guaranteeing the absence of adversarial examples in an instance's neighbourhood, certification mechanisms play an important role in demonstrating neural net robustness. In this paper, we ask: can these certifications compromise the very models they help to protect? Our new \emph{Certification Aware Attack} exploits certifications to produce computationally efficient norm-minimising adversarial examples $74\%$ more often than comparable attacks, while reducing the median perturbation norm by more than $10\%$. While these attacks can be used to assess the tightness of certification bounds, they also highlight that releasing certifications can paradoxically reduce security.
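The mechanism can be illustrated with a short sketch. Because a sound certificate guarantees that no adversarial example exists inside its certified ball, an attacker running a norm-minimising search can skip every certified region instead of probing it. The Python sketch below shows this pruning idea under stated assumptions: `certified_predict` is a hypothetical interface returning a predicted label and a certified radius (e.g., from a randomized smoothing certifier), and the search is a simple line search along one direction, not the paper's actual algorithm.

```python
import numpy as np

def certification_aware_search(x, y_true, certified_predict, direction,
                               hi=10.0, tol=1e-3):
    """Find an approximately norm-minimal adversarial example along
    `direction`, using certified radii to skip regions that provably
    contain no adversarial example.

    certified_predict(z) -> (label, radius) is an assumed interface:
    the model's prediction at z plus a certified radius around z.
    """
    direction = direction / np.linalg.norm(direction)

    # Sanity check: the search needs a misclassified upper endpoint.
    label_hi, _ = certified_predict(x + hi * direction)
    if label_hi == y_true:
        raise ValueError("hi must reach a misclassified point")

    # The certificate at x lower-bounds *every* adversarial
    # perturbation, so the search can start at r(x) rather than 0.
    _, r0 = certified_predict(x)
    lo = r0

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        label, r_mid = certified_predict(x + mid * direction)
        if label != y_true:
            hi = mid  # adversarial: tighten the upper bound
        else:
            # Correctly classified *and* certified: every point within
            # r_mid of the probe is also non-adversarial, so the lower
            # bound can jump past the whole certified ball.
            lo = mid + r_mid
    return x + hi * direction
```

Each certificate queried during the search prunes an entire interval of perturbation norms, which is one way to read the abstract's claim that releasing certifications can reduce security: the defence's own guarantees tell the attacker where not to look.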
Related papers
- FullCert: Deterministic End-to-End Certification for Training and Inference of Neural Networks [62.897993591443594]
FullCert is the first end-to-end certifier with sound, deterministic bounds.
We experimentally demonstrate FullCert's feasibility on two datasets.
arXiv Detail & Related papers (2024-06-17T13:23:52Z)
- CrossCert: A Cross-Checking Detection Approach to Patch Robustness Certification for Deep Learning Models [6.129515045488372]
Patch robustness certification is an emerging defense technique against adversarial patch attacks that provides provable guarantees.
This paper proposes a novel certified defense technique called CrossCert.
arXiv Detail & Related papers (2024-05-13T11:54:03Z)
- Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
Backdoor attacks subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z)
- It's Simplex! Disaggregating Measures to Improve Certified Robustness [32.63920797751968]
This work presents two approaches to improve the analysis of certification mechanisms.
New certification approaches have the potential to more than double the achievable radius of certification.
Empirical evaluation verifies that our new approach can certify $9\%$ more samples at noise scale $\sigma = 1$.
arXiv Detail & Related papers (2023-09-20T02:16:19Z)
- Enhancing the Antidote: Improved Pointwise Certifications against Poisoning Attacks [30.42301446202426]
Poisoning attacks can disproportionately influence model behaviour by making small changes to the training corpus.
We make it possible to guarantee a sample's robustness against poisoning attacks that modify a finite number of training samples.
arXiv Detail & Related papers (2023-08-15T03:46:41Z)
- Double Bubble, Toil and Trouble: Enhancing Certified Robustness through Transitivity [27.04033198073254]
In response to subtle adversarial examples flipping classifications of neural network models, recent research has promoted certified robustness as a solution.
We show how today's "optimal" certificates can be improved by exploiting both the transitivity of certifications and the geometry of the input space (a simple instance is sketched after this list).
Our technique shows even more promising results, with a uniform $4$ percentage point increase in the achieved certified radius.
arXiv Detail & Related papers (2022-10-12T10:42:21Z)
- COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks [49.15885037760725]
We focus on certifying the robustness of offline reinforcement learning (RL) in the presence of poisoning attacks.
We propose the first certification framework, COPA, to certify the number of poisoning trajectories that can be tolerated.
We prove that some of the proposed certification methods are theoretically tight, while others correspond to NP-complete problems.
arXiv Detail & Related papers (2022-03-16T05:02:47Z)
- Fast Training of Provably Robust Neural Networks by SingleProp [71.19423596238568]
We develop a new regularizer that is more efficient than existing certified defenses.
We demonstrate faster training and certified accuracy comparable to state-of-the-art certified defenses.
arXiv Detail & Related papers (2021-02-01T22:12:51Z)
- Breaking certified defenses: Semantic adversarial examples with spoofed robustness certificates [57.52763961195292]
We present a new attack that exploits not only the labelling function of a classifier, but also the certificate generator.
The proposed method applies large perturbations that place images far from a class boundary while maintaining the imperceptibility property of adversarial examples.
arXiv Detail & Related papers (2020-03-19T17:59:44Z)
- (De)Randomized Smoothing for Certifiable Defense against Patch Attacks [136.79415677706612]
We introduce a certifiable defense against patch attacks that provides guarantees for a given image and patch attack size.
Our method is related to the broad class of randomized smoothing robustness schemes (a standard smoothing radius is recalled after this list).
Our results effectively establish a new state-of-the-art of certifiable defense against patch attacks on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2020-02-25T08:39:46Z)
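For context on the smoothing-based entries above, a standard $\ell_2$ certified radius for a smoothed classifier (following Cohen et al., 2019) is

$$R = \frac{\sigma}{2}\left(\Phi^{-1}(\underline{p_A}) - \Phi^{-1}(\overline{p_B})\right),$$

where $\sigma$ is the noise scale, $\Phi^{-1}$ is the inverse standard-normal CDF, and $\underline{p_A}$ and $\overline{p_B}$ bound the probabilities of the top two classes under Gaussian noise. The transitivity idea flagged in the Double Bubble entry can be illustrated by one simple instance (not necessarily the paper's exact construction): if $x'$ lies inside the certified ball of $x$ and carries its own certificate $r(x')$, then by the triangle inequality every point within $r(x') - \lVert x - x' \rVert$ of $x$ shares the same label, so the effective radius at $x$ improves to

$$r_{\mathrm{eff}}(x) = \max\bigl(r(x),\, r(x') - \lVert x - x' \rVert\bigr).$$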