On Using Certified Training towards Empirical Robustness
- URL: http://arxiv.org/abs/2410.01617v1
- Date: Wed, 2 Oct 2024 14:56:21 GMT
- Title: On Using Certified Training towards Empirical Robustness
- Authors: Alessandro De Palma, Serge Durand, Zakaria Chihani, François Terrier, Caterina Urban
- Abstract summary: We show that a certified training algorithm can prevent catastrophic overfitting on single-step attacks.
We also present a novel regularizer for network over-approximations that can achieve similar effects while markedly reducing runtime.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training is arguably the most popular way to provide empirical robustness against specific adversarial examples. While variants based on multi-step attacks incur significant computational overhead, single-step variants are vulnerable to a failure mode known as catastrophic overfitting, which hinders their practical utility for large perturbations. A parallel line of work, certified training, has focused on producing networks amenable to formal guarantees of robustness against any possible attack. However, the wide gap between the best-performing empirical and certified defenses has severely limited the applicability of the latter. Inspired by recent developments in certified training, which rely on a combination of adversarial attacks with network over-approximations, and by the connections between local linearity and catastrophic overfitting, we present experimental evidence on the practical utility and limitations of using certified training towards empirical robustness. We show that, when tuned for the purpose, a recent certified training algorithm can prevent catastrophic overfitting on single-step attacks, and that it can bridge the gap to multi-step baselines under appropriate experimental settings. Finally, we present a novel regularizer for network over-approximations that can achieve similar effects while markedly reducing runtime.
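To make the setting concrete, the following is a minimal PyTorch sketch of single-step (FGSM) adversarial training together with the multi-step (PGD) monitoring that exposes catastrophic overfitting; the architecture, synthetic batch, and eps = 8/255 are illustrative stand-ins, not the paper's experimental setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Single-step attack: one signed-gradient step of size eps (inputs in [0, 1])."""
    delta = torch.zeros_like(x, requires_grad=True)
    F.cross_entropy(model(x + delta), y).backward()
    return (x + eps * delta.grad.sign()).clamp(0, 1).detach()

def pgd(model, x, y, eps, alpha=2 / 255, steps=10):
    """Multi-step attack, used here only to monitor robustness during training."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        F.cross_entropy(model(x_adv), y).backward()
        with torch.no_grad():
            x_adv = x + (x_adv + alpha * x_adv.grad.sign() - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # toy stand-in network
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.rand(64, 3, 32, 32), torch.randint(0, 10, (64,))    # synthetic batch
eps = 8 / 255

x_adv = fgsm(model, x, y, eps)  # single-step adversarial training step
opt.zero_grad()                 # clear gradients left over from crafting the attack
F.cross_entropy(model(x_adv), y).backward()
opt.step()

# Catastrophic overfitting manifests as a sudden collapse of multi-step (PGD)
# accuracy while single-step (FGSM) accuracy stays high, so PGD accuracy is
# the quantity to track during training:
pgd_acc = (model(pgd(model, x, y, eps)).argmax(-1) == y).float().mean()
```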
Related papers
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly robust, instance-reweighted adversarial training framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
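The paper's exact objective is not reproduced here; the sketch below only illustrates the generic principle behind KL-regularized instance reweighting, for which the inner maximization over weights on the simplex has a softmax closed form. The temperature tau and the attack producing x_adv are assumptions.

```python
import torch

def kl_reweighted_loss(per_example_losses, tau=1.0):
    # Maximizing sum_i w_i * l_i - tau * KL(w || uniform) over the probability
    # simplex has the closed-form solution w_i ∝ exp(l_i / tau), i.e. a softmax:
    weights = torch.softmax(per_example_losses.detach() / tau, dim=0)
    return (weights * per_example_losses).sum()

# inside an adversarial training step, given adversarial examples x_adv:
#   losses = F.cross_entropy(model(x_adv), y, reduction="none")
#   loss = kl_reweighted_loss(losses, tau=0.5)
```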
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate the classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
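A minimal single-file sketch of the idea, assuming PyTorch DistributedDataParallel with a gloo backend, synthetic data, and a toy linear model (none of which reflect the paper's actual system): each worker crafts single-step adversarial examples for its own shard of the large batch, and gradients are all-reduced in the usual DDP backward pass.

```python
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.multiprocessing as mp
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def fgsm(model, x, y, eps):
    # single-step attack, crafted locally and independently on each worker
    delta = torch.zeros_like(x, requires_grad=True)
    F.cross_entropy(model(x + delta), y).backward()
    return (x + eps * delta.grad.sign()).detach()

def worker(rank, world_size):
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    model = DDP(nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(3):  # a few synthetic steps; real training would shard a dataset
        x = torch.rand(256, 1, 28, 28)          # this worker's shard of the large batch
        y = torch.randint(0, 10, (256,))
        x_adv = fgsm(model, x, y, eps=8 / 255)  # attack generation is embarrassingly parallel
        opt.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()  # DDP all-reduces gradients here
        opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)  # two CPU workers standing in for machines
```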
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
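A simplified, non-adaptive sketch of that style of certificate, in the spirit of randomized smoothing (the paper's actual analysis is more careful and handles perturbations across whole trajectories; the toy environment, policy, sigma, eps, and confidence level below are all illustrative assumptions):

```python
import numpy as np
from scipy.stats import beta, norm

def noisy_rollout(policy, env_step, x0, sigma, horizon, rng):
    """One rollout of the smoothed policy: Gaussian noise is added to each observation."""
    x, total = x0, 0.0
    for _ in range(horizon):
        a = policy(x + sigma * rng.standard_normal(x.shape))
        x, r = env_step(x, a)
        total += r
    return total

def certified_prob_above(rewards, tau, eps, sigma, alpha=0.001):
    """Clopper-Pearson lower bound on P[reward >= tau], shifted by the standard
    Gaussian-smoothing argument for an l2 observation perturbation of size eps."""
    n, k = len(rewards), sum(r >= tau for r in rewards)
    p_low = beta.ppf(alpha, k, n - k + 1) if k > 0 else 0.0
    return norm.cdf(norm.ppf(p_low) - eps / sigma) if p_low > 0 else 0.0

# toy usage: 1-D system, the policy pushes the state toward 0, reward = -|x|
rng = np.random.default_rng(0)
env_step = lambda x, a: (x + a, -abs(x).item())
policy = lambda obs: -0.5 * obs
rewards = [noisy_rollout(policy, env_step, np.array([1.0]), 0.5, 20, rng) for _ in range(200)]
print(certified_prob_above(rewards, tau=-10.0, eps=0.25, sigma=0.5))
```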
arXiv Detail & Related papers (2021-06-21T21:42:08Z)
- Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
- LiBRe: A Practical Bayesian Approach to Adversarial Detection [36.541671795530625]
LiBRe can endow a variety of pre-trained task-dependent DNNs with the ability to defend against heterogeneous adversarial attacks at a low cost.
We build a few-layer deep ensemble variational inference and adopt a pre-training & fine-tuning workflow to boost the effectiveness and efficiency of LiBRe.
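A generic uncertainty-thresholding sketch of this style of Bayesian detection (LiBRe's actual few-layer deep ensemble variational inference and workflow are more involved; the backbone, heads, and threshold below are toy assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def predictive_entropy(heads, features):
    """Average the softmax outputs of an ensemble of lightweight heads, then
    compute the entropy of the averaged predictive distribution."""
    probs = torch.stack([F.softmax(h(features), dim=-1) for h in heads]).mean(0)
    return -(probs * probs.clamp_min(1e-12).log()).sum(-1)

def detect_adversarial(backbone, heads, x, threshold):
    # inputs with predictive entropy above a validation-calibrated threshold
    # are flagged as likely adversarial
    return predictive_entropy(heads, backbone(x)) > threshold

# toy usage: frozen feature extractor plus three lightweight heads
backbone = nn.Sequential(nn.Flatten(), nn.Linear(784, 32))
heads = [nn.Linear(32, 10) for _ in range(3)]
flags = detect_adversarial(backbone, heads, torch.rand(8, 1, 28, 28), threshold=2.0)
```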
arXiv Detail & Related papers (2021-03-27T07:48:58Z)
- Fast Training of Provably Robust Neural Networks by SingleProp [71.19423596238568]
We develop a new regularizer that is more efficient than existing certified defenses.
We demonstrate improved training speed and certified accuracy comparable to state-of-the-art certified defenses.
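SingleProp's own single-propagation scheme is not reconstructed here; as a reference point, the sketch below shows the standard interval bound propagation (IBP) over-approximation and the worst-case-logit robust loss that certified-training regularizers in this family build on (the toy layers and eps are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ibp_bounds(layers, x, eps):
    """Propagate the interval [x - eps, x + eps] through Linear/ReLU layers."""
    lo, hi = x - eps, x + eps
    for layer in layers:
        if isinstance(layer, nn.Linear):
            mid, rad = (lo + hi) / 2, (hi - lo) / 2
            mid = layer(mid)                    # affine map of the interval center
            rad = rad @ layer.weight.abs().t()  # |W| maps the interval radius
            lo, hi = mid - rad, mid + rad
        elif isinstance(layer, nn.ReLU):
            lo, hi = lo.clamp_min(0), hi.clamp_min(0)
    return lo, hi

def certified_loss(layers, x, y, eps):
    lo, hi = ibp_bounds(layers, x, eps)
    # worst-case logits: every wrong class at its upper bound,
    # the true class at its lower bound
    worst = hi.scatter(1, y.unsqueeze(1), lo.gather(1, y.unsqueeze(1)))
    return F.cross_entropy(worst, y)

layers = [nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10)]  # toy network
x, y = torch.rand(8, 784), torch.randint(0, 10, (8,))
loss = certified_loss(layers, x, y, eps=0.01)
```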
arXiv Detail & Related papers (2021-02-01T22:12:51Z)
- Improving the Certified Robustness of Neural Networks via Consistency Regularization [25.42238710803711]
A range of defense methods have been proposed to improve the robustness of neural networks against adversarial examples.
Most of these provable defense methods treat all examples equally during the training process.
In this paper, we explore the inconsistency caused by misclassified examples and add a novel consistency regularization term to make better use of them.
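A hedged sketch of one way such a term could look, restricting a clean-versus-perturbed KL penalty to the currently misclassified examples; this is a generic reading of the idea, not the paper's exact regularizer, and the model and adversarial examples x_adv are assumed given.

```python
import torch
import torch.nn.functional as F

def misclassified_consistency_loss(model, x, x_adv, y):
    """KL term pulling predictions on perturbed inputs toward the clean ones,
    applied only to examples the model currently misclassifies."""
    p_clean = F.softmax(model(x), dim=-1)
    logp_adv = F.log_softmax(model(x_adv), dim=-1)
    kl = F.kl_div(logp_adv, p_clean.detach(), reduction="none").sum(-1)
    misclassified = (p_clean.argmax(-1) != y).float()
    return (misclassified * kl).mean()

# added to the main (provable) training objective with some weight lam:
#   loss = robust_loss + lam * misclassified_consistency_loss(model, x, x_adv, y)
```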
arXiv Detail & Related papers (2020-12-24T05:00:50Z)
- About contrastive unsupervised representation learning for classification and its convergence [2.6782615615913343]
We build a theoretical framework around contrastive learning in which guarantees for its performance can be proven.
We provide extensions of these results to training with multiple negative samples and for multiway classification.
We also provide convergence guarantees for the minimization of the contrastive training error with gradient descent on an overparametrized deep neural encoder.
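For reference, the multi-negative contrastive objective that such analyses typically target is the InfoNCE loss; a minimal sketch, with tensor shapes and the temperature chosen purely for illustration:

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Contrastive loss with k negatives per anchor.
    anchor, positive: (batch, dim); negatives: (batch, k, dim)."""
    a = F.normalize(anchor, dim=-1)
    pos = (a * F.normalize(positive, dim=-1)).sum(-1, keepdim=True)
    neg = torch.einsum("bd,bkd->bk", a, F.normalize(negatives, dim=-1))
    logits = torch.cat([pos, neg], dim=1) / temperature
    target = torch.zeros(len(a), dtype=torch.long)  # the positive sits at index 0
    return F.cross_entropy(logits, target)

loss = info_nce(torch.randn(16, 128), torch.randn(16, 128), torch.randn(16, 8, 128))
```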
arXiv Detail & Related papers (2020-12-02T10:08:57Z)
- Regularized Training and Tight Certification for Randomized Smoothed Classifier with Provable Robustness [15.38718018477333]
We derive a new regularized risk, in which the regularizer can adaptively encourage the accuracy and robustness of the smoothed counterpart.
We also design a new certification algorithm, which can leverage the regularization effect to provide a tighter robustness lower bound that holds with high probability.
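For context, the baseline that such regularized, tightened certificates improve on is the standard randomized-smoothing l2 certificate, which maps a Clopper-Pearson lower bound on the smoothed classifier's top-class probability to a radius; a small sketch (the counts, sigma, and alpha are illustrative):

```python
from scipy.stats import beta, norm

def certify_radius(count_top, n, sigma, alpha=0.001):
    """Standard l2 certificate for a Gaussian-smoothed classifier: a
    Clopper-Pearson lower bound on the top-class probability, mapped to a radius."""
    if count_top == 0:
        return 0.0
    p_low = beta.ppf(alpha, count_top, n - count_top + 1)
    return sigma * norm.ppf(p_low) if p_low > 0.5 else 0.0  # abstain below 1/2

# e.g. 990 of 1000 noisy samples voted for the top class at sigma = 0.25:
print(certify_radius(990, 1000, sigma=0.25))
```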
arXiv Detail & Related papers (2020-02-17T20:54:34Z)