LiBRe: A Practical Bayesian Approach to Adversarial Detection
- URL: http://arxiv.org/abs/2103.14835v1
- Date: Sat, 27 Mar 2021 07:48:58 GMT
- Title: LiBRe: A Practical Bayesian Approach to Adversarial Detection
- Authors: Zhijie Deng, Xiao Yang, Shizhen Xu, Hang Su, Jun Zhu
- Abstract summary: LiBRe can endow a variety of pre-trained task-dependent DNNs with the ability to defend against heterogeneous adversarial attacks at a low cost.
We build a few-layer deep ensemble variational and adopt a pre-training & fine-tuning workflow to boost the effectiveness and efficiency of LiBRe.
- Score: 36.541671795530625
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite their appealing flexibility, deep neural networks (DNNs) are vulnerable to adversarial examples. Various adversarial defense strategies have been proposed to resolve this problem, but they typically demonstrate limited practicality owing to insurmountable compromises on universality, effectiveness, or efficiency. In this work, we propose a more practical approach, Lightweight Bayesian Refinement (LiBRe), in the spirit of leveraging Bayesian neural networks (BNNs) for adversarial detection. Empowered by task- and attack-agnostic modeling under the Bayesian principle, LiBRe can endow a variety of pre-trained task-dependent DNNs with the ability to defend against heterogeneous adversarial attacks at a low cost. We develop and integrate advanced learning techniques to make LiBRe appropriate for adversarial detection. Concretely, we build a few-layer deep ensemble variational and adopt a pre-training & fine-tuning workflow to boost the effectiveness and efficiency of LiBRe. We further provide a novel insight into realising adversarial-detection-oriented uncertainty quantification without the inefficiency of crafting adversarial examples during training. Extensive empirical studies covering a wide range of scenarios verify the practicality of LiBRe. We also conduct thorough ablation studies to demonstrate the superiority of our modeling and learning strategies.
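To make the abstract's recipe concrete, the following PyTorch sketch illustrates the general idea of Bayesian-refinement-style detection: keep a pre-trained backbone frozen, refine only a lightweight ensemble over the final layer (standing in for the few-layer deep ensemble variational), and flag inputs whose predictive uncertainty exceeds a threshold calibrated on clean data. The class and function names, member count, entropy-based score, and threshold are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FewLayerEnsembleHead(nn.Module):
    """A small ensemble over the final layer of a pre-trained DNN, a stand-in
    for LiBRe's few-layer deep ensemble variational; the backbone stays frozen."""

    def __init__(self, in_features: int, num_classes: int, num_members: int = 5):
        super().__init__()
        # num_members is an illustrative choice, not the paper's setting.
        self.members = nn.ModuleList(
            [nn.Linear(in_features, num_classes) for _ in range(num_members)]
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Returns logits of shape (num_members, batch, num_classes).
        return torch.stack([member(features) for member in self.members], dim=0)


def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Entropy of the ensemble-averaged predictive distribution; adversarial
    inputs are expected to score higher than clean ones."""
    probs = F.softmax(logits, dim=-1)       # (members, batch, classes)
    mean_probs = probs.mean(dim=0)          # (batch, classes)
    return -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)


def detect_adversarial(backbone: nn.Module,
                       head: FewLayerEnsembleHead,
                       x: torch.Tensor,
                       threshold: float) -> torch.Tensor:
    """Flag inputs whose uncertainty exceeds a threshold calibrated on clean
    validation data (the calibration step is omitted here)."""
    with torch.no_grad():
        features = backbone(x)              # frozen pre-trained feature extractor
        logits = head(features)
    return predictive_entropy(logits) > threshold
```

In this sketch, the ensemble head would be fine-tuned on clean training data starting from the pre-trained weights, in the spirit of the pre-training & fine-tuning workflow, and the detection threshold chosen from the uncertainty distribution observed on clean inputs.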
Related papers
- Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data [38.44734564565478]
We provide a theoretical understanding of adversarial examples and adversarial training algorithms from the perspective of feature learning theory.
We show that the adversarial training method can provably strengthen the robust feature learning and suppress the non-robust feature learning.
arXiv Detail & Related papers (2024-10-11T03:59:49Z)
- On Using Certified Training towards Empirical Robustness [40.582830117229854]
We show that a certified training algorithm can prevent catastrophic overfitting on single-step attacks.
We also present a novel regularizer for network over-approximations that can achieve similar effects while markedly reducing runtime.
arXiv Detail & Related papers (2024-10-02T14:56:21Z)
- Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
backdoor attacks subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z)
- Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce the new concept of an adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) is able to boost intra-class feature similarity and increase inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
arXiv Detail & Related papers (2022-05-02T04:04:23Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer used in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
- Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z)
- Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K-nearest-neighbour, logistic regression, multi-layer perceptron, and convolutional neural network classifiers on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)