Causal Information Bottleneck Boosts Adversarial Robustness of Deep
Neural Network
- URL: http://arxiv.org/abs/2210.14229v1
- Date: Tue, 25 Oct 2022 12:49:36 GMT
- Title: Causal Information Bottleneck Boosts Adversarial Robustness of Deep
Neural Network
- Authors: Huan Hua, Jun Yan, Xi Fang, Weiquan Huang, Huilin Yin and Wancheng Ge
- Abstract summary: The information bottleneck (IB) method is a feasible defense solution against adversarial attacks in deep learning.
We incorporate the causal inference into the IB framework to alleviate such a problem.
Our method exhibits considerable robustness against multiple adversarial attacks.
- Score: 3.819052032134146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The information bottleneck (IB) method is a feasible defense against
adversarial attacks in deep learning. However, it suffers from spurious
correlations, which limit further improvement of its adversarial robustness. In
this paper, we incorporate causal inference into the IB framework to alleviate
this problem. Specifically, we divide the features obtained by the IB method
into robust features (content information) and non-robust features (style
information) via instrumental variables to estimate the causal effects. With
this framework, the influence of non-robust features can be mitigated to
strengthen adversarial robustness. We analyze the effectiveness of the proposed
method. Extensive experiments on MNIST, FashionMNIST, and CIFAR-10 show that
our method exhibits considerable robustness against multiple adversarial
attacks. Our code will be released.
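The authors' code is not shown here; as background, the variational IB objective commonly used in such defenses can be sketched as follows. This is a minimal numpy illustration, not the paper's implementation: the diagonal-Gaussian encoder, the `beta` trade-off weight, and all names are standard VIB assumptions.

```python
import numpy as np

def vib_loss(mu, logvar, logits, labels, beta=1e-3):
    """Variational IB-style loss: classification cross-entropy plus a
    beta-weighted KL(q(z|x) || N(0, I)) term for a diagonal-Gaussian encoder."""
    # KL divergence between N(mu, diag(exp(logvar))) and N(0, I), per sample
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)
    # numerically stable softmax cross-entropy on the classifier head
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels]
    return float(np.mean(ce + beta * kl))
```

In practice `mu`, `logvar`, and `logits` would come from a trained encoder and classifier head; the causal extension described in the abstract would additionally split the bottleneck representation into content and style components.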
Related papers
- Enhancing Adversarial Transferability via Information Bottleneck Constraints [18.363276470822427]
We propose a framework for performing black-box transferable adversarial attacks named IBTA.
To overcome the challenge that the mutual information term cannot be optimized directly, we propose a simple and efficient mutual information lower bound (MILB) to approximate its computation.
Our experiments on the ImageNet dataset demonstrate the efficiency and scalability of IBTA and the derived MILB.
arXiv Detail & Related papers (2024-06-08T17:25:31Z) - READ: Improving Relation Extraction from an ADversarial Perspective [33.44949503459933]
We propose an adversarial training method specifically designed for relation extraction (RE).
Our approach introduces both sequence- and token-level perturbations to the sample and uses a separate perturbation vocabulary to improve the search for entity and context perturbations.
arXiv Detail & Related papers (2024-04-02T16:42:44Z) - Class Incremental Learning for Adversarial Robustness [17.06592851567578]
Adversarial training integrates adversarial examples during model training to enhance robustness.
We observe that combining incremental learning with naive adversarial training easily leads to a loss of robustness.
We propose the Flatness Preserving Distillation (FPD) loss that leverages the output difference between adversarial and clean examples.
arXiv Detail & Related papers (2023-12-06T04:38:02Z) - On the Onset of Robust Overfitting in Adversarial Training [66.27055915739331]
Adversarial Training (AT) is a widely-used algorithm for building robust neural networks.
AT suffers from the issue of robust overfitting, the fundamental mechanism of which remains unclear.
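Adversarial training, which several of these entries build on, augments training with inputs perturbed toward higher loss. A minimal FGSM-style perturbation for a binary logistic model is sketched below; the model, function names, and `eps` value are illustrative assumptions, not taken from any listed paper.

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps=0.1):
    """One FGSM step for a binary logistic model p = sigmoid(x.w + b):
    move each input by eps in the sign of the loss gradient w.r.t. x."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    # d(cross-entropy)/dx = (p - y) * w  (per sample)
    grad = (p - y)[:, None] * w[None, :]
    return x + eps * np.sign(grad)
```

Adversarial training would then compute the loss on these perturbed inputs (or a mix of clean and perturbed ones) at every training step; robust overfitting refers to robust test accuracy degrading while this inner-loop training loss keeps falling.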
arXiv Detail & Related papers (2023-10-01T07:57:03Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
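KL-regularized instance reweighting of the kind mentioned above has a well-known closed form in the simplest setting: if the weights maximize the weighted loss minus a KL penalty toward the uniform distribution, the optimum is a softmax over per-instance losses. The sketch below illustrates only that simplified setting, not the paper's doubly-robust formulation; `tau` and the function name are assumptions.

```python
import numpy as np

def kl_reweight(losses, tau=1.0):
    """Closed-form solution of max_w sum_i w_i * l_i - tau * KL(w || uniform)
    over the probability simplex: w_i proportional to exp(l_i / tau)."""
    z = losses / tau
    z = z - z.max()          # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()
```

Smaller `tau` concentrates weight on the hardest (highest-loss) instances; as `tau` grows, the weights approach uniform.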
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - Bayesian Learning with Information Gain Provably Bounds Risk for a
Robust Adversarial Defense [27.545466364906773]
We present a new algorithm to learn a deep neural network model robust against adversarial attacks.
Our model demonstrates significantly improved robustness, up to 20%, compared with adversarial training and Adv-BNN under PGD attacks.
arXiv Detail & Related papers (2022-12-05T03:26:08Z) - FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated
Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients).
We propose a trigger reverse-engineering based defense and show that our method achieves robustness improvements with guarantees.
Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Adversarial Visual Robustness by Causal Intervention [56.766342028800445]
Adversarial training is the de facto most promising defense against adversarial examples.
Yet, its passive nature inevitably prevents it from being immune to unknown attackers.
We provide a causal viewpoint of adversarial vulnerability: the cause is the confounder ubiquitously existing in learning.
arXiv Detail & Related papers (2021-06-17T14:23:54Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA), which is trained to automatically align features across arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.