Causal Information Bottleneck Boosts Adversarial Robustness of Deep Neural Network
- URL: http://arxiv.org/abs/2210.14229v1
- Date: Tue, 25 Oct 2022 12:49:36 GMT
- Title: Causal Information Bottleneck Boosts Adversarial Robustness of Deep Neural Network
- Authors: Huan Hua, Jun Yan, Xi Fang, Weiquan Huang, Huilin Yin and Wancheng Ge
- Abstract summary: The information bottleneck (IB) method is a feasible defense solution against adversarial attacks in deep learning.
We incorporate causal inference into the IB framework to alleviate its spurious correlation problem.
Our method exhibits considerable robustness against multiple adversarial attacks.
- Score: 3.819052032134146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The information bottleneck (IB) method is a feasible defense solution against
adversarial attacks in deep learning. However, this method suffers from spurious
correlations, which limit further improvement of its adversarial robustness. In this
paper, we incorporate causal inference into the IB framework to alleviate this problem.
Specifically, we divide the features obtained by the IB method into robust features
(content information) and non-robust features (style information) via instrumental
variables to estimate the causal effects. Within this framework, the influence of the
non-robust features can be mitigated to strengthen adversarial robustness. We analyze
the effectiveness of the proposed method. Extensive experiments on MNIST, FashionMNIST,
and CIFAR-10 show that our method exhibits considerable robustness against multiple
adversarial attacks. Our code will be released.
Related papers
- Enhancing Adversarial Transferability via Information Bottleneck Constraints [18.363276470822427]
We propose IBTA, a framework for performing black-box transferable adversarial attacks.
To overcome the challenge of unoptimizable mutual information, we propose a simple and efficient mutual information lower bound (MILB) for approximate computation.
Our experiments on the ImageNet dataset demonstrate the efficiency and scalability of IBTA and the derived MILB.
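The paper's MILB is not reproduced here; as a hedged stand-in, the snippet below shows a standard InfoNCE-style lower bound on the mutual information between two paired feature batches, which illustrates the kind of tractable bound such methods optimize. Function and parameter names are hypothetical.

import math
import torch
import torch.nn.functional as F

def infonce_mi_lower_bound(feat_a, feat_b, temperature=0.1):
    # A standard InfoNCE-style lower bound on I(A; B) for a batch of paired
    # features (row i of feat_a matches row i of feat_b; the other rows act
    # as negatives). Illustrative only, not the MILB defined in the paper.
    a = F.normalize(feat_a, dim=1)
    b = F.normalize(feat_b, dim=1)
    logits = a @ b.t() / temperature                 # (N, N) similarities
    labels = torch.arange(a.size(0), device=a.device)
    # I(A; B) >= log N - CE(positives vs. in-batch negatives)
    return math.log(a.size(0)) - F.cross_entropy(logits, labels)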
arXiv Detail & Related papers (2024-06-08T17:25:31Z)
- Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective [65.10019978876863]
Diffusion-Based Purification (DBP) has emerged as an effective defense mechanism against adversarial attacks.
In this paper, we argue that the inherent stochasticity in the DBP process is the primary driver of its robustness.
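For orientation, a bare-bones sketch of the purification step that DBP methods build on is shown below: add forward-diffusion noise at a chosen timestep, then denoise back to a clean estimate. The denoiser interface and the noise level are placeholder assumptions, not the setup analyzed in the paper.

import torch

def diffusion_purify(x, denoiser, alpha_bar_t=0.7):
    # Minimal diffusion-based purification sketch: inject forward-diffusion
    # noise at level alpha_bar_t, then map back with a pretrained denoiser.
    # `denoiser(x_t, alpha_bar_t)` returning an estimate of the clean input
    # is an assumed placeholder interface.
    eps = torch.randn_like(x)
    x_t = (alpha_bar_t ** 0.5) * x + ((1.0 - alpha_bar_t) ** 0.5) * eps
    # this random noise injection is the stochasticity argued to drive
    # the robustness of the overall defense
    return denoiser(x_t, alpha_bar_t)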
arXiv Detail & Related papers (2024-04-22T16:10:38Z)
- READ: Improving Relation Extraction from an ADversarial Perspective [33.44949503459933]
We propose an adversarial training method specifically designed for relation extraction (RE).
Our approach introduces both sequence- and token-level perturbations to the sample and uses a separate perturbation vocabulary to improve the search for entity and context perturbations.
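As a rough illustration of token-level perturbation in embedding space (a generic FGSM-style step, not READ's perturbation-vocabulary search), the sketch below assumes a model that maps token embeddings to class logits; all names are hypothetical.

import torch
import torch.nn.functional as F

def token_level_fgsm(model, token_embeds, labels, epsilon=1e-2):
    # Generic token-level adversarial perturbation in embedding space.
    # Assumes `model` maps token embeddings of shape (B, T, D) to logits.
    token_embeds = token_embeds.detach().requires_grad_(True)
    loss = F.cross_entropy(model(token_embeds), labels)
    grad, = torch.autograd.grad(loss, token_embeds)
    # push every token embedding a small step in the loss-increasing direction
    return (token_embeds + epsilon * grad.sign()).detach()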
arXiv Detail & Related papers (2024-04-02T16:42:44Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
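The exact doubly-robust estimator is not reproduced here; the snippet below only shows the closed form that a KL-regularized inner maximization over instance weights yields (a softmax over per-example losses), which is a common building block of such reweighting schemes. The temperature is an assumed hyperparameter.

import torch

def kl_regularized_weights(per_example_loss, temperature=1.0):
    # Maximizing sum_i w_i * l_i - T * KL(w || uniform) over the probability
    # simplex gives w_i proportional to exp(l_i / T), i.e. a softmax over the
    # per-example losses. Harder examples receive larger weights; the weights
    # sum to one. Generic sketch, not the paper's estimator.
    return torch.softmax(per_example_loss.detach() / temperature, dim=0)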
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning [33.18197518590706]
Adversarial examples derived from deliberately crafted perturbations on visual inputs can easily harm the decision process of deep neural networks.
We introduce a causal approach called Adversarial Double Machine Learning (ADML), which allows us to quantify the degree of adversarial vulnerability of network predictions.
ADML can directly estimate the causal parameter of adversarial perturbations per se and mitigate negative effects that can potentially damage robustness.
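ADML itself is not reproduced here; as a hedged reference point, the sketch below shows the generic partialling-out double machine learning estimator (cross-fitted nuisance predictions, then residual-on-residual regression) from which the "causal parameter" terminology comes. Learners, inputs, and variable names are illustrative assumptions.

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

def dml_partialling_out(X, treatment, outcome):
    # Generic double machine learning sketch: residualize treatment and
    # outcome on covariates X with cross-fitted (out-of-fold) predictions,
    # then regress residual on residual to estimate the causal parameter.
    # X is (n, d); treatment and outcome are 1-D numpy arrays.
    m_hat = cross_val_predict(RandomForestRegressor(), X, outcome, cv=2)
    e_hat = cross_val_predict(RandomForestRegressor(), X, treatment, cv=2)
    res_y = outcome - m_hat
    res_t = treatment - e_hat
    return (res_t * res_y).sum() / (res_t * res_t).sum()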
arXiv Detail & Related papers (2023-07-14T09:51:26Z)
- Bayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense [27.545466364906773]
We present a new algorithm to learn a deep neural network model robust against adversarial attacks.
Our model demonstrates significantly improved robustness (up to 20%) compared with adversarial training and Adv-BNN under PGD attacks.
arXiv Detail & Related papers (2022-12-05T03:26:08Z)
- FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients).
We propose a trigger reverse engineering based defense and show that our method can achieve robustness improvements with provable guarantees.
Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
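FLIP's own procedure is not reproduced here; the sketch below shows the Neural-Cleanse-style trigger reverse engineering that such defenses commonly build on: optimize a small mask and pattern that steer a batch of clean images toward a target class. All hyperparameters are placeholders.

import torch
import torch.nn.functional as F

def reverse_engineer_trigger(model, images, target_class,
                             steps=200, lam=1e-2, lr=0.1):
    # Find a small mask/pattern that flips predictions to `target_class`;
    # the mask-norm penalty keeps the recovered trigger compact.
    mask = torch.zeros_like(images[:1, :1]).requires_grad_(True)    # (1, 1, H, W)
    pattern = torch.zeros_like(images[:1]).requires_grad_(True)     # (1, C, H, W)
    opt = torch.optim.Adam([mask, pattern], lr=lr)
    target = torch.full((images.size(0),), target_class,
                        dtype=torch.long, device=images.device)
    for _ in range(steps):
        m = torch.sigmoid(mask)
        stamped = (1 - m) * images + m * torch.sigmoid(pattern)
        loss = F.cross_entropy(model(stamped), target) + lam * m.sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask).detach(), torch.sigmoid(pattern).detach()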
arXiv Detail & Related papers (2022-10-23T22:24:03Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms by applying adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
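The tagging-specific details are not reproduced here; the sketch below shows the standard PGD adversarial training step that such strategies are built on. The perturbation budget, step size, and iteration count are placeholder values.

import torch
import torch.nn.functional as F

def pgd_adversarial_training_step(model, optimizer, x, y,
                                  eps=0.03, alpha=0.007, steps=7):
    # Craft a PGD adversarial example within an L-infinity ball of radius eps,
    # then update the model on it (the core of adversarial training).
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back to the ball
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv.detach()), y).backward()
    optimizer.step()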
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- Adversarial Visual Robustness by Causal Intervention [56.766342028800445]
Adversarial training is the de facto most promising defense against adversarial examples.
Yet, its passive nature inevitably prevents it from being immune to unknown attackers.
We provide a causal viewpoint of adversarial vulnerability: the cause is a confounder that ubiquitously exists in learning.
arXiv Detail & Related papers (2021-06-17T14:23:54Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features across arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Contextual Fusion For Adversarial Robustness [0.0]
Deep neural networks are usually designed to process one particular information stream and are susceptible to various types of adversarial perturbations.
We developed a fusion model using a combination of background and foreground features extracted in parallel from Places-CNN and Imagenet-CNN.
For gradient based attacks, our results show that fusion allows for significant improvements in classification without decreasing performance on unperturbed data.
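A bare-bones late-fusion sketch is given below: features from two frozen backbones (for example a scene-centric and an object-centric CNN) are concatenated and classified jointly. Backbone choices, feature dimensions, and class count are illustrative assumptions rather than the paper's exact configuration.

import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    # Late fusion of two feature streams: each backbone is assumed to map an
    # image batch to a (B, feat_dim) feature tensor; only the fusion head is
    # trained here.
    def __init__(self, backbone_bg, backbone_fg, feat_dim=512, num_classes=10):
        super().__init__()
        self.backbone_bg = backbone_bg   # e.g. a scene/background model
        self.backbone_fg = backbone_fg   # e.g. an object/foreground model
        self.head = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, x):
        with torch.no_grad():            # keep the feature extractors frozen
            feats = torch.cat([self.backbone_bg(x), self.backbone_fg(x)], dim=1)
        return self.head(feats)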
arXiv Detail & Related papers (2020-11-18T20:13:23Z)