Adversarial Visual Robustness by Causal Intervention
- URL: http://arxiv.org/abs/2106.09534v1
- Date: Thu, 17 Jun 2021 14:23:54 GMT
- Title: Adversarial Visual Robustness by Causal Intervention
- Authors: Kaihua Tang, Mingyuan Tao, Hanwang Zhang
- Abstract summary: Adversarial training is the de facto most promising defense against adversarial examples.
Yet, its passive nature inevitably prevents it from being immune to unknown attackers.
We provide a causal viewpoint of adversarial vulnerability: the cause is the confounder that ubiquitously exists in learning.
- Score: 56.766342028800445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training is the de facto most promising defense against
adversarial examples. Yet, its passive nature inevitably prevents it from being
immune to unknown attackers. To achieve a proactive defense, we need a more
fundamental understanding of adversarial examples, beyond the popular bounded
threat model. In this paper, we provide a causal viewpoint of adversarial
vulnerability: the cause is the confounder that ubiquitously exists in learning,
and attackers are precisely exploiting this confounding effect. A fundamental
solution for adversarial robustness is therefore causal intervention. As the
confounder is generally unobserved, we propose to use an instrumental variable,
which achieves intervention without requiring the confounder to be observed. We
term our robust training method Causal intervention by instrumental Variable
(CiiV). It consists of a differentiable retinotopic sampling layer and a
consistency loss, is stable, and is guaranteed not to suffer from gradient
obfuscation. Extensive experiments with a wide spectrum of attackers and
settings on the MNIST, CIFAR-10, and mini-ImageNet datasets empirically
demonstrate that CiiV is robust to adaptive attacks.
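To make the recipe above more concrete, here is a minimal PyTorch sketch of a stochastic, differentiable sampling layer paired with a consistency loss over several sampled views. The layer is only a simplified stand-in for the paper's retinotopic sampling, and the loss shape, helper names, and weights are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RandomSamplingLayer(nn.Module):
    """Stochastic, differentiable sampling layer: a simplified stand-in for
    the paper's retinotopic sampling. Each forward pass keeps a random subset
    of spatial locations and rescales to preserve the expected value."""
    def __init__(self, keep_prob: float = 0.5):
        super().__init__()
        self.keep_prob = keep_prob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Bernoulli mask over spatial positions, shared across channels.
        mask = (torch.rand(x.size(0), 1, x.size(2), x.size(3),
                           device=x.device) < self.keep_prob).float()
        return x * mask / self.keep_prob

def ciiv_style_loss(model, sampler, x, y, num_views=3, consistency_weight=1.0):
    """Cross-entropy averaged over several sampled views plus a consistency
    term pulling each view's prediction toward the mean prediction.
    The exact CiiV objective may differ; this only shows the general shape."""
    logits = [model(sampler(x)) for _ in range(num_views)]
    probs = [F.softmax(l, dim=1) for l in logits]
    mean_prob = torch.stack(probs).mean(dim=0)
    ce = sum(F.cross_entropy(l, y) for l in logits) / num_views
    consistency = sum(F.mse_loss(p, mean_prob) for p in probs) / num_views
    return ce + consistency_weight * consistency
```

In a training loop this loss would simply replace plain cross-entropy, e.g. `loss = ciiv_style_loss(net, RandomSamplingLayer(), images, labels)`.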
Related papers
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to imperceptible adversarial perturbations in high-level image classification and attack suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks.
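The selection rule stated here is easy to sketch; the snippet below assumes per-sample confidence scores (e.g. a clean surrogate model's maximum softmax outputs) are already available as a NumPy array, and the function name and budget are illustrative rather than the paper's exact pipeline.

```python
import numpy as np

def select_low_confidence_indices(confidences: np.ndarray, poison_budget: int) -> np.ndarray:
    """Return indices of the samples a surrogate model is least confident about,
    to be used as poisoning candidates. Trigger injection and the surrogate
    model itself are outside this sketch."""
    order = np.argsort(confidences)      # ascending: least confident first
    return order[:poison_budget]

# Example usage (confidences would come from a clean model's softmax outputs):
# poison_idx = select_low_confidence_indices(confidences, poison_budget=100)
```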
arXiv Detail & Related papers (2023-10-08T18:57:36Z)
- Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning [33.18197518590706]
Adversarial examples derived from deliberately crafted perturbations on visual inputs can easily harm the decision process of deep neural networks.
We introduce a causal approach called Adversarial Double Machine Learning (ADML), which allows us to quantify the degree of adversarial vulnerability of network predictions.
ADML can directly estimate the causal parameter of the adversarial perturbations themselves and mitigate negative effects that can potentially damage robustness.
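Since the summary hinges on double machine learning, a generic cross-fitted partialling-out estimator is sketched below; it illustrates plain DML with placeholder learners (scikit-learn random forests) and variable names, not ADML's adversarial formulation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def partialling_out_dml(y, t, x, n_splits=2, seed=0):
    """Generic double machine learning with cross-fitting: residualize the
    outcome y and the scalar treatment t on covariates x using flexible
    learners, then regress residual on residual to estimate the causal
    parameter. Illustrative only; not the paper's adversarial formulation."""
    rng = np.random.default_rng(seed)
    fold = rng.integers(0, n_splits, size=len(y))
    y_res = np.empty(len(y))
    t_res = np.empty(len(t))
    for k in range(n_splits):
        train, test = fold != k, fold == k
        # Nuisance models are fit on held-out folds to avoid overfitting bias.
        y_res[test] = y[test] - RandomForestRegressor().fit(x[train], y[train]).predict(x[test])
        t_res[test] = t[test] - RandomForestRegressor().fit(x[train], t[train]).predict(x[test])
    return float(t_res @ y_res / (t_res @ t_res))
```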
arXiv Detail & Related papers (2023-07-14T09:51:26Z)
- IDEA: Invariant Defense for Graph Adversarial Robustness [60.0126873387533]
We propose an Invariant causal DEfense method against adversarial Attacks (IDEA).
We derive node-based and structure-based invariance objectives from an information-theoretic perspective.
Experiments demonstrate that IDEA attains state-of-the-art defense performance under all five attacks on all five datasets.
arXiv Detail & Related papers (2023-05-25T07:16:00Z)
- Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression [32.727673706238086]
We propose a way of delving into the unexpected vulnerability of adversarially trained networks from a causal perspective.
By deploying adversarial instrumental variable regression, we estimate the causal relation of adversarial prediction under an unbiased environment.
We demonstrate that the estimated causal features are highly related to the correct prediction for adversarial robustness.
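For readers unfamiliar with the instrumental-variable machinery these causal defenses build on, here is a textbook two-stage least-squares sketch in NumPy; it shows only the generic estimator with placeholder arrays, not the adversarial instrumental variable regression proposed in the paper.

```python
import numpy as np

def two_stage_least_squares(y: np.ndarray, x: np.ndarray, z: np.ndarray) -> np.ndarray:
    """Generic 2SLS: y is the outcome (n,), x the confounded regressors (n, p),
    z the instruments (n, q). Stage 1 projects x onto z; stage 2 regresses y
    on the fitted values, removing the confounding that biases plain OLS."""
    beta_stage1, *_ = np.linalg.lstsq(z, x, rcond=None)   # shape (q, p)
    x_hat = z @ beta_stage1                               # instrument-predicted x
    beta_stage2, *_ = np.linalg.lstsq(x_hat, y, rcond=None)
    return beta_stage2
```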
arXiv Detail & Related papers (2023-03-02T08:18:22Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness assumes a framework that defends against samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves defense against both invariance and sensitivity attacks.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Formulating Robustness Against Unforeseen Attacks [34.302333899025044]
This paper focuses on the scenario where there is a mismatch between the threat model assumed by the defense during training and the one used by the attacker at test time.
We ask the question: if the learner trains against a specific "source" threat model, when can we expect robustness to generalize to a stronger unknown "target" threat model during test-time?
We propose adversarial training with variation regularization (AT-VR) which reduces variation of the feature extractor across the source threat model during training.
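One plausible reading of "variation regularization" is a penalty on how much features move across perturbations from the source threat model; the PyTorch sketch below encodes that reading with an L2 penalty. The function names and penalty choice are assumptions on our part and may differ from the exact AT-VR objective.

```python
import torch
import torch.nn.functional as F

def variation_regularized_loss(feature_extractor, classifier, x_adv_a, x_adv_b, y, lam=1.0):
    """Cross-entropy on one adversarial view of the batch plus a penalty on how
    much the extracted features move between two source-threat-model
    perturbations of the same inputs (illustrative, not the exact AT-VR loss)."""
    feats_a = feature_extractor(x_adv_a)
    feats_b = feature_extractor(x_adv_b)
    ce = F.cross_entropy(classifier(feats_a), y)
    variation = (feats_a - feats_b).flatten(1).norm(dim=1).mean()
    return ce + lam * variation
```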
arXiv Detail & Related papers (2022-04-28T21:03:36Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features across arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Towards Robustness against Unsuspicious Adversarial Examples [33.63338857434094]
We propose an approach for modeling suspiciousness by leveraging cognitive salience.
We compute the resulting non-salience-preserving dual-perturbation attacks on classifiers.
We show that adversarial training with dual-perturbation attacks yields classifiers that are more robust to these attacks than state-of-the-art robust learning approaches.
arXiv Detail & Related papers (2020-05-08T20:06:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.