Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning
- URL: http://arxiv.org/abs/2307.07250v2
- Date: Tue, 18 Jul 2023 07:31:34 GMT
- Title: Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning
- Authors: Byung-Kwan Lee, Junho Kim, Yong Man Ro
- Abstract summary: Adversarial examples derived from deliberately crafted perturbations on visual inputs can easily harm the decision process of deep neural networks.
We introduce a causal approach called Adversarial Double Machine Learning (ADML), which allows us to quantify the degree of adversarial vulnerability of network predictions.
ADML directly estimates the causal parameter of adversarial perturbations and mitigates the negative effects that can potentially damage robustness.
- Score: 33.18197518590706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial examples derived from deliberately crafted perturbations on
visual inputs can easily harm the decision process of deep neural networks. To
prevent such threats, various adversarial training-based defense methods have
grown rapidly and become a de facto standard approach for robustness. Despite
recent competitive achievements, we observe that adversarial vulnerability
varies across targets and that certain vulnerabilities remain prevalent.
Intriguingly, this peculiar phenomenon cannot be relieved even with deeper
architectures and advanced defense methods. To address this issue, we
introduce a causal approach called Adversarial Double Machine Learning (ADML),
which allows us to quantify the degree of adversarial vulnerability of network
predictions and capture the effect of treatments on the outcome of interest.
ADML directly estimates the causal parameter of adversarial perturbations and
mitigates the negative effects that can potentially damage robustness,
bringing a causal perspective to adversarial vulnerability. Through extensive
experiments on various CNN and Transformer architectures, we corroborate that
ADML improves adversarial robustness by large margins and relieves the
empirical observation.
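As a rough illustration of the double machine learning machinery that ADML builds on, the sketch below performs cross-fitted, Neyman-orthogonal estimation of a scalar causal parameter theta in a partially linear model Y = theta*T + g(X) + eps. Reading the perturbation strength as the "treatment" T and the induced shift in the network's loss as the "outcome" Y is a simplifying assumption made here for illustration; it is not the authors' exact formulation, and all names below are placeholders.

```python
# Minimal cross-fitted double machine learning (DML) sketch.
# Partially linear model:  Y = theta * T + g(X) + eps,   T = m(X) + v.
# Illustrative reading (assumption, not the paper's setup): X = clean-input
# features, T = scalar perturbation strength, Y = shift in the network's loss.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_theta(X, T, Y, n_folds=2, seed=0):
    """Estimate theta via cross-fitting and residual-on-residual regression."""
    U = np.zeros_like(Y, dtype=float)  # outcome residuals  Y - g_hat(X)
    V = np.zeros_like(T, dtype=float)  # treatment residuals T - m_hat(X)
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        g_hat = RandomForestRegressor(random_state=seed).fit(X[train], Y[train])
        m_hat = RandomForestRegressor(random_state=seed).fit(X[train], T[train])
        U[test] = Y[test] - g_hat.predict(X[test])
        V[test] = T[test] - m_hat.predict(X[test])
    # Orthogonal (Neyman) estimator: slope of outcome residuals on treatment residuals.
    return float(np.sum(V * U) / np.sum(V * V))

# Toy check: recover the true effect (2.0) despite the confounding nuisance g(X).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
T = X[:, 0] + rng.normal(size=2000)            # treatment depends on X
Y = 2.0 * T + np.sin(X[:, 0]) + rng.normal(size=2000)
print(dml_theta(X, T, Y))                      # approximately 2.0
```

Cross-fitting the nuisance models g_hat and m_hat on held-out folds, together with the orthogonal moment, is what keeps the estimate of theta robust to regularization bias in the nuisance learners; ADML applies this kind of estimation to quantify how adversarial perturbations causally affect predictions.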
Related papers
- Detecting Adversarial Attacks in Semantic Segmentation via Uncertainty Estimation: A Deep Analysis [12.133306321357999]
We propose an uncertainty-based method for detecting adversarial attacks on neural networks for semantic segmentation.
We conduct a detailed analysis of uncertainty-based detection of adversarial attacks across various state-of-the-art neural networks.
Our numerical experiments show the effectiveness of the proposed uncertainty-based detection method.
arXiv Detail & Related papers (2024-08-19T14:13:30Z)
- Investigating and unmasking feature-level vulnerabilities of CNNs to adversarial perturbations [3.4530027457862]
This study explores the impact of adversarial perturbations on Convolutional Neural Networks (CNNs).
We introduce the Adversarial Intervention framework to study the vulnerability of a CNN to adversarial perturbations.
arXiv Detail & Related papers (2024-05-31T08:14:44Z)
- Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement [68.31147013783387]
We observe that the attention mechanism is vulnerable to patch-based adversarial attacks.
In this paper, we propose a Robust Attention Mechanism (RAM) to improve the robustness of the semantic segmentation model.
arXiv Detail & Related papers (2024-01-03T13:58:35Z)
- A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the transferability of adversarial examples.
arXiv Detail & Related papers (2023-10-26T17:45:26Z)
- Investigating Human-Identifiable Features Hidden in Adversarial Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z)
- Consistent Valid Physically-Realizable Adversarial Attack against Crowd-flow Prediction Models [4.286570387250455]
Deep learning (DL) models can effectively learn city-wide crowd-flow patterns.
However, DL models are known to perform poorly under inconspicuous adversarial perturbations.
arXiv Detail & Related papers (2023-03-05T13:30:25Z)
- Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network [33.18197518590706]
Adversarial examples undermine reliability and raise potential security issues in deep neural networks.
We propose a novel adversarial pruning method, Masking Adversarial Damage (MAD), that employs second-order information of the adversarial loss.
We show that MAD effectively prunes adversarially trained networks without losing adversarial robustness and outperforms previous adversarial pruning methods.
arXiv Detail & Related papers (2022-04-06T11:28:06Z)
- Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the widespread adoption of deep learning is its fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results on image classification demonstrate the effectiveness of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
- Adversarial Visual Robustness by Causal Intervention [56.766342028800445]
Adversarial training is the de facto most promising defense against adversarial examples.
Yet, its passive nature inevitably prevents it from being immune to unknown attackers.
We provide a causal viewpoint of adversarial vulnerability: the cause is the confounder ubiquitously existing in learning.
arXiv Detail & Related papers (2021-06-17T14:23:54Z)
- Recent Advances in Understanding Adversarial Robustness of Deep Neural Networks [15.217367754000913]
It is increasingly important to obtain models with high robustness that are resistant to adversarial examples.
We give preliminary definitions of adversarial attacks and robustness.
We study frequently-used benchmarks and mention theoretically-proved bounds for adversarial robustness.
arXiv Detail & Related papers (2020-11-03T07:42:53Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space (a generic sketch of input-space adversarial training follows this list).
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
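Several of the papers above, like the de facto standard defenses mentioned in the abstract, rest on adversarial training against input-space perturbations. Below is a minimal, generic sketch of PGD-based input-space adversarial training; it is illustrative only, not the self-supervised mechanism of the last paper above nor ADML itself, and `model`, `loader`, and `optimizer` are placeholder names.

```python
# Generic L-inf PGD adversarial training sketch (illustrative assumptions only).
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft L-inf bounded adversarial examples in the input space via PGD."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()           # ascend the loss
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # project into eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

def adversarial_training_epoch(model, loader, optimizer):
    """One epoch of standard adversarial training: fit the model on PGD examples."""
    model.train()
    for x, y in loader:
        x_adv = pgd_attack(model, x, y)
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```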
This list is automatically generated from the titles and abstracts of the papers in this site.