Sparse Coding Frontend for Robust Neural Networks
- URL: http://arxiv.org/abs/2104.05353v1
- Date: Mon, 12 Apr 2021 11:14:32 GMT
- Title: Sparse Coding Frontend for Robust Neural Networks
- Authors: Can Bakiskan, Metehan Cekic, Ahmet Dundar Sezer, Upamanyu Madhow
- Abstract summary: Deep Neural Networks are known to be vulnerable to small, adversarially crafted, perturbations.
Current defense methods against these adversarial attacks are variants of adversarial training.
In this paper, we introduce a radically different defense trained only on clean images: a sparse coding based frontend that attenuates adversarial attacks before they reach the classifier.
- Score: 11.36192454455449
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Deep Neural Networks are known to be vulnerable to small, adversarially
crafted, perturbations. The current most effective defense methods against
these adversarial attacks are variants of adversarial training. In this paper,
we introduce a radically different defense trained only on clean images: a
sparse coding based frontend which significantly attenuates adversarial attacks
before they reach the classifier. We evaluate our defense on CIFAR-10 dataset
under a wide range of attack types (including Linf, L2, and L1 bounded
attacks), demonstrating its promise as a general-purpose approach for defense.
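A minimal sketch of the general idea behind such a frontend is given below, assuming a patch dictionary learned from clean images only and a simple top-k hard-thresholding encoder; the patch size, dictionary size, sparsity level, and function names are illustrative assumptions (grayscale images assumed), not the paper's exact design.
```python
# Illustrative sketch of a sparse coding frontend (assumptions, not the paper's exact design):
# learn a patch dictionary from clean images only, then re-encode each input patch with a
# few active atoms and reconstruct it before the image reaches the classifier.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d, reconstruct_from_patches_2d

PATCH = (8, 8)   # patch size (assumption)
ATOMS = 256      # overcomplete dictionary size (assumption)
TOP_K = 4        # active atoms per patch (assumption)

def fit_dictionary(clean_images):
    """Learn a patch dictionary from clean (unperturbed) grayscale images only."""
    patches = np.concatenate(
        [extract_patches_2d(img, PATCH, max_patches=100, random_state=0) for img in clean_images]
    ).reshape(-1, PATCH[0] * PATCH[1]).astype(np.float64)
    patches -= patches.mean(axis=1, keepdims=True)
    dico = MiniBatchDictionaryLearning(n_components=ATOMS, batch_size=64, random_state=0)
    return dico.fit(patches).components_         # shape: (ATOMS, patch_dim)

def sparse_code_frontend(image, dictionary):
    """Keep only the TOP_K strongest atom responses per patch, then reconstruct the image."""
    patches = extract_patches_2d(image, PATCH).reshape(-1, PATCH[0] * PATCH[1]).astype(np.float64)
    means = patches.mean(axis=1, keepdims=True)
    coeffs = (patches - means) @ dictionary.T    # correlation of each patch with each atom
    kth = np.sort(np.abs(coeffs), axis=1)[:, -TOP_K][:, None]
    coeffs[np.abs(coeffs) < kth] = 0.0           # hard sparsification (simple approximation)
    recon = (coeffs @ dictionary + means).reshape(-1, *PATCH)
    return reconstruct_from_patches_2d(recon, image.shape)

# Usage: feed classifier(sparse_code_frontend(x, D)) instead of classifier(x).
```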
Related papers
- Edge-Only Universal Adversarial Attacks in Distributed Learning [49.546479320670464]
In this work, we explore the feasibility of generating universal adversarial attacks when an attacker has access to the edge part of the model only.
Our approach shows that adversaries can induce effective mispredictions in the unknown cloud part by leveraging key features on the edge side.
Our results on ImageNet demonstrate strong attack transferability to the unknown cloud part.
arXiv Detail & Related papers (2024-11-15T11:06:24Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- MPAT: Building Robust Deep Neural Networks against Textual Adversarial Attacks [4.208423642716679]
We propose a malicious perturbation based adversarial training method (MPAT) for building robust deep neural networks against adversarial attacks.
Specifically, we construct a multi-level malicious example generation strategy to generate adversarial examples with malicious perturbations.
We employ a novel training objective function to achieve the defense goal without compromising performance on the original task.
arXiv Detail & Related papers (2024-02-29T01:49:18Z)
- Adversary Aware Continual Learning [3.3439097577935213]
An adversary can introduce a small amount of misinformation into the model to cause deliberate forgetting of a specific task or class at test time.
We turn the attacker's primary strength, hiding the backdoor pattern by making it imperceptible to humans, against it, and propose to learn a perceptible (stronger) pattern that can overpower the attacker's imperceptible pattern.
We show that our proposed defensive framework considerably improves the performance of class incremental learning algorithms with no knowledge of the attacker's target task, target class, or imperceptible pattern.
arXiv Detail & Related papers (2023-04-27T19:49:50Z)
- Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning [95.60856995067083]
This work is among the first to perform adversarial defense for ASV without knowing the specific attack algorithms.
We propose to perform adversarial defense from two perspectives: 1) adversarial perturbation purification and 2) adversarial perturbation detection.
Experimental results show that our detection module effectively shields the ASV by detecting adversarial samples with an accuracy of around 80%.
arXiv Detail & Related papers (2021-06-01T07:10:54Z)
- Universal Adversarial Training with Class-Wise Perturbations [78.05383266222285]
Adversarial training is the most widely used method for defending against adversarial attacks.
In this work, we find that a universal adversarial perturbation (UAP) does not attack all classes equally.
We improve state-of-the-art universal adversarial training (UAT) by utilizing class-wise UAPs during adversarial training.
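As a rough illustration of that idea (not the paper's exact recipe), the sketch below keeps one universal perturbation per class, updates it by signed-gradient ascent on the training loss, and applies it to every sample of its class before the usual model update; the Linf budget, step size, and helper names are assumptions.
```python
# Illustrative sketch of class-wise universal adversarial training (assumptions, not the paper's
# exact recipe): one universal perturbation (UAP) is kept per class, updated by signed-gradient
# ascent, and applied to every training sample of that class before the usual model update.
import torch
import torch.nn.functional as F

EPS, STEP, NUM_CLASSES = 8 / 255, 2 / 255, 10     # Linf budget and step size are assumptions

def class_wise_uat_step(model, optimizer, x, y, uaps):
    """x: (B, C, H, W) in [0, 1]; y: (B,) labels; uaps: persistent (NUM_CLASSES, C, H, W) buffer."""
    delta = uaps[y].clone().requires_grad_(True)   # per-sample copy of its class UAP
    loss = F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y)
    grad = torch.autograd.grad(loss, delta)[0]

    # Ascend on the loss and project each class perturbation back into the Linf ball.
    with torch.no_grad():
        for c in y.unique():
            uaps[c] = torch.clamp(uaps[c] + STEP * grad[y == c].sign().mean(0), -EPS, EPS)

    # Standard model update on inputs perturbed by the freshly updated class-wise UAPs.
    optimizer.zero_grad()
    F.cross_entropy(model(torch.clamp(x + uaps[y], 0, 1)), y).backward()
    optimizer.step()
```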
arXiv Detail & Related papers (2021-04-07T09:05:49Z)
- Mitigating Gradient-based Adversarial Attacks via Denoising and Compression [7.305019142196582]
Gradient-based adversarial attacks on deep neural networks pose a serious threat.
They can be deployed by adding imperceptible perturbations to the test data of any network.
Denoising and dimensionality reduction are two distinct methods that have been investigated to combat such attacks.
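As a generic illustration of those two ideas (not the specific pipeline studied in the paper), the sketch below denoises the input with a Gaussian filter and then compresses and reconstructs it through a low-dimensional PCA basis fitted on clean data; the component count, smoothing strength, and function names are assumptions, and grayscale inputs are assumed.
```python
# Illustrative combination of the two preprocessing ideas named above (assumptions throughout):
# denoise the input, then compress and reconstruct it through a PCA basis fitted on clean data,
# so that high-frequency adversarial noise is attenuated before classification.
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.decomposition import PCA

def fit_pca(clean_images, n_components=64):
    """Fit a low-dimensional PCA basis on flattened clean (grayscale) images only."""
    flat = np.stack([img.ravel() for img in clean_images])
    return PCA(n_components=n_components).fit(flat)

def purify(image, pca, sigma=0.5):
    """Denoise, then project onto the PCA subspace and reconstruct."""
    smoothed = gaussian_filter(image, sigma=sigma)            # denoising step
    code = pca.transform(smoothed.reshape(1, -1))             # dimensionality reduction
    return pca.inverse_transform(code).reshape(image.shape)   # back to image space

# Usage: classifier.predict(purify(x, pca)) instead of classifier.predict(x).
```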
arXiv Detail & Related papers (2021-04-03T22:57:01Z)
- Combating Adversaries with Anti-Adversaries [118.70141983415445]
Our layer generates an input perturbation in the opposite direction of the adversarial one.
We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models.
Our anti-adversary layer significantly enhances model robustness while coming at no cost on clean accuracy.
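A minimal sketch of such a layer is shown below, assuming it nudges the input for a few signed-gradient steps toward higher confidence in the model's own prediction, i.e. opposite to the direction a gradient-based attack would push it; the step count, step size, and function name are illustrative assumptions.
```python
# Illustrative anti-adversary step (assumptions, not the authors' exact layer): perturb the
# input in the direction that increases confidence in the model's own predicted class,
# i.e. the opposite direction to a gradient-based attack.
import torch
import torch.nn.functional as F

def anti_adversary(model, x, steps=2, alpha=2 / 255):
    """Return x plus a small counter-perturbation; x: (B, C, H, W) in [0, 1]."""
    with torch.no_grad():
        y_hat = model(x).argmax(dim=1)            # the model's own prediction acts as the target
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y_hat)
        grad = torch.autograd.grad(loss, delta)[0]
        # Descend on the loss w.r.t. the input: the anti-adversarial direction.
        delta = (delta - alpha * grad.sign()).detach().requires_grad_(True)
    return torch.clamp(x + delta.detach(), 0, 1)

# Usage (inference only): logits = model(anti_adversary(model, x)).
```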
arXiv Detail & Related papers (2021-03-26T09:36:59Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
- Adversarial Feature Desensitization [12.401175943131268]
We propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field.
Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant towards adversarial perturbations of the inputs.
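One way to read that domain-adaptation analogy is sketched below (illustrative, not the authors' exact losses): a discriminator learns to separate clean features from adversarial ones, while the feature extractor is trained both to classify correctly and to make the two feature distributions indistinguishable; all module, optimizer, and function names are assumptions.
```python
# Illustrative sketch of the domain-adaptation reading of AFD (assumptions, not the authors'
# exact losses): a discriminator tells clean features from adversarial ones; the feature
# extractor learns to classify while making the two feature distributions indistinguishable.
import torch
import torch.nn.functional as F

def afd_step(features, classifier, discriminator, opt_task, opt_disc, x_clean, x_adv, y):
    """features/classifier/discriminator are nn.Modules; x_adv is an adversarial version of x_clean."""
    f_clean, f_adv = features(x_clean), features(x_adv)
    ones = torch.ones(len(x_clean), 1)
    zeros = torch.zeros(len(x_adv), 1)

    # 1) Discriminator update: clean features -> 1, adversarial features -> 0.
    d_out = discriminator(torch.cat([f_clean, f_adv]).detach())
    d_loss = F.binary_cross_entropy_with_logits(d_out, torch.cat([ones, zeros]))
    opt_disc.zero_grad()
    d_loss.backward()
    opt_disc.step()

    # 2) Task update: classify both views correctly AND fool the discriminator,
    #    which pulls adversarial features toward the clean feature distribution.
    task_loss = F.cross_entropy(classifier(f_clean), y) + F.cross_entropy(classifier(f_adv), y)
    fool_loss = F.binary_cross_entropy_with_logits(discriminator(f_adv), ones)
    opt_task.zero_grad()
    (task_loss + fool_loss).backward()
    opt_task.step()
```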
arXiv Detail & Related papers (2020-06-08T14:20:02Z)