REGroup: Rank-aggregating Ensemble of Generative Classifiers for Robust
Predictions
- URL: http://arxiv.org/abs/2006.10679v2
- Date: Wed, 24 Nov 2021 11:00:57 GMT
- Title: REGroup: Rank-aggregating Ensemble of Generative Classifiers for Robust
Predictions
- Authors: Lokender Tiwari, Anish Madan, Saket Anand, Subhashis Banerjee
- Abstract summary: Defense strategies that adopt adversarial training or random input transformations typically require retraining or fine-tuning the model to achieve reasonable performance.
We find that we can learn a generative classifier by statistically characterizing the neural response of an intermediate layer to clean training samples.
Our proposed approach uses a subset of the clean training data and a pre-trained model, and yet is agnostic to network architectures or the adversarial attack generation method.
- Score: 6.0162772063289784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are often criticized for being susceptible to
adversarial attacks. Most successful defense strategies adopt adversarial
training or random input transformations that typically require retraining or
fine-tuning the model to achieve reasonable performance. In this work, our
investigations of intermediate representations of a pre-trained DNN lead to an
interesting discovery pointing to intrinsic robustness to adversarial attacks.
We find that we can learn a generative classifier by statistically
characterizing the neural response of an intermediate layer to clean training
samples. The predictions of multiple such intermediate-layer based classifiers,
when aggregated, show unexpected robustness to adversarial attacks.
Specifically, we devise an ensemble of these generative classifiers that
rank-aggregates their predictions via a Borda count-based consensus. Our
proposed approach uses a subset of the clean training data and a pre-trained
model, and yet is agnostic to network architectures or the adversarial attack
generation method. We show extensive experiments to establish that our defense
strategy achieves state-of-the-art performance on the ImageNet validation set.
Related papers
- MOREL: Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
deep neural networks (DNNs) are vulnerable to slight adversarial perturbations.
We show that strong feature representation learning during training can significantly enhance the original model's robustness.
We propose MOREL, a multi-objective feature representation learning approach, encouraging classification models to produce similar features for inputs within the same class, despite perturbations.
arXiv Detail & Related papers (2024-10-02T16:05:03Z) - Fast Propagation is Better: Accelerating Single-Step Adversarial
Training via Sampling Subnetworks [69.54774045493227]
A drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples.
We propose to exploit the interior building blocks of the model to improve efficiency.
Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness.
arXiv Detail & Related papers (2023-10-24T01:36:20Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of
Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and
Boosting Segmentation Robustness [63.726895965125145]
Deep neural network-based image classifications are vulnerable to adversarial perturbations.
In this work, we propose an effective and efficient segmentation attack method, dubbed SegPGD.
Since SegPGD can create more effective adversarial examples, the adversarial training with our SegPGD can boost the robustness of segmentation models.
arXiv Detail & Related papers (2022-07-25T17:56:54Z) - Distributed Adversarial Training to Robustify Deep Neural Networks at
Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to mitigate robust training.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z) - Latent Boundary-guided Adversarial Training [61.43040235982727]
Adrial training is proved to be the most effective strategy that injects adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z) - Learning from Attacks: Attacking Variational Autoencoder for Improving
Image Classification [17.881134865491063]
Adversarial attacks are often considered as threats to the robustness of Deep Neural Networks (DNNs)
This work analyzes adversarial attacks from a different perspective. Namely, adversarial examples contain implicit information that is useful to the predictions.
We propose an algorithmic framework that leverages the advantages of the DNNs for data self-expression and task-specific predictions.
arXiv Detail & Related papers (2022-03-11T08:48:26Z) - Efficient and Robust Classification for Sparse Attacks [34.48667992227529]
We consider perturbations bounded by the $ell$--norm, which have been shown as effective attacks in the domains of image-recognition, natural language processing, and malware-detection.
We propose a novel defense method that consists of "truncation" and "adrial training"
Motivated by the insights we obtain, we extend these components to neural network classifiers.
arXiv Detail & Related papers (2022-01-23T21:18:17Z) - Defensive Tensorization [113.96183766922393]
We propose tensor defensiveization, an adversarial defence technique that leverages a latent high-order factorization of the network.
We empirically demonstrate the effectiveness of our approach on standard image classification benchmarks.
We validate the versatility of our approach across domains and low-precision architectures by considering an audio task and binary networks.
arXiv Detail & Related papers (2021-10-26T17:00:16Z) - Class-Aware Domain Adaptation for Improving Adversarial Robustness [27.24720754239852]
adversarial training has been proposed to train networks by injecting adversarial examples into the training data.
We propose a novel Class-Aware Domain Adaptation (CADA) method for adversarial defense without directly applying adversarial training.
arXiv Detail & Related papers (2020-05-10T03:45:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.