Classifier Robustness Enhancement Via Test-Time Transformation
- URL: http://arxiv.org/abs/2303.15409v1
- Date: Mon, 27 Mar 2023 17:28:20 GMT
- Title: Classifier Robustness Enhancement Via Test-Time Transformation
- Authors: Tsachi Blau, Roy Ganz, Chaim Baskin, Michael Elad and Alex Bronstein
- Abstract summary: Adversarial training is currently the best-known way to achieve classification robustness under adversarial attacks.
In this work, we introduce Robustness Enhancement Via Test-Time Transformation (TETRA) -- a novel defense method.
We show that the proposed method achieves state-of-the-art results and validate our claim through extensive experiments.
- Score: 14.603209216642034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been recently discovered that adversarially trained classifiers
exhibit an intriguing property, referred to as perceptually aligned gradients
(PAG). PAG implies that the gradients of such classifiers possess a meaningful
structure, aligned with human perception. Adversarial training is currently the
best-known way to achieve classification robustness under adversarial attacks.
The PAG property, however, has yet to be leveraged for further improving
classifier robustness. In this work, we introduce Classifier Robustness
Enhancement Via Test-Time Transformation (TETRA) -- a novel defense method that
utilizes PAG, enhancing the performance of trained robust classifiers. Our
method operates in two phases. First, it modifies the input image via a
designated targeted adversarial attack into each of the dataset's classes.
Then, it classifies the input image based on the distance to each of the
modified instances, with the assumption that the shortest distance relates to
the true class. We show that the proposed method achieves state-of-the-art
results and validate our claim through extensive experiments on a variety of
defense methods, classifier architectures, and datasets. We also empirically
demonstrate that TETRA can boost the accuracy of any differentiable adversarial
training classifier across a variety of attacks, including ones unseen at
training. Specifically, applying TETRA leads to substantial improvement of up
to $+23\%$, $+20\%$, and $+26\%$ on CIFAR10, CIFAR100, and ImageNet,
respectively.
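The two-phase procedure described above can be sketched in a few lines. This is a hypothetical re-implementation on a toy linear softmax classifier, not the paper's code: the paper's targeted attack, step sizes, and distance metric may differ.

```python
import numpy as np

def tetra_classify(x, W, b, steps=10, lr=0.5):
    """Sketch of TETRA's two-phase test-time defense.
    Phase 1: run a targeted gradient attack pushing x toward each class.
    Phase 2: predict the class whose attacked image stayed closest to x,
    assuming the shortest modification distance marks the true class.
    """
    n_classes = W.shape[1]
    dists = []
    for target in range(n_classes):
        xt = x.copy()
        for _ in range(steps):
            logits = xt @ W + b
            p = np.exp(logits - logits.max())
            p /= p.sum()
            # gradient of the targeted cross-entropy loss w.r.t. the input
            grad = W @ (p - np.eye(n_classes)[target])
            xt -= lr * grad          # move the input toward the target class
        dists.append(np.linalg.norm(xt - x))
    return int(np.argmin(dists))     # shortest distance -> predicted class
```

For a robust classifier with perceptually aligned gradients, the attack toward the true class barely changes the image, while attacks toward wrong classes must move it far; the distances then separate the classes.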
Related papers
- Improving Adversarial Robustness via Decoupled Visual Representation Masking [65.73203518658224]
In this paper, we highlight two novel properties of robust features from the feature distribution perspective.
We find that state-of-the-art defense methods do not address both of these issues well.
Specifically, we propose a simple but effective defense based on decoupled visual representation masking.
arXiv Detail & Related papers (2024-06-16T13:29:41Z) - Towards Robust Domain Generation Algorithm Classification [1.4542411354617986]
We implement 32 white-box attacks, 19 of which are very effective and induce a false-negative rate (FNR) of $\approx 100\%$ on unhardened classifiers.
We propose a novel training scheme that leverages adversarial latent space vectors and discretized adversarial domains to significantly improve robustness.
arXiv Detail & Related papers (2024-04-09T11:56:29Z) - Enhancing Visual Continual Learning with Language-Guided Supervision [76.38481740848434]
Continual learning aims to empower models to learn new tasks without forgetting previously acquired knowledge.
We argue that the scarce semantic information conveyed by the one-hot labels hampers the effective knowledge transfer across tasks.
Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals.
arXiv Detail & Related papers (2024-03-24T12:41:58Z) - Activate and Reject: Towards Safe Domain Generalization under Category Shift [71.95548187205736]
We study a practical problem of Domain Generalization under Category Shift (DGCS)
It aims to simultaneously detect unknown-class samples and classify known-class samples in the target domains.
Compared to prior DG works, we face two new challenges: 1) how to learn the concept of "unknown" during training with only source known-class samples, and 2) how to adapt the source-trained model to unseen environments.
arXiv Detail & Related papers (2023-10-07T07:53:12Z) - Don't Retrain, Just Rewrite: Countering Adversarial Perturbations by Rewriting Text [40.491180210205556]
We present ATINTER, a model that intercepts and learns to rewrite adversarial inputs to make them non-adversarial.
Our experiments reveal that ATINTER is effective at providing better adversarial robustness than existing defense approaches.
arXiv Detail & Related papers (2023-05-25T19:42:51Z) - Carefully Blending Adversarial Training and Purification Improves Adversarial Robustness [1.2289361708127877]
CARSO is able to defend itself against adaptive end-to-end white-box attacks devised specifically for such defences.
Our method improves by a significant margin the state-of-the-art for CIFAR-10, CIFAR-100, and TinyImageNet-200.
arXiv Detail & Related papers (2023-05-25T09:04:31Z) - PARL: Enhancing Diversity of Ensemble Networks to Resist Adversarial Attacks via Pairwise Adversarially Robust Loss Function [13.417003144007156]
Adversarial attacks tend to rely on the principle of transferability.
Ensemble methods against adversarial attacks demonstrate that an adversarial example is less likely to mislead multiple classifiers.
Recent ensemble methods have either been shown to be vulnerable to stronger adversaries or shown to lack an end-to-end evaluation.
arXiv Detail & Related papers (2021-12-09T14:26:13Z) - Towards A Conceptually Simple Defensive Approach for Few-shot Classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibit good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z) - No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
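The calibration idea can be sketched as follows. This is a minimal hypothetical stand-in, assuming per-class Gaussian feature statistics and a linear classifier head; the federated aggregation details of the actual method are omitted.

```python
import numpy as np

def ccvr_calibrate(feats, labels, n_virtual=200, epochs=100, lr=0.1, seed=0):
    """CCVR-style sketch: estimate per-class Gaussian feature statistics,
    sample virtual representations from them, and retrain only the
    linear classifier head on those virtual features."""
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    d = feats.shape[1]
    virt_x, virt_y = [], []
    for c in classes:
        fc = feats[labels == c]
        mu = fc.mean(0)
        cov = np.cov(fc, rowvar=False) + 1e-6 * np.eye(d)
        virt_x.append(rng.multivariate_normal(mu, cov, n_virtual))
        virt_y.append(np.full(n_virtual, c))
    X, y = np.vstack(virt_x), np.concatenate(virt_y)
    # retrain a softmax head on the virtual representations only
    W, bvec = np.zeros((d, len(classes))), np.zeros(len(classes))
    onehot = np.eye(len(classes))[y]
    for _ in range(epochs):
        logits = X @ W + bvec
        p = np.exp(logits - logits.max(1, keepdims=True))
        p /= p.sum(1, keepdims=True)
        g = (p - onehot) / len(X)
        W -= lr * X.T @ g
        bvec -= lr * g.sum(0)
    return W, bvec
```

Because only Gaussian statistics, not raw features, would need to be shared, this kind of calibration sidesteps transmitting client data.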
arXiv Detail & Related papers (2021-06-09T12:02:29Z) - Optimal Transport as a Defense Against Adversarial Attacks [4.6193503399184275]
Adversarial attacks can find a human-imperceptible perturbation for a given image that will mislead a trained model.
Previous work aimed to align original and adversarial image representations in the same way as domain adaptation to improve robustness.
We propose to use a loss between distributions that faithfully reflect the ground distance.
This leads to SAT (Sinkhorn Adversarial Training), a more robust defense against adversarial attacks.
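The distributional loss behind SAT is the entropy-regularized optimal transport cost, computable with Sinkhorn iterations. A minimal sketch (hypothetical choice of regularization and iteration count; the paper's training loop is not shown):

```python
import numpy as np

def sinkhorn_distance(a, b, C, eps=0.1, iters=200):
    """Entropy-regularized optimal transport cost between two
    probability vectors a and b, given a ground-cost matrix C.
    Unlike a point-wise loss, the result respects the ground distance
    between bins, which is the property SAT exploits."""
    K = np.exp(-C / eps)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(iters):            # Sinkhorn fixed-point iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]   # (approximate) optimal transport plan
    return float((P * C).sum())
```

For identical distributions the cost is near zero; mass that must travel far under C is penalized proportionally to the distance moved.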
arXiv Detail & Related papers (2021-02-05T13:24:36Z) - Learning and Evaluating Representations for Deep One-class Classification [59.095144932794646]
We present a two-stage framework for deep one-class classification.
We first learn self-supervised representations from one-class data, and then build one-class classifiers on learned representations.
In experiments, we demonstrate state-of-the-art performance on visual domain one-class classification benchmarks.
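The second stage of such a pipeline can be sketched with a simple shallow detector on learned features. The Mahalanobis-distance scorer below is a hypothetical stand-in (the paper evaluates several shallow one-class classifiers, not necessarily this one):

```python
import numpy as np

def fit_one_class(feats, quantile=0.95):
    """Stage 2 of a two-stage one-class pipeline: fit a Mahalanobis
    scorer on (already-learned) in-class feature vectors and accept a
    sample if its distance falls below a quantile of training scores."""
    mu = feats.mean(0)
    cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
    prec = np.linalg.inv(cov)
    diffs = feats - mu
    d = np.sqrt(np.einsum('ij,jk,ik->i', diffs, prec, diffs))
    thr = np.quantile(d, quantile)   # threshold on in-class training scores

    def score(x):                    # True -> accepted as in-class
        dx = x - mu
        return float(np.sqrt(dx @ prec @ dx)) <= thr
    return score
```

Stage 1 (the self-supervised representation learner) would produce the `feats` matrix; any density- or distance-based detector can then be swapped in here.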
arXiv Detail & Related papers (2020-11-04T23:33:41Z)