Classifier Robustness Enhancement Via Test-Time Transformation
- URL: http://arxiv.org/abs/2303.15409v1
- Date: Mon, 27 Mar 2023 17:28:20 GMT
- Title: Classifier Robustness Enhancement Via Test-Time Transformation
- Authors: Tsachi Blau, Roy Ganz, Chaim Baskin, Michael Elad and Alex Bronstein
- Abstract summary: Adrial training is currently the best-known way to achieve classification under adversarial attacks.
In this work, we introduce Robustness Enhancement Via Test-Time Transformation (TETRA) -- a novel defense method.
We show that the proposed method achieves state-of-the-art results and validate our claim through extensive experiments.
- Score: 14.603209216642034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been recently discovered that adversarially trained classifiers
exhibit an intriguing property, referred to as perceptually aligned gradients
(PAG). PAG implies that the gradients of such classifiers possess a meaningful
structure, aligned with human perception. Adversarial training is currently the
best-known way to achieve classification robustness under adversarial attacks.
The PAG property, however, has yet to be leveraged for further improving
classifier robustness. In this work, we introduce Classifier Robustness
Enhancement Via Test-Time Transformation (TETRA) -- a novel defense method that
utilizes PAG, enhancing the performance of trained robust classifiers. Our
method operates in two phases. First, it modifies the input image via a
designated targeted adversarial attack into each of the dataset's classes.
Then, it classifies the input image based on the distance to each of the
modified instances, with the assumption that the shortest distance relates to
the true class. We show that the proposed method achieves state-of-the-art
results and validate our claim through extensive experiments on a variety of
defense methods, classifier architectures, and datasets. We also empirically
demonstrate that TETRA can boost the accuracy of any differentiable adversarial
training classifier across a variety of attacks, including ones unseen at
training. Specifically, applying TETRA leads to substantial improvement of up
to $+23\%$, $+20\%$, and $+26\%$ on CIFAR10, CIFAR100, and ImageNet,
respectively.
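The two-phase procedure described in the abstract can be sketched in a few lines. The PyTorch-style code below is a minimal illustration assembled from the abstract alone, not the authors' implementation; the sign-gradient attack schedule, the L2 distance, and all hyperparameters are assumptions. The intuition, following PAG, is that pushing an image toward its true class requires only a small, perceptually meaningful edit, whereas pushing it toward a wrong class requires a larger change.

```python
import torch
import torch.nn.functional as F

def tetra_predict(model, x, num_classes, steps=10, step_size=0.01):
    """Phase 1: push x toward every class with a targeted attack.
    Phase 2: predict the class whose attacked copy moved the least.
    Minimal sketch from the abstract; hyperparameters are illustrative."""
    distances = []
    for target in range(num_classes):
        t = torch.full((x.shape[0],), target, dtype=torch.long, device=x.device)
        x_adv = x.clone().detach().requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x_adv), t)
            (grad,) = torch.autograd.grad(loss, x_adv)
            # Targeted step: move the image so the classifier favors `target`.
            x_adv = (x_adv - step_size * grad.sign()).detach().requires_grad_(True)
        # Distance from the attacked image back to the original input.
        distances.append((x_adv.detach() - x).flatten(1).norm(dim=1))
    # The shortest distance is assumed to indicate the true class.
    return torch.stack(distances, dim=1).argmin(dim=1)
```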
Related papers
- ICAS: Detecting Training Data from Autoregressive Image Generative Models [38.1625974271413]
Training data detection has emerged as a critical task for identifying unauthorized data usage in model training. We conduct the first study applying membership inference to this domain. Our approach exhibits strong robustness and generalization under various data transformations.
arXiv Detail & Related papers (2025-07-07T14:50:42Z) - Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has been conventionally believed to be a challenging property to encode for neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z) - FACTUAL: A Novel Framework for Contrastive Learning Based Robust SAR Image Classification [10.911464455072391]
FACTUAL is a Contrastive Learning framework for Adversarial Training and robust SAR classification.
Our model achieves 99.7% accuracy on clean samples, and 89.6% on perturbed samples, both outperforming previous state-of-the-art methods.
arXiv Detail & Related papers (2024-04-04T06:20:22Z) - Attackar: Attack of the Evolutionary Adversary [0.0]
This paper introduces Attackar, an evolutionary, score-based, black-box attack.
Attackar is based on a novel objective function that can be used in gradient-free optimization problems.
Our results demonstrate the superior performance of Attackar, both in terms of accuracy score and query efficiency.
arXiv Detail & Related papers (2022-08-17T13:57:23Z) - Threat Model-Agnostic Adversarial Defense using Diffusion Models [14.603209216642034]
Deep Neural Networks (DNNs) are highly sensitive to imperceptible malicious perturbations, known as adversarial attacks.
arXiv Detail & Related papers (2022-07-17T06:50:48Z) - Towards Alternative Techniques for Improving Adversarial Robustness:
Analysis of Adversarial Training at a Spectrum of Perturbations [5.18694590238069]
Adversarial training (AT) and its variants have spearheaded progress in improving neural network robustness to adversarial perturbations.
We focus on models trained on a spectrum of $\epsilon$ values.
We identify alternative improvements to AT that otherwise wouldn't have been apparent at a single $\epsilon$.
arXiv Detail & Related papers (2022-06-13T22:01:21Z) - Distributed Adversarial Training to Robustify Deep Neural Networks at
Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z) - Robust Binary Models by Pruning Randomly-initialized Networks [57.03100916030444]
We propose ways to obtain robust models against adversarial attacks from randomly-initialized binary networks.
We learn the structure of the robust model by pruning a randomly-initialized binary network.
Our method confirms the strong lottery ticket hypothesis in the presence of adversarial attacks.
arXiv Detail & Related papers (2022-02-03T00:05:08Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA), which learns to automatically align features across arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - RAIN: A Simple Approach for Robust and Accurate Image Classification
Networks [156.09526491791772]
It has been shown that the majority of existing adversarial defense methods achieve robustness at the cost of sacrificing prediction accuracy.
This paper proposes a novel preprocessing framework, which we term Robust and Accurate Image classificatioN (RAIN).
RAIN applies randomization over inputs to break the tie between the model's forward prediction path and the backward gradient path, thus improving model robustness (a generic sketch of this idea follows the list).
We conduct extensive experiments on the STL10 and ImageNet datasets to verify the effectiveness of RAIN against various types of adversarial attacks.
arXiv Detail & Related papers (2020-04-24T02:03:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.