A Spectral Perspective towards Understanding and Improving Adversarial Robustness
- URL: http://arxiv.org/abs/2306.14262v1
- Date: Sun, 25 Jun 2023 14:47:03 GMT
- Title: A Spectral Perspective towards Understanding and Improving Adversarial Robustness
- Authors: Binxiao Huang, Rui Lin, Chaofan Tao, Ngai Wong
- Abstract summary: Adversarial training (AT) has proven to be an effective defense approach, but the mechanism behind its robustness improvement is not fully understood.
We show that AT induces the deep model to focus more on the low-frequency region, which retains the shape-biased representations, to gain robustness.
We propose a spectral alignment regularization (SAR) such that the spectral output inferred by an attacked adversarial input stays as close as possible to its natural input counterpart.
- Score: 8.912245110734334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are incredibly vulnerable to crafted,
imperceptible adversarial perturbations. While adversarial training (AT) has
proven to be an effective defense approach, the AT mechanism for robustness
improvement is not fully understood. This work investigates AT from a spectral
perspective, adding new insights to the design of effective defenses. In
particular, we show that AT induces the deep model to focus more on the
low-frequency region, which retains the shape-biased representations, to gain
robustness. Further, we find that the spectrum of a white-box attack is
primarily distributed in regions the model focuses on, and the perturbation
attacks the spectral bands where the model is vulnerable. Based on this
observation, to train a model tolerant to frequency-varying perturbation, we
propose a spectral alignment regularization (SAR) such that the spectral output
inferred by an attacked adversarial input stays as close as possible to its
natural input counterpart. Experiments demonstrate that SAR and its weight
averaging (WA) extension could significantly improve the robust accuracy by
1.14% ~ 3.87% relative to the standard AT, across multiple datasets (CIFAR-10,
CIFAR-100 and Tiny ImageNet), and various attacks (PGD, C&W and AutoAttack),
without any extra data.
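The core of SAR can be made concrete with a short sketch. The snippet below is a minimal illustration, not the authors' released code: it assumes the regularizer penalizes the gap between the discrete Fourier transforms of the model outputs on adversarial and natural inputs, and the names `spectral_alignment_loss` and `lam` are hypothetical.

```python
import torch
import torch.nn.functional as F

def spectral_alignment_loss(out_adv: torch.Tensor,
                            out_nat: torch.Tensor) -> torch.Tensor:
    """Penalize the distance between the spectra of the adversarial and
    natural outputs, so the attacked spectrum tracks the clean one."""
    dims = tuple(range(1, out_adv.ndim))                   # all non-batch dims
    spec_adv = torch.fft.fftn(out_adv, dim=dims)           # complex spectrum (adv)
    spec_nat = torch.fft.fftn(out_nat, dim=dims).detach()  # clean spectrum as target
    return (spec_adv - spec_nat).abs().mean()              # L1 gap in frequency space

# One AT step with the extra penalty (x_adv produced by, e.g., PGD;
# lam is an illustrative weight, not the paper's value):
# logits_adv, logits_nat = model(x_adv), model(x_nat)
# loss = F.cross_entropy(logits_adv, y) \
#        + lam * spectral_alignment_loss(logits_adv, logits_nat)
```

The WA extension is typically realized as an exponential moving average of the weights kept alongside training, e.g. updating a shadow copy as `theta_ema = tau * theta_ema + (1 - tau) * theta` after every step and evaluating with the averaged weights.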
Related papers
- DAT: Improving Adversarial Robustness via Generative Amplitude Mix-up in Frequency Domain [23.678658814438855]
Adversarial training (AT) was developed to protect deep neural networks (DNNs) from adversarial attacks.
Recent studies show that adversarial attacks disproportionately impact the patterns within the phase of the sample's frequency spectrum.
We propose an optimized Adversarial Amplitude Generator (AAG) to achieve a better tradeoff between improving the model's robustness and retaining phase patterns.
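For intuition, amplitude mix-up in the frequency domain can be sketched as below. Note that the paper's AAG is a learned, optimized generator, whereas this snippet just hand-blends the amplitude spectra of two images while keeping the first image's phase (`alpha` is an illustrative mixing weight):

```python
import torch

def amplitude_mixup(x1: torch.Tensor, x2: torch.Tensor,
                    alpha: float = 0.5) -> torch.Tensor:
    """Blend the amplitude spectra of two image batches [B, C, H, W]
    while preserving x1's phase, which carries the structural patterns."""
    f1, f2 = torch.fft.fft2(x1), torch.fft.fft2(x2)
    amp = alpha * f1.abs() + (1.0 - alpha) * f2.abs()  # mixed amplitude
    phase = torch.angle(f1)                            # keep x1's phase
    mixed = amp * torch.exp(1j * phase)                # recombine in freq. domain
    return torch.fft.ifft2(mixed).real
```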
arXiv Detail & Related papers (2024-10-16T07:18:36Z)
- Filtered Randomized Smoothing: A New Defense for Robust Modulation Classification [16.974803642923465]
We study the problem of designing robust modulation classifiers that can provide provable defense against arbitrary attacks.
We propose Filtered Randomized Smoothing (FRS), a novel defense which combines spectral filtering with randomized smoothing.
We show that FRS significantly outperforms existing defenses including AT and RS in terms of accuracy on both attacked and benign signals.
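A minimal sketch of that combination, assuming the spectral filter is a simple centered low-pass mask and the smoothing side follows the usual Monte-Carlo averaging over Gaussian noise (`keep_frac`, `sigma` and `n_samples` are illustrative, not the paper's settings):

```python
import torch

def low_pass(x: torch.Tensor, keep_frac: float = 0.5) -> torch.Tensor:
    """Zero out high frequencies of a batch [B, C, H, W]."""
    f = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    H, W = x.shape[-2:]
    h, w = int(H * keep_frac / 2), int(W * keep_frac / 2)
    mask = torch.zeros(H, W, device=x.device)       # 1 inside the kept band
    mask[H // 2 - h:H // 2 + h, W // 2 - w:W // 2 + w] = 1.0
    return torch.fft.ifft2(torch.fft.ifftshift(f * mask, dim=(-2, -1))).real

@torch.no_grad()
def frs_predict(model, x, sigma=0.25, n_samples=32):
    """Filter first, then average softmax outputs over noise draws."""
    x = low_pass(x)
    probs = sum(model(x + sigma * torch.randn_like(x)).softmax(dim=-1)
                for _ in range(n_samples))
    return probs.argmax(dim=-1)
```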
arXiv Detail & Related papers (2024-10-08T20:17:25Z)
- Carefully Blending Adversarial Training and Purification Improves Adversarial Robustness [1.2289361708127877]
CARSO is able to defend itself against adaptive end-to-end white-box attacks devised for defenses of its kind.
Our method improves the state of the art for CIFAR-10, CIFAR-100, and TinyImageNet-200 by a significant margin.
arXiv Detail & Related papers (2023-05-25T09:04:31Z)
- Frequency Regularization for Improving Adversarial Robustness [8.912245110734334]
Adversarial training (AT) has proven to be an effective defense approach.
We propose a frequency regularization (FR) to align the output difference in the spectral domain.
We find that our method achieves the strongest robustness against attacks by PGD-20, C&W and AutoAttack.
arXiv Detail & Related papers (2022-12-24T13:14:45Z)
- Ada3Diff: Defending against 3D Adversarial Point Clouds via Adaptive Diffusion [70.60038549155485]
Deep 3D point cloud models are sensitive to adversarial attacks, which poses threats to safety-critical applications such as autonomous driving.
This paper introduces a novel distortion-aware defense framework that can rebuild the pristine data distribution with a tailored intensity estimator and a diffusion model.
arXiv Detail & Related papers (2022-11-29T14:32:43Z)
- Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses [82.3052187788609]
Adversarial training (AT) is considered to be one of the most reliable defenses against adversarial attacks.
Recent works show generalization improvement with adversarial samples under novel threat models.
We propose a novel threat model called the Joint Space Threat Model (JSTM).
Under JSTM, we develop novel adversarial attacks and defenses.
arXiv Detail & Related papers (2021-12-12T21:08:14Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA), which is trained to automatically align features across arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Towards Adversarial Patch Analysis and Certified Defense against Crowd Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models.
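The momentum ingredient is presumably in the spirit of momentum iterative attacks; the paper's exact formulation may differ. A hedged, generic sketch (MIM-style sign-gradient updates restricted to a patch region, with `patch_mask` a 0/1 tensor marking the patch, and `F.cross_entropy` standing in for the task loss, which for crowd counting would be a density-map objective):

```python
import torch
import torch.nn.functional as F

def momentum_patch_attack(model, x, y, patch_mask,
                          steps=40, step_size=2 / 255, mu=1.0):
    """Gradient ascent on the loss, restricted to the patch region,
    with a momentum buffer smoothing the gradient direction."""
    x_adv = x.clone().detach().requires_grad_(True)
    g = torch.zeros_like(x)                              # momentum buffer
    for _ in range(steps):
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        g = mu * g + grad / (grad.abs().mean() + 1e-12)  # normalized accumulation
        with torch.no_grad():
            x_adv += step_size * g.sign() * patch_mask   # update patch pixels only
            x_adv.clamp_(0, 1)
    return x_adv.detach()
```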
arXiv Detail & Related papers (2021-04-22T05:10:55Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
- Boosting Adversarial Training with Hypersphere Embedding [53.75693100495097]
Adversarial training is one of the most effective defenses against adversarial attacks for deep learning models.
In this work, we advocate incorporating the hypersphere embedding mechanism into the AT procedure.
We validate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets.
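Mechanically, hypersphere embedding is usually implemented by normalizing both the penultimate features and the classifier weights so that logits become scaled cosine similarities. A minimal sketch of that core idea (the paper additionally adjusts the training loss; the class name and `scale` value here are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HypersphereClassifier(nn.Module):
    """Final layer producing logits as scaled cosine similarities
    between unit-norm features and unit-norm class weights."""
    def __init__(self, feat_dim: int, num_classes: int, scale: float = 15.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.scale = scale

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        f = F.normalize(feats, dim=-1)        # features onto the unit sphere
        w = F.normalize(self.weight, dim=-1)  # class weights onto the unit sphere
        return self.scale * f @ w.t()         # cosine logits
```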
arXiv Detail & Related papers (2020-02-20T08:42:29Z)