Adversarial Robustness of Stabilized NeuralODEs Might be from Obfuscated
Gradients
- URL: http://arxiv.org/abs/2009.13145v2
- Date: Wed, 2 Jun 2021 04:14:08 GMT
- Title: Adversarial Robustness of Stabilized NeuralODEs Might be from Obfuscated
Gradients
- Authors: Yifei Huang, Yaodong Yu, Hongyang Zhang, Yi Ma, Yuan Yao
- Abstract summary: We introduce a provably stable architecture for Neural Ordinary Differential Equations (ODEs) which achieves non-trivial adversarial robustness under white-box attacks.
Inspired by dynamical system theory, we design a stabilized neural ODE network named SONet whose ODE blocks are skew-symmetric and proved to be input-output stable.
With natural training, SONet achieves robustness comparable to state-of-the-art adversarial defense methods without sacrificing natural accuracy.
- Score: 30.560531008995806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we introduce a provably stable architecture for Neural Ordinary
Differential Equations (ODEs) which achieves non-trivial adversarial robustness
under white-box adversarial attacks even when the network is trained naturally.
Most existing defense methods that withstand strong white-box attacks require
adversarial training to improve the robustness of neural networks, and hence
must strike a trade-off between natural accuracy and adversarial
robustness. Inspired by dynamical system theory, we design a stabilized neural
ODE network named SONet whose ODE blocks are skew-symmetric and proved to be
input-output stable. With natural training, SONet achieves robustness
comparable to state-of-the-art adversarial defense methods without sacrificing
natural accuracy. Even replacing only the first layer of a ResNet with such an
ODE block yields a further improvement in robustness: e.g., under a PGD-20
($\ell_\infty=0.031$) attack on the CIFAR-10 dataset, it achieves 91.57\%
natural accuracy and 62.35\% robust accuracy, while a counterpart ResNet
architecture trained with TRADES achieves natural and robust accuracy of
76.29\% and 45.24\%, respectively. To understand the reasons behind this
surprisingly good result, we further explore the mechanism underlying such
adversarial robustness. We show that the adaptive-stepsize numerical ODE
solver, DOPRI5, has a gradient-masking effect that causes PGD attacks, which
are sensitive to gradient information of the training loss, to fail; on the
other hand, it cannot fool the CW attack, which uses more robust gradients, or
the gradient-free SPSA attack. This provides a new explanation that the adversarial
robustness of ODE-based networks mainly comes from the obfuscated gradients in
numerical ODE solvers.
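For intuition on why skew-symmetry relates to stability, the standard linear-dynamics argument (an assumption here, not necessarily the exact proof used in the paper) is that a skew-symmetric state matrix conserves the norm of the state, so trajectories neither blow up nor collapse:

$$\dot h = A h,\quad A^\top = -A \;\Longrightarrow\; \frac{d}{dt}\,\lVert h \rVert_2^2 = h^\top (A + A^\top)\, h = 0 .$$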
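As a concrete illustration of such a block, below is a minimal sketch (not the authors' code) of a linear ODE function with a skew-symmetric weight matrix, integrated with the adaptive-stepsize DOPRI5 solver from the torchdiffeq package; the actual SONet blocks (e.g., convolutional ones) and their exact parameterization may differ.

```python
# Minimal sketch (assumptions, not the authors' code): a linear ODE function
# with a skew-symmetric weight matrix, integrated with the adaptive-stepsize
# DOPRI5 solver via torchdiffeq (pip install torchdiffeq).
import torch
import torch.nn as nn
from torchdiffeq import odeint


class SkewSymmetricODEFunc(nn.Module):
    """dh/dt = tanh((W - W^T) h + b); W - W^T is skew-symmetric by construction."""

    def __init__(self, dim: int):
        super().__init__()
        self.weight = nn.Parameter(0.1 * torch.randn(dim, dim))
        self.bias = nn.Parameter(torch.zeros(dim))

    def forward(self, t, h):
        skew = self.weight - self.weight.t()  # skew^T = -skew
        return torch.tanh(h @ skew.t() + self.bias)


class ODEBlock(nn.Module):
    """Integrates the ODE function from t=0 to t=1 and returns the final state."""

    def __init__(self, func: nn.Module):
        super().__init__()
        self.func = func
        self.register_buffer("t", torch.tensor([0.0, 1.0]))

    def forward(self, h):
        out = odeint(self.func, h, self.t, method="dopri5", rtol=1e-3, atol=1e-4)
        return out[-1]  # state at the final integration time


if __name__ == "__main__":
    block = ODEBlock(SkewSymmetricODEFunc(dim=64))
    x = torch.randn(8, 64)
    print(block(x).shape)  # torch.Size([8, 64])
```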
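The robust-accuracy numbers above are measured under a PGD-20 attack with $\ell_\infty$ budget 0.031. Below is a generic PGD sketch (a standard implementation, not the authors' evaluation code; the step size `alpha` is an assumed typical value) that makes the abstract's point concrete: when the solver masks gradients, the gradient used here is uninformative and PGD overestimates robustness, which is why the CW and gradient-free SPSA attacks are also reported.

```python
# Generic PGD-20 sketch for measuring robust accuracy under an l_inf budget of
# eps = 0.031 (as in the abstract). Standard implementation, not the authors'
# evaluation code; alpha is an assumed, typical step size. Inputs are assumed
# to lie in [0, 1].
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=0.031, alpha=0.007, steps=20):
    """Projected gradient descent within the l_inf ball of radius eps around x."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        # If the adaptive ODE solver masks gradients, this gradient is
        # uninformative and PGD will overestimate robustness.
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the eps-ball and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()
```

Robust accuracy is then the fraction of these adversarial examples that the model still classifies correctly, e.g. `(model(pgd_attack(model, x, y)).argmax(1) == y).float().mean()`.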
Related papers
- Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been believed to be a challenging property to encode into neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
- Data-Driven Lipschitz Continuity: A Cost-Effective Approach to Improve Adversarial Robustness [47.9744734181236]
We explore the concept of Lipschitz continuity to certify the robustness of deep neural networks (DNNs) against adversarial attacks.
We propose a novel algorithm that remaps the input domain into a constrained range, reducing the Lipschitz constant and potentially enhancing robustness.
Our method achieves the best robust accuracy for CIFAR10, CIFAR100, and ImageNet datasets on the RobustBench leaderboard.
arXiv Detail & Related papers (2024-06-28T03:10:36Z)
- How Robust Are Energy-Based Models Trained With Equilibrium Propagation? [4.374837991804085]
Adversarial training is the current state-of-the-art defense against adversarial attacks.
However, it lowers the model's accuracy on clean inputs, is computationally expensive, and offers less robustness to natural noise.
In contrast, energy-based models (EBMs) incorporate feedback connections from each layer to the previous layer, yielding a recurrent, deep-attractor architecture.
arXiv Detail & Related papers (2024-01-21T16:55:40Z)
- Wasserstein distributional robustness of neural networks [9.79503506460041]
Deep neural networks are known to be vulnerable to adversarial attacks (AA).
For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified.
We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions.
arXiv Detail & Related papers (2023-06-16T13:41:24Z)
- Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks [32.88499015927756]
We propose a stable neural ODE with Lyapunov-stable equilibrium points for defending against adversarial attacks (SODEF).
We provide theoretical results that give insights into the stability of SODEF as well as the choice of regularizers to ensure its stability.
arXiv Detail & Related papers (2021-10-25T14:09:45Z)
- Neural Architecture Dilation for Adversarial Robustness [56.18555072877193]
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy.
With minimal computational overhead, the dilation architecture is expected to preserve the standard performance of the backbone CNN.
arXiv Detail & Related papers (2021-08-16T03:58:00Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Adversarial Robustness by Design through Analog Computing and Synthetic Gradients [80.60080084042666]
We propose a new defense mechanism against adversarial attacks inspired by an optical co-processor.
In the white-box setting, our defense works by obfuscating the parameters of the random projection.
We find the combination of a random projection and binarization in the optical system also improves robustness against various types of black-box attacks.
arXiv Detail & Related papers (2021-01-06T16:15:29Z)
- Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness [97.67477497115163]
We use mode connectivity to study the adversarial robustness of deep neural networks.
Our experiments cover various types of adversarial attacks applied to different network architectures and datasets.
Our results suggest that mode connectivity offers a holistic tool and practical means for evaluating and improving adversarial robustness.
arXiv Detail & Related papers (2020-04-30T19:12:50Z)