A Closer Look at the Adversarial Robustness of Deep Equilibrium Models
- URL: http://arxiv.org/abs/2306.01429v1
- Date: Fri, 2 Jun 2023 10:40:30 GMT
- Title: A Closer Look at the Adversarial Robustness of Deep Equilibrium Models
- Authors: Zonghan Yang, Tianyu Pang, Yang Liu
- Abstract summary: We develop approaches to estimate the intermediate gradients of DEQs and integrate them into attack pipelines.
Our approaches facilitate fully white-box evaluations and lead to effective adversarial defense for DEQs.
- Score: 25.787638780625514
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep equilibrium models (DEQs) abandon the traditional layer-stacking
paradigm and instead find the fixed point of a single layer. DEQs have achieved
promising performance on various applications with notable memory
efficiency. At the same time, the adversarial vulnerability of DEQs raises
concerns. Several works propose to certify robustness for monotone DEQs.
However, limited efforts are devoted to studying empirical robustness for
general DEQs. To this end, we observe that an adversarially trained DEQ
requires more forward steps to arrive at the equilibrium state, or even
violates its fixed-point structure. Moreover, the forward and backward tracks of
DEQs are misaligned due to the black-box solvers. These facts cause gradient
obfuscation when applying ready-made attacks to evaluate or adversarially
train DEQs. Given this, we develop approaches to estimate the intermediate
gradients of DEQs and integrate them into attack pipelines. Our
approaches facilitate fully white-box evaluations and lead to effective
adversarial defense for DEQs. Extensive experiments on CIFAR-10 show that the
adversarial robustness of DEQs is competitive with that of deep networks of similar size.
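To make the forward/backward misalignment concrete, below is a minimal PyTorch-style sketch of a DEQ layer; it is a sketch under stated assumptions, not the paper's implementation. The forward pass runs a black-box fixed-point solver outside of autograd, while the backward pass solves a separate implicit adjoint equation, so the two computation tracks need not coincide. The plain-iteration solver, zero initialization, tolerance, and toy layer `f` are illustrative assumptions; practical DEQs typically use faster solvers such as Anderson or Broyden iterations.

```python
import torch
import torch.nn as nn


def fixed_point_solve(g, z0, max_iter=50, tol=1e-4):
    # Naive black-box solver: iterate z <- g(z) until the residual is small.
    z = z0
    for _ in range(max_iter):
        z_new = g(z)
        if (z_new - z).norm() <= tol * (z.norm() + 1e-8):
            return z_new
        z = z_new
    return z  # may return a non-converged iterate


class DEQLayer(nn.Module):
    # A DEQ "stack": a single layer f(z, x) applied until z* = f(z*, x).
    def __init__(self, f):
        super().__init__()
        self.f = f

    def forward(self, x):
        # Forward track: black-box fixed-point solve, invisible to autograd.
        with torch.no_grad():
            z_star = fixed_point_solve(lambda z: self.f(z, x), torch.zeros_like(x))
        # One extra application of f re-engages autograd for the parameters.
        z_star = self.f(z_star, x)

        # Backward track: replace the incoming gradient with the solution of
        # the implicit adjoint equation g = grad_out + (df/dz)^T g, solved by
        # the same kind of black-box iteration as the forward pass.
        z0 = z_star.detach().requires_grad_()
        f0 = self.f(z0, x)

        def implicit_backward(grad):
            return fixed_point_solve(
                lambda g: torch.autograd.grad(f0, z0, g, retain_graph=True)[0] + grad,
                grad,
            )

        if z_star.requires_grad:
            z_star.register_hook(implicit_backward)
        return z_star


# Illustrative toy layer, assumed for this sketch: f(z, x) = tanh(W z + U x).
# lin_z, lin_x = nn.Linear(64, 64), nn.Linear(64, 64)
# deq = DEQLayer(lambda z, x: torch.tanh(lin_z(z) + lin_x(x)))
```

If the forward solver stops short of the true equilibrium, as observed above for adversarially trained DEQs, the gradient produced by the backward hook no longer matches the forward computation, which is exactly the kind of gradient obfuscation the paper sets out to resolve.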
Related papers
- Adversarial Robustness Overestimation and Instability in TRADES [4.063518154926961]
TRADES sometimes yields disproportionately high PGD validation accuracy compared to the AutoAttack testing accuracy in the multiclass classification task.
This discrepancy highlights a significant overestimation of robustness for these instances, potentially linked to gradient masking.
arXiv Detail & Related papers (2024-10-10T07:32:40Z)
- STBA: Towards Evaluating the Robustness of DNNs for Query-Limited Black-box Scenario [50.37501379058119]
We propose the Spatial Transform Black-box Attack (STBA) to craft formidable adversarial examples in the query-limited scenario.
We show that STBA could effectively improve the imperceptibility of the adversarial examples and remarkably boost the attack success rate under query-limited settings.
arXiv Detail & Related papers (2024-03-30T13:28:53Z)
- Defense Against Adversarial Attacks on No-Reference Image Quality Models with Gradient Norm Regularization [18.95463890154886]
No-Reference Image Quality Assessment (NR-IQA) models play a crucial role in the media industry.
These models are found to be vulnerable to adversarial attacks, which introduce imperceptible perturbations to input images.
We propose a defense method to improve the stability in predicted scores when attacked by small perturbations.
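As background, this kind of stability objective is commonly implemented by penalizing the norm of the model's input gradient; the sketch below shows that general recipe rather than the paper's exact objective (the penalty weight `lam` and the assumption of one scalar quality score per image are illustrative).

```python
import torch

def gradient_norm_penalty(model, x, lam=1e-2):
    # Penalize ||d score / d x||_2 so that small input perturbations
    # cannot move the predicted quality score very far.
    x = x.clone().detach().requires_grad_(True)
    score = model(x).sum()  # assumes the model outputs one scalar score per image
    (grad_x,) = torch.autograd.grad(score, x, create_graph=True)
    return lam * grad_x.flatten(1).norm(dim=1).mean()

# Training step (sketch): create_graph=True above makes the penalty itself
# differentiable, so it can simply be added to the task loss.
# loss = task_loss(model(images), targets) + gradient_norm_penalty(model, images)
```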
arXiv Detail & Related papers (2024-03-18T01:11:53Z)
- RECESS Vaccine for Federated Learning: Proactive Defense Against Model Poisoning Attacks [20.55681622921858]
Model poisoning attacks greatly jeopardize the application of federated learning (FL).
In this work, we propose a novel proactive defense named RECESS against model poisoning attacks.
Unlike previous methods that score each iteration, RECESS considers clients' performance correlation across multiple iterations to estimate the trust score.
arXiv Detail & Related papers (2023-10-09T06:09:01Z)
- Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL [86.0987896274354]
We first identify a fundamental pattern, self-excitation, as the primary cause of Q-value estimation divergence in offline RL.
We then propose a novel Self-Excite Eigenvalue Measure (SEEM) metric to measure the evolving properties of the Q-network during training.
Our theory can, for the first time, reliably decide at an early stage whether training will diverge.
arXiv Detail & Related papers (2023-10-06T17:57:44Z)
- FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients).
We propose a defense based on trigger reverse engineering and show that our method can achieve robustness improvements with guarantees.
Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z)
- Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z)
- Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial Robustness [75.30116479840619]
In this paper, we identify a more subtle situation called Imbalanced Gradients that can also cause overestimated adversarial robustness.
The phenomenon of imbalanced gradients occurs when the gradient of one term of the margin loss dominates and pushes the attack towards a suboptimal direction.
We propose a Margin Decomposition (MD) attack that decomposes a margin loss into individual terms and then explores the attackability of these terms separately.
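To make "decomposes a margin loss into individual terms" concrete, the sketch below shows the general idea rather than the authors' exact MD attack: the margin z_y - max_{i != y} z_i is split into its two terms, and the input gradient of each term is computed separately so that neither gradient can drown out the other.

```python
import torch
import torch.nn.functional as F

def margin_terms(logits, y):
    # Margin objective z_y - max_{i != y} z_i, split into its two terms.
    z_y = logits.gather(1, y[:, None]).squeeze(1)
    masked = logits.masked_fill(F.one_hot(y, logits.size(1)).bool(), float("-inf"))
    z_other = masked.max(dim=1).values
    return z_y, z_other

def term_input_gradient(model, x, y, term):
    # Input gradient of ONE margin term; an attack can then alternate between
    # descending on z_y ("correct") and ascending on z_other ("runner_up").
    x = x.clone().detach().requires_grad_(True)
    z_y, z_other = margin_terms(model(x), y)
    obj = z_y.sum() if term == "correct" else z_other.sum()
    (grad,) = torch.autograd.grad(obj, x)
    return grad
```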
arXiv Detail & Related papers (2020-06-24T13:41:37Z)
- Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD attack that overcome failures due to suboptimal step sizes and problems with the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
arXiv Detail & Related papers (2020-03-03T18:15:55Z)
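For reference, the baseline that these extensions build on is the projected gradient descent (PGD) attack; a minimal L-infinity sketch follows (the eps, step size, and iteration count are illustrative defaults, not the paper's settings).

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    # Basic L-infinity PGD: ascend the cross-entropy loss, then project the
    # iterate back into the eps-ball around the clean input.
    x = x.detach()
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the eps-ball
        x_adv = x_adv.clamp(0, 1)                 # keep a valid image
    return x_adv.detach()
```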