Fast Gradient Non-sign Methods
- URL: http://arxiv.org/abs/2110.12734v1
- Date: Mon, 25 Oct 2021 08:46:00 GMT
- Title: Fast Gradient Non-sign Methods
- Authors: Yaya Cheng, Xiaosu Zhu, Qilong Zhang, Lianli Gao, Jingkuan Song
- Abstract summary: Fast Gradient Non-sign Method (FGNM) is a general routine that can seamlessly replace the conventional $\mathrm{sign}$ operation in gradient-based attacks.
Our methods outperform their sign-based counterparts by 27.5% at most and 9.5% on average.
- Score: 67.56549792690706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks have succeeded in fooling DNNs, and among them, gradient-based algorithms have become one of the mainstreams. Based on the linearity hypothesis (FGSM), under the $\ell_\infty$ constraint, applying the $\mathrm{sign}$ operation to the gradients is a good choice for generating perturbations. However, this operation has a side-effect: it introduces a bias in direction between the real gradients and the perturbations. In other words, current methods contain a gap between the real gradients and the actual noises, which leads to biased and inefficient attacks. Therefore, in this paper, the bias is analyzed theoretically based on the Taylor expansion, and a correction of $\mathrm{sign}$, i.e., the Fast Gradient Non-sign Method (FGNM), is further proposed. Notably, FGNM is a general routine that can seamlessly replace the conventional $\mathrm{sign}$ operation in gradient-based attacks with negligible extra computational cost. Extensive experiments demonstrate the effectiveness of our methods: ours outperform the sign-based baselines by 27.5% at most and 9.5% on average. Our anonymous code is publicly available: https://git.io/mm-fgnm
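For intuition, the following is a minimal NumPy sketch of the non-sign idea described in the abstract: instead of taking the element-wise sign of the gradient, the raw gradient is rescaled so that its overall magnitude matches that of the sign vector, which keeps the perturbation direction aligned with the true gradient. The rescaling rule and function names below are illustrative assumptions rather than the paper's exact implementation; see the linked repository for the authors' code.

```python
import numpy as np

def fgsm_step(grad, eps):
    """Conventional sign-based step: perturbation = eps * sign(gradient)."""
    return eps * np.sign(grad)

def fgnm_step(grad, eps):
    """Non-sign step (illustrative): rescale the raw gradient so its L2 norm
    matches that of the sign vector, then scale by the budget eps. This keeps
    the perturbation direction aligned with the true gradient."""
    sign_norm = np.linalg.norm(np.sign(grad))   # = sqrt(#nonzero entries)
    grad_norm = np.linalg.norm(grad) + 1e-12    # avoid division by zero
    return eps * (sign_norm / grad_norm) * grad

# Toy example: a gradient whose entries differ strongly in magnitude.
g = np.array([0.9, -0.01, 0.02, -0.5])
print(fgsm_step(g, eps=0.03))   # every entry pushed to +/- eps
print(fgnm_step(g, eps=0.03))   # large-gradient entries dominate
```

In a full attack, the perturbed input would additionally be clipped to the $\ell_\infty$ ball of radius eps and to the valid pixel range, as in standard gradient-based attacks.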
Related papers
- Novel Gradient Sparsification Algorithm via Bayesian Inference [27.246907664193156]
This paper proposes a novel sparsification algorithm called regularized Top-$k$ (RegTop-$k$) that controls the learning rate scaling of error accumulation.
Numerical experiments with ResNet-18 on CIFAR-10 show that RegTop-$k$ achieves about $8\%$ higher accuracy than standard Top-$k$.
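For context, the sketch below shows the standard Top-$k$ sparsifier with error accumulation that RegTop-$k$ is compared against; the regularized learning-rate scaling that RegTop-$k$ adds on top of this is not shown, and the function name is an assumption for illustration.

```python
import numpy as np

def topk_with_error_feedback(grad, error, k):
    """Standard Top-k sparsification with error accumulation: add the residual
    from previous rounds, keep only the k largest-magnitude entries, and carry
    the rest forward as new error."""
    corrected = grad + error
    idx = np.argsort(np.abs(corrected))[-k:]   # indices of k largest entries
    sparse = np.zeros_like(corrected)
    sparse[idx] = corrected[idx]               # transmitted gradient
    new_error = corrected - sparse             # residual kept locally
    return sparse, new_error

# Toy example: compress a gradient to its 2 largest entries.
g = np.array([0.1, -2.0, 0.05, 0.7, -0.3])
err = np.zeros_like(g)
sparse_g, err = topk_with_error_feedback(g, err, k=2)
print(sparse_g)   # only the entries -2.0 and 0.7 survive
```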
arXiv Detail & Related papers (2024-09-23T10:42:34Z)
- Rethinking PGD Attack: Is Sign Function Necessary? [131.6894310945647]
We present a theoretical analysis of how such sign-based update algorithm influences step-wise attack performance.
We propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.
The effectiveness of the proposed RGD algorithm has been demonstrated extensively in experiments.
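To make the contrast concrete, here is a minimal NumPy sketch of one sign-based PGD-style update next to a sign-free update that uses the raw gradient direction; the step-size normalization here is an illustrative assumption, not necessarily the exact update rule used by RGD.

```python
import numpy as np

def pgd_sign_step(x, x_orig, grad, alpha, eps):
    """One sign-based PGD step: move along sign(grad), then project back
    onto the l_inf ball of radius eps around the original input."""
    x_new = x + alpha * np.sign(grad)
    return np.clip(x_new, x_orig - eps, x_orig + eps)

def raw_gradient_step(x, x_orig, grad, alpha, eps):
    """A sign-free step: move along the raw gradient, normalized to unit
    l_inf norm so the step size stays comparable, then project back."""
    g = grad / (np.max(np.abs(grad)) + 1e-12)
    x_new = x + alpha * g
    return np.clip(x_new, x_orig - eps, x_orig + eps)

# Toy example with a fixed gradient.
x0 = np.array([0.2, 0.5, 0.8])
g  = np.array([0.9, -0.01, 0.3])
print(pgd_sign_step(x0, x0, g, alpha=0.01, eps=0.03))
print(raw_gradient_step(x0, x0, g, alpha=0.01, eps=0.03))
```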
arXiv Detail & Related papers (2023-12-03T02:26:58Z)
- Generalization Bounds for Gradient Methods via Discrete and Continuous Prior [8.76346911214414]
We show a new high-probability generalization bound of order $O\!\left(\frac{1}{n} + \frac{L^2}{n^2}\sum_{t=1}^{T}(\gamma_t/\varepsilon_t)^2\right)$ for gradient Langevin Dynamics (GLD).
We can also obtain new bounds for certain variants of SGD.
arXiv Detail & Related papers (2022-05-27T07:23:01Z)
- ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally! [6.170898159041277]
We introduce ProxSkip, a surprisingly simple and provably efficient method for minimizing the sum of a smooth ($f$) and an expensive nonsmooth ($\psi$) function.
Our main motivation comes from federated learning, where evaluation of the gradient operator corresponds to taking a local GD step independently on all devices.
arXiv Detail & Related papers (2022-02-18T18:56:05Z)
- Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(\epsilon^{-7/4})$ Complexity [70.65867695317633]
We propose two simple accelerated gradient methods: restarted accelerated gradient descent (AGD) and the restarted heavy-ball (HB) method.
We establish that our methods find an $\epsilon$-approximate first-order stationary point within $O(\epsilon^{-7/4})$ gradient iterations.
Our algorithms are simple in the sense that they only consist of Nesterov's classical AGD or Polyak's HB, together with a restart mechanism.
arXiv Detail & Related papers (2022-01-27T10:04:04Z)
- Staircase Sign Method for Boosting Adversarial Attacks [123.19227129979943]
Crafting adversarial examples for the transfer-based attack is challenging and remains a research hot spot.
We propose a novel Staircase Sign Method (S$^2$M) to alleviate this issue, thus boosting transfer-based attacks.
Our method can be generally integrated into any transfer-based attacks, and the computational overhead is negligible.
arXiv Detail & Related papers (2021-04-20T02:31:55Z)
- Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point in the optimal number of stochastic gradient and Hessian-vector product computations.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
arXiv Detail & Related papers (2021-03-04T19:01:20Z)
- Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on over-parametrized objective could go beyond the lazy training regime and utilize certain low-rank structure in the data.
arXiv Detail & Related papers (2020-10-22T00:32:12Z)
- The Geometry of Sign Gradient Descent [29.8753797565422]
We show a close connection between separable smoothness and $\ell_\infty$-smoothness and argue that the latter is the weaker and more natural assumption.
We then proceed to study the smoothness constant with respect to the $\ell_\infty$-norm and thereby isolate geometric properties of the objective function.
In short, we find sign-based methods to be preferable over gradient descent if (i) the Hessian is to some degree concentrated on its diagonal, and (ii) its maximal eigenvalue is much larger than the average eigenvalue.
arXiv Detail & Related papers (2020-02-19T08:45:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.