Optimizing the Adversarial Perturbation with a Momentum-based Adaptive Matrix
- URL: http://arxiv.org/abs/2512.14188v1
- Date: Tue, 16 Dec 2025 08:35:18 GMT
- Title: Optimizing the Adversarial Perturbation with a Momentum-based Adaptive Matrix
- Authors: Wei Tao, Sheng Long, Xin Liu, Wei Li, Qing Tao,
- Abstract summary: We present a novel momentum-based attack AdaMI, in which the perturbation is optimized with an interesting momentum-based adaptive matrix.<n>AdaMI is proved to attain optimal convergence for convex problems, indicating that it addresses the non-convergence issue of MI-FGSM.
- Score: 13.862664606369014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating adversarial examples (AEs) can be formulated as an optimization problem. Among various optimization-based attacks, the gradient-based PGD and the momentum-based MI-FGSM have garnered considerable interest. However, all these attacks use the sign function to scale their perturbations, which raises several theoretical concerns from the point of view of optimization. In this paper, we first reveal that PGD is actually a specific reformulation of the projected gradient method using only the current gradient to determine its step-size. Further, we show that when we utilize a conventional adaptive matrix with the accumulated gradients to scale the perturbation, PGD becomes AdaGrad. Motivated by this analysis, we present a novel momentum-based attack AdaMI, in which the perturbation is optimized with an interesting momentum-based adaptive matrix. AdaMI is proved to attain optimal convergence for convex problems, indicating that it addresses the non-convergence issue of MI-FGSM, thereby ensuring stability of the optimization process. The experiments demonstrate that the proposed momentum-based adaptive matrix can serve as a general and effective technique to boost adversarial transferability over the state-of-the-art methods across different networks while maintaining better stability and imperceptibility.
Related papers
- The Power of Decaying Steps: Enhancing Attack Stability and Transferability for Sign-based Optimizers [9.020853139493239]
We argue that one cause for non-convergence and instability is their non-decaying step-size scheduling.<n>We propose a series of new attack algorithms that enforce Monotonically Decreasing Coordinate-wise Step-sizes within sign-based adversarials.
arXiv Detail & Related papers (2026-02-22T08:37:06Z) - ODELoRA: Training Low-Rank Adaptation by Solving Ordinary Differential Equations [54.886931928255564]
Low-rank adaptation (LoRA) has emerged as a widely adopted parameter-efficient fine-tuning method in deep transfer learning.<n>We propose a novel continuous-time optimization dynamic for LoRA factor matrices in the form of an ordinary differential equation (ODE)<n>We show that ODELoRA achieves stable feature learning, a property that is crucial for training deep neural networks at different scales of problem dimensionality.
arXiv Detail & Related papers (2026-02-07T10:19:36Z) - MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization [56.074760766965085]
Group-Relative Policy Optimization has emerged as an efficient paradigm for aligning Large Language Models (LLMs)<n>We propose MAESTRO, which treats reward scalarization as a dynamic latent policy, leveraging the model's terminal hidden states as a semantic bottleneck.<n>We formulate this as a contextual bandit problem within a bi-level optimization framework, where a lightweight Conductor network co-evolves with the policy by utilizing group-relative advantages as a meta-reward signal.
arXiv Detail & Related papers (2026-01-12T05:02:48Z) - Learning Dynamic Representations via An Optimally-Weighted Maximum Mean Discrepancy Optimization Framework for Continual Learning [16.10753846850319]
Continual learning allows models to persistently acquire and retain information.<n> catastrophic forgetting can severely impair model performance.<n>We introduce a novel framework termed Optimally-Weighted Mean Discrepancy (OWMMD), which imposes penalties on representation alterations.
arXiv Detail & Related papers (2025-01-21T13:33:45Z) - A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO)
We introduce a novel perspective by turning a given BLO problem into a ii optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.<n>To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.<n>Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z) - Implicit Bias and Fast Convergence Rates for Self-attention [26.766649949420746]
We study the fundamental optimization principles of self-attention, the defining mechanism of transformers.<n>We analyze the implicit bias of gradient-baseds in a self-attention layer with a decoder in a linear classification.
arXiv Detail & Related papers (2024-02-08T15:15:09Z) - Stochastic Optimal Control Matching [53.156277491861985]
Our work introduces Optimal Control Matching (SOCM), a novel Iterative Diffusion Optimization (IDO) technique for optimal control.
The control is learned via a least squares problem by trying to fit a matching vector field.
Experimentally, our algorithm achieves lower error than all the existing IDO techniques for optimal control.
arXiv Detail & Related papers (2023-12-04T16:49:43Z) - Adapting Step-size: A Unified Perspective to Analyze and Improve
Gradient-based Methods for Adversarial Attacks [21.16546620434816]
We provide a unified theoretical interpretation of gradient-based adversarial learning methods.
We show that each of these algorithms is in fact a specific reformulation of their original gradient methods.
We present a broad class of adaptive gradient-based algorithms based on the regular gradient methods.
arXiv Detail & Related papers (2023-01-27T06:17:51Z) - An Empirical Evaluation of Zeroth-Order Optimization Methods on
AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives.
We show the advantages of ZO sign-based gradient descent (ZO-signGD)
We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
arXiv Detail & Related papers (2022-10-27T01:58:10Z) - Analysis of Dual-Based PID Controllers through Convolutional Mirror
Descent [20.512667802427675]
This paper provides the first regret bounds on the performance of dual-based PID controllers for online allocation problems.
We establish a fundamental connection between dual-based PID controllers and a new first-order algorithm for online convex optimization called emphConvolutional Mirror Descent (CMD)
We provide the first regret bound for CMD for non-smooth convex optimization, which might be of independent interest.
arXiv Detail & Related papers (2022-02-12T21:55:53Z) - EOS: a Parallel, Self-Adaptive, Multi-Population Evolutionary Algorithm
for Constrained Global Optimization [68.8204255655161]
EOS is a global optimization algorithm for constrained and unconstrained problems of real-valued variables.
It implements a number of improvements to the well-known Differential Evolution (DE) algorithm.
Results prove that EOSis capable of achieving increased performance compared to state-of-the-art single-population self-adaptive DE algorithms.
arXiv Detail & Related papers (2020-07-09T10:19:22Z) - A Dynamical Systems Approach for Convergence of the Bayesian EM
Algorithm [59.99439951055238]
We show how (discrete-time) Lyapunov stability theory can serve as a powerful tool to aid, or even lead, in the analysis (and potential design) of optimization algorithms that are not necessarily gradient-based.
The particular ML problem that this paper focuses on is that of parameter estimation in an incomplete-data Bayesian framework via the popular optimization algorithm known as maximum a posteriori expectation-maximization (MAP-EM)
We show that fast convergence (linear or quadratic) is achieved, which could have been difficult to unveil without our adopted S&C approach.
arXiv Detail & Related papers (2020-06-23T01:34:18Z) - An adaptive stochastic gradient-free approach for high-dimensional
blackbox optimization [0.0]
We propose an adaptive gradient-free (ASGF) approach for high-dimensional non-smoothing problems.
We illustrate the performance of this method on benchmark global problems and learning tasks.
arXiv Detail & Related papers (2020-06-18T22:47:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.