Boosting Adversarial Transferability with Low-Cost Optimization via Maximin Expected Flatness
- URL: http://arxiv.org/abs/2405.16181v2
- Date: Fri, 01 Aug 2025 04:16:04 GMT
- Title: Boosting Adversarial Transferability with Low-Cost Optimization via Maximin Expected Flatness
- Authors: Chunlin Qiu, Ang Li, Yiheng Duan, Shenyi Zhang, Yuanjie Zhang, Lingchen Zhao, Qian Wang
- Abstract summary: Transfer-based attacks craft adversarial examples on surrogate models and deploy them against black-box target models, offering model-agnostic and query-free threat scenarios. While flatness-enhanced methods have recently emerged to improve transferability, their divergent flatness definitions and attack designs suffer from unexamined optimization limitations and missing theoretical foundation. This work exposes the severely imbalanced exploitation-exploration dynamics in flatness optimization, establishing the first theoretical foundation for flatness-based transferability.
- Score: 12.70766510980057
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer-based attacks craft adversarial examples on white-box surrogate models and directly deploy them against black-box target models, offering model-agnostic and query-free threat scenarios. While flatness-enhanced methods have recently emerged to improve transferability by enhancing the loss surface flatness of adversarial examples, their divergent flatness definitions and heuristic attack designs suffer from unexamined optimization limitations and missing theoretical foundation, thus constraining their effectiveness and efficiency. This work exposes the severely imbalanced exploitation-exploration dynamics in flatness optimization, establishing the first theoretical foundation for flatness-based transferability and proposing a principled framework to overcome these optimization pitfalls. Specifically, we systematically unify fragmented flatness definitions across existing methods, revealing their imbalanced optimization limitations in over-exploration of sensitivity peaks or over-exploitation of local plateaus. To resolve these issues, we rigorously formalize average-case flatness and transferability gaps, proving that enhancing zeroth-order average-case flatness minimizes cross-model discrepancies. Building on this theory, we design a Maximin Expected Flatness (MEF) attack that enhances zeroth-order average-case flatness while balancing flatness exploration and exploitation. Extensive evaluations across 22 models and 24 current transfer-based attacks demonstrate MEF's superiority: it surpasses the state-of-the-art PGN attack by 4% in attack success rate at half the computational cost and achieves 8% higher success rate under the same budget. When combined with input augmentation, MEF attains 15% additional gains against defense-equipped models, establishing new robustness benchmarks. Our code is available at https://github.com/SignedQiu/MEFAttack.
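To make the MEF idea concrete, below is a minimal PyTorch-style sketch of what one maximin expected-flatness update could look like. The function name, neighborhood sampling scheme, and hyperparameters are our own assumptions for illustration only; the authors' actual algorithm is in the repository linked above.

```python
import torch

def mef_style_step(model, loss_fn, x, x_adv, y,
                   eps=8/255, alpha=2/255, n_samples=4, rho=0.05):
    """One illustrative maximin-expected-flatness update (hypothetical sketch).

    Sample random points in a rho-neighborhood of the current adversarial
    example, find the worst case (the sample with the LOWEST loss, i.e.
    where flatness currently fails), and take a signed gradient step from
    there. Iterating pushes x_adv toward a flat, uniformly high-loss plateau.
    """
    losses, grads = [], []
    for _ in range(n_samples):
        x_near = x_adv + torch.empty_like(x_adv).uniform_(-rho, rho)
        x_near = x_near.detach().requires_grad_(True)
        loss = loss_fn(model(x_near), y)
        grads.append(torch.autograd.grad(loss, x_near)[0])
        losses.append(loss.detach())
    worst = int(torch.stack(losses).argmin())     # maximin: weakest neighbor
    x_adv = x_adv + alpha * grads[worst].sign()   # ascend at the weak point
    x_adv = x + (x_adv - x).clamp(-eps, eps)      # project into the eps-ball
    return x_adv.clamp(0, 1).detach()
```

Balancing how many neighborhood samples are drawn (exploration) against how aggressively the weakest point is ascended (exploitation) is exactly the trade-off the abstract highlights.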
Related papers
- Divergence Minimization Preference Optimization for Diffusion Model Alignment [58.651951388346525]
Divergence Minimization Preference Optimization (DMPO) is a principled method for aligning diffusion models by minimizing reverse KL divergence. Our results show that diffusion models fine-tuned with DMPO can consistently outperform or match existing techniques. DMPO unlocks a robust and elegant pathway for preference alignment, bridging principled theory with practical performance in diffusion models.
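For reference, the reverse KL divergence minimized here takes expectations under the model distribution (notation ours), unlike the forward KL implicit in maximum-likelihood training:

$$ D_{\mathrm{KL}}\big(p_\theta \,\|\, p_{\mathrm{ref}}\big) = \mathbb{E}_{x \sim p_\theta}\!\left[\log \frac{p_\theta(x)}{p_{\mathrm{ref}}(x)}\right] $$

This direction is mode-seeking: the model is penalized for placing mass where the reference places little. How DMPO estimates it tractably for diffusion models is the subject of the paper.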
arXiv Detail & Related papers (2025-07-10T07:57:30Z)
- MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models [56.09354775405601]
Model extraction attacks aim to replicate the functionality of a black-box model through query access. Most existing defenses presume that attacker queries contain out-of-distribution (OOD) samples, enabling them to detect and disrupt suspicious inputs. We propose MISLEADER, a novel defense strategy that does not rely on OOD assumptions.
arXiv Detail & Related papers (2025-06-03T01:37:09Z)
- Seeking Flat Minima over Diverse Surrogates for Improved Adversarial Transferability: A Theoretical Framework and Algorithmic Instantiation [38.12499933796839]
We propose a novel transferability bound that offers provable guarantees for adversarial transferability. Our theoretical results demonstrate that optimizing AEs toward flat minima over the surrogate model set, while controlling the surrogate-target model shift measured by the adversarial model discrepancy, yields a comprehensive guarantee for AE transferability.
arXiv Detail & Related papers (2025-04-23T07:33:45Z)
- Understanding Flatness in Generative Models: Its Role and Benefits [9.775257597631244]
Flat minima, known to enhance robustness in supervised learning, remain largely unexplored in generative models. We establish theoretically that flatter minima improve robustness against perturbations in target prior distributions. We demonstrate that flat minima in diffusion models indeed improve not only generative performance but also robustness.
arXiv Detail & Related papers (2025-03-14T04:38:53Z)
- Byzantine-Resilient Federated Learning via Distributed Optimization [3.2075234058213757]
Byzantine attacks present a critical challenge to Federated Learning (FL). Traditional FL frameworks rely on aggregation-based protocols for model updates, leaving them vulnerable to sophisticated adversarial strategies. We show that the Primal-Dual Method of Multipliers (PDMM) inherently mitigates Byzantine impacts by leveraging its fault-tolerant consensus mechanism.
arXiv Detail & Related papers (2025-03-13T18:34:42Z)
- Improving the Transferability of Adversarial Attacks by an Input Transpose [13.029909541428767]
In this work, we propose an input transpose method that requires almost no additional labor or computation cost yet can significantly improve the transferability of existing adversarial strategies.
Our exploration finds that on specific datasets, a mere $1^\circ$ left or right rotation might be sufficient for most adversarial examples to deceive unseen models.
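To make the "almost no additional cost" claim concrete, both a spatial transpose and a tiny rotation are one-liners in PyTorch. The sketch below is our own illustration (the tensor layout and the 1-degree angle are assumptions drawn from the summary), not the authors' code:

```python
import torchvision.transforms.functional as TF

def transpose_input(x_adv):
    """Swap the spatial axes of a batch of images shaped (N, C, H, W)."""
    return x_adv.transpose(-2, -1)

def rotate_input(x_adv, angle=1.0):
    """Rotate by a tiny angle; per the summary, ~1 degree can suffice."""
    return TF.rotate(x_adv, angle)
```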
arXiv Detail & Related papers (2025-03-02T15:13:41Z)
- Improving Transferable Targeted Attacks with Feature Tuning Mixup [12.707753562907534]
Deep neural networks exhibit vulnerability to adversarial examples that can transfer across different models.
We propose Feature Tuning Mixup (FTM) to enhance targeted attack transferability.
Our method achieves significant improvements over state-of-the-art methods while maintaining low computational cost.
arXiv Detail & Related papers (2024-11-23T13:18:25Z)
- Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
- Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation [49.480978190805125]
Transfer attacks attract significant interest for black-box applications.
Existing works essentially optimize a single-level objective w.r.t. the surrogate model directly.
We propose a bilevel optimization paradigm, which explicitly reforms the nested relationship between the Upper-Level (UL) pseudo-victim attacker and the Lower-Level (LL) surrogate attacker.
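Schematically, such a nested objective can be written as a bilevel program. The notation below is ours and is only meant to convey the upper-level/lower-level structure, not the paper's exact formulation:

$$ \max_{\delta_0}\; L\big(f_{\mathrm{UL}}(x + \delta^{*}(\delta_0)),\, y\big) \quad \text{s.t.} \quad \delta^{*}(\delta_0) = \arg\max_{\delta}\; L\big(f_{\mathrm{LL}}(x + \delta),\, y\big)\ \text{with}\ \delta\ \text{initialized at}\ \delta_0 $$

where $f_{\mathrm{UL}}$ is the pseudo-victim attacker's model and $f_{\mathrm{LL}}$ the surrogate, so the upper level tunes the initialization from which the lower-level surrogate attack is run.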
arXiv Detail & Related papers (2024-06-04T07:45:27Z)
- STBA: Towards Evaluating the Robustness of DNNs for Query-Limited Black-box Scenario [50.37501379058119]
We propose the Spatial Transform Black-box Attack (STBA) to craft formidable adversarial examples in the query-limited scenario.
We show that STBA could effectively improve the imperceptibility of the adversarial examples and remarkably boost the attack success rate under query-limited settings.
arXiv Detail & Related papers (2024-03-30T13:28:53Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- Improving Transferability of Adversarial Examples via Bayesian Attacks [84.90830931076901]
We introduce a novel extension by incorporating the Bayesian formulation into the model input as well, enabling the joint diversification of both the model input and model parameters.
Our method achieves a new state-of-the-art on transfer-based attacks, improving the average success rate on ImageNet and CIFAR-10 by 19.14% and 2.08%, respectively.
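The core mechanism, averaging attack gradients over randomized copies of both the input and the surrogate's parameters, could be sketched as follows. This is a hypothetical illustration using isotropic Gaussian noise; the paper's Bayesian posterior approximation is more principled:

```python
import copy
import torch

def bayesian_style_grad(model, loss_fn, x_adv, y,
                        n_samples=4, sigma_w=0.01, sigma_x=0.03):
    """Average gradients over jointly perturbed (weights, input) samples.

    Each sample sees a slightly different model and a slightly different
    input -- a crude stand-in for integrating the attack objective over a
    posterior on both quantities.
    """
    grad_sum = torch.zeros_like(x_adv)
    for _ in range(n_samples):
        noisy_model = copy.deepcopy(model)
        with torch.no_grad():
            for p in noisy_model.parameters():
                p.add_(sigma_w * torch.randn_like(p))   # diversify weights
        x_in = (x_adv.detach()
                + sigma_x * torch.randn_like(x_adv)).requires_grad_(True)
        loss = loss_fn(noisy_model(x_in), y)            # diversify input
        grad_sum += torch.autograd.grad(loss, x_in)[0]
    return grad_sum / n_samples
```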
arXiv Detail & Related papers (2023-07-21T03:43:07Z)
- Logit Margin Matters: Improving Transferable Targeted Adversarial Attack by Logit Calibration [85.71545080119026]
The Cross-Entropy (CE) loss function is insufficient for learning transferable targeted adversarial examples.
We propose two simple and effective logit calibration methods, which are achieved by downscaling the logits with a temperature factor and an adaptive margin.
Experiments conducted on the ImageNet dataset validate the effectiveness of the proposed methods.
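A minimal sketch of the temperature half of this calibration for a targeted attack loss (the temperature value is an assumption; the paper's adaptive-margin variant is more elaborate):

```python
import torch.nn.functional as F

def calibrated_targeted_loss(logits, target, temperature=5.0):
    """Cross-entropy toward the target class on downscaled logits.

    Dividing logits by T > 1 softens the softmax, so gradients stay
    informative even once the target logit dominates, letting iterative
    targeted attacks keep making progress.
    """
    return F.cross_entropy(logits / temperature, target)
```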
arXiv Detail & Related papers (2023-03-07T06:42:52Z)
- Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples [89.85593878754571]
The transferability of adversarial examples across deep neural networks is the crux of many black-box attacks.
We advocate attacking a Bayesian model to achieve desirable transferability.
Our method outperforms recent state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-02-10T07:08:13Z)
- Improving Adversarial Transferability with Scheduled Step Size and Dual Example [33.00528131208799]
We show that the transferability of adversarial examples generated by the iterative fast gradient sign method decreases as the number of iterations increases.
We propose a novel strategy, which uses the Scheduled step size and the Dual example (SD) to fully utilize the adversarial information near the benign sample.
Our proposed strategy can be easily integrated with existing adversarial attack methods for better adversarial transferability.
arXiv Detail & Related papers (2023-01-30T15:13:46Z)
- Transfer Attacks Revisited: A Large-Scale Empirical Study in Real Computer Vision Settings [64.37621685052571]
We conduct the first systematic empirical study of transfer attacks against major cloud-based ML platforms.
The study leads to a number of interesting findings that are inconsistent with existing ones.
We believe this work sheds light on the vulnerabilities of popular ML platforms and points to a few promising research directions.
arXiv Detail & Related papers (2022-04-07T12:16:24Z)
- Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks [56.96241557830253]
Transfer-based adversarial attacks can effectively evaluate model robustness in the black-box setting.
We propose a conditional generative attacking model, which can generate adversarial examples targeted at different classes.
Our method improves the success rates of targeted black-box attacks by a significant margin over the existing methods.
arXiv Detail & Related papers (2021-07-05T06:17:47Z)
- TRS: Transferability Reduced Ensemble via Encouraging Gradient Diversity and Model Smoothness [14.342349428248887]
Adversarial transferability is an intriguing property of adversarial examples.
This paper theoretically analyzes sufficient conditions for transferability between models.
We propose a practical algorithm to reduce transferability within an ensemble to improve its robustness.
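The gradient-diversity ingredient could be encouraged with a cosine-similarity penalty between ensemble members' input gradients, roughly as below. This is a simplified sketch of one plausible regularizer, not the full TRS training objective:

```python
import torch
import torch.nn.functional as F

def gradient_diversity_penalty(models, loss_fn, x, y):
    """Penalize aligned input gradients across ensemble members.

    Low pairwise cosine similarity between members' input gradients makes
    it harder for an adversarial example crafted on one member to transfer
    to the others.
    """
    grads = []
    for m in models:
        x_in = x.detach().requires_grad_(True)
        loss = loss_fn(m(x_in), y)
        g = torch.autograd.grad(loss, x_in, create_graph=True)[0]
        grads.append(g.flatten(1))                    # shape (N, C*H*W)
    penalty = x.new_zeros(())
    for i in range(len(grads)):
        for j in range(i + 1, len(grads)):
            penalty = penalty + F.cosine_similarity(
                grads[i], grads[j], dim=1).mean()
    return penalty                                    # add to training loss
```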
arXiv Detail & Related papers (2021-04-01T17:58:35Z)
- Defense-guided Transferable Adversarial Attacks [5.2206166148727835]
Adversarial examples are hard to transfer to unknown models.
We design a max-min framework inspired by input transformations, which are beneficial to both adversarial attack and defense.
Our method is expected to be a benchmark for assessing the robustness of deep models.
arXiv Detail & Related papers (2020-10-22T08:51:45Z)