Beyond Deceptive Flatness: Dual-Order Solution for Strengthening Adversarial Transferability
- URL: http://arxiv.org/abs/2511.01240v1
- Date: Mon, 03 Nov 2025 05:26:43 GMT
- Title: Beyond Deceptive Flatness: Dual-Order Solution for Strengthening Adversarial Transferability
- Authors: Zhixuan Zhang, Pingyu Wang, Xingjian Zheng, Linbo Qing, Qi Liu,
- Abstract summary: Transferable attacks generate adversarial examples on surrogate models to fool unknown victim models. Recent studies still fall into suboptimal regions, especially the flat-yet-sharp areas termed deceptive flatness. We introduce a novel black-box gradient-based transferable attack from the perspective of dual-order information.
- Score: 15.709841683184415
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transferable attacks generate adversarial examples on surrogate models to fool unknown victim models, posing real-world threats and attracting growing research interest. Despite focusing on flat losses for transferable adversarial examples, recent studies still fall into suboptimal regions, especially the flat-yet-sharp areas termed deceptive flatness. In this paper, we introduce a novel black-box gradient-based transferable attack from the perspective of dual-order information. Specifically, we propose Adversarial Flatness (AF) to address the deceptive flatness problem and provide a theoretical assurance for adversarial transferability. Based on this, using an efficient approximation of our objective, we instantiate our attack as the Adversarial Flatness Attack (AFA), addressing the altered gradient sign issue. Additionally, to further improve attack ability, we devise Monte Carlo Adversarial Sampling (MCAS), which enhances the inner-loop sampling efficiency. Comprehensive results on the ImageNet-compatible dataset demonstrate superiority over six baselines, generating adversarial examples in flatter regions and boosting transferability across model architectures. When tested with input transformation attacks or on the Baidu Cloud API, our method outperforms the baselines.
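As a rough illustration of this recipe (a minimal sketch, not the authors' AFA/MCAS implementation), the snippet below averages gradients over Monte Carlo neighbors of the current adversarial example before each sign-based update; the model handle, step sizes, sampling radius, and sample count are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): a sign-based transfer attack that
# approximates a flatness objective by averaging gradients over Monte Carlo
# samples drawn around the current adversarial example.
import torch
import torch.nn.functional as F

def flatness_aware_attack(model, x, y, eps=16/255, alpha=2/255,
                          steps=10, n_samples=8, radius=8/255):
    x_adv = x.clone().detach()
    for _ in range(steps):
        grad = torch.zeros_like(x_adv)
        for _ in range(n_samples):
            # Probe the local neighborhood inside a small L_inf ball.
            noise = torch.empty_like(x_adv).uniform_(-radius, radius)
            x_nb = (x_adv + noise).requires_grad_(True)
            loss = F.cross_entropy(model(x_nb), y)
            grad += torch.autograd.grad(loss, x_nb)[0]
        grad /= n_samples
        # Ascend the averaged neighborhood gradient with a sign step.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv
```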
Related papers
- AIM: Additional Image Guided Generation of Transferable Adversarial Attacks [72.24101555828256]
Transferable adversarial examples highlight the vulnerability of deep neural networks (DNNs) to imperceptible perturbations across various real-world applications. In this work, we focus on generative approaches for targeted transferable attacks. We introduce a novel plug-and-play module into the general generator architecture to enhance adversarial transferability.
arXiv Detail & Related papers (2025-01-02T07:06:49Z)
- Improving Adversarial Transferability with Neighbourhood Gradient Information [20.55829486744819]
Deep neural networks (DNNs) are susceptible to adversarial examples, leading to significant performance degradation.
This work focuses on enhancing the transferability of adversarial examples to narrow this performance gap.
We propose the NGI-Attack, which incorporates Example Backtracking and Multiplex Mask strategies.
arXiv Detail & Related papers (2024-08-11T10:46:49Z)
- Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds [62.94859179323329]
Adversarial attack methods based on point manipulation for 3D point cloud classification have revealed the fragility of 3D models.
We propose a novel shape-based adversarial attack method, HiT-ADV, which conducts a two-stage search for attack regions based on saliency and imperceptibility perturbation scores.
We propose that by employing benign resampling and benign rigid transformations, we can further enhance physical adversarial strength with little sacrifice to imperceptibility.
arXiv Detail & Related papers (2024-03-08T12:08:06Z)
- Transferability Bound Theory: Exploring Relationship between Adversarial Transferability and Flatness [40.873711834682055]
A prevailing belief is that the higher flatness of adversarial examples enables their better cross-model transferability.
We propose TPA, a Theoretically Provable Attack that optimizes a surrogate of the derived bound to craft adversarial examples.
arXiv Detail & Related papers (2023-11-10T23:10:21Z)
- Transferable Attack for Semantic Segmentation [59.17710830038692]
We study adversarial attacks on semantic segmentation and observe that adversarial examples generated from a source model fail to attack the target models.
We propose an ensemble attack for semantic segmentation to achieve more effective attacks with higher transferability.
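A minimal sketch of such an ensemble attack (illustrative, not the paper's implementation): average the per-pixel cross-entropy losses of several surrogate segmentation models before each sign step. The surrogate list and hyperparameters are assumptions.

```python
# Illustrative sketch, not the paper's code: PGD-style ensemble attack for
# segmentation that averages per-pixel cross-entropy over surrogate models.
import torch
import torch.nn.functional as F

def ensemble_seg_attack(models, x, y, eps=8/255, alpha=2/255, steps=20):
    # models: surrogate nets returning (B, C, H, W) logits; y: (B, H, W) labels
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        # Average losses so no single surrogate dominates the update.
        loss = sum(F.cross_entropy(m(x_adv), y) for m in models) / len(models)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv
```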
arXiv Detail & Related papers (2023-07-31T11:05:55Z)
- Rethinking the Backward Propagation for Adversarial Transferability [12.244490573612286]
Transfer-based attacks generate adversarial examples on the surrogate model, which can mislead other black-box models without access to them.
In this work, we identify that non-linear layers truncate the gradient during backward propagation, making the gradient w.r.t. the input image an imprecise proxy for the loss function.
We propose a novel method to increase the relevance between the gradient w.r.t. input image and loss function so as to generate adversarial examples with higher transferability.
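As an illustration of this idea only (similar in spirit, not the paper's exact method), one can soften the hard 0/1 ReLU derivative during backpropagation so the input gradient retains more loss-relevant signal:

```python
# Illustration of the gradient-truncation idea, not the paper's exact method:
# keep the ReLU forward pass but use a smooth sigmoid as its derivative during
# backpropagation so the input gradient carries more loss-relevant signal.
import torch
import torch.nn as nn

class SoftBackwardReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clamp(min=0)               # standard ReLU forward

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * torch.sigmoid(x)  # smooth surrogate for the 0/1 mask

class SoftReLU(nn.Module):
    def forward(self, x):
        return SoftBackwardReLU.apply(x)

def patch_relus(model):
    # Recursively swap every nn.ReLU for the soft-backward version.
    for name, child in model.named_children():
        if isinstance(child, nn.ReLU):
            setattr(model, name, SoftReLU())
        else:
            patch_relus(child)
```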
arXiv Detail & Related papers (2023-06-22T06:12:23Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed defense, MESAS, is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- StyLess: Boosting the Transferability of Adversarial Examples [10.607781970035083]
Adversarial attacks can mislead deep neural networks (DNNs) by adding imperceptible perturbations to benign examples.
We propose a novel attack method called style-less perturbation (StyLess) to improve attack transferability.
arXiv Detail & Related papers (2023-04-23T08:23:48Z)
- Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation [32.81400759291457]
Adversarial examples, crafted by injecting imperceptible perturbations into benign inputs, can produce erroneous predictions.
In this work, we study the transferability of adversarial examples, which is significant due to its threat to real-world applications.
We propose a novel attack method, dubbed reverse adversarial perturbation (RAP).
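A minimal sketch of that recipe as the summary describes it: an inner loop first finds the neighborhood point with the lowest loss (the "reverse" perturbation), and an outer loop ascends the loss at that point, favoring flat maxima. Step sizes and ball radii are illustrative assumptions, not the paper's settings.

```python
# Sketch of the reverse-perturbation recipe (illustrative hyperparameters):
# inner descent finds the lowest-loss neighbor; outer ascent uses the
# gradient at that worst case, steering toward flat loss maxima.
import torch
import torch.nn.functional as F

def rap_style_attack(model, x, y, eps=16/255, alpha=2/255, steps=10,
                     inner_steps=5, inner_eps=12/255, inner_alpha=4/255):
    x_adv = x.clone().detach()
    for _ in range(steps):
        # Inner loop: reverse perturbation n that minimizes the loss.
        n = torch.zeros_like(x_adv)
        for _ in range(inner_steps):
            n.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv + n), y)
            g = torch.autograd.grad(loss, n)[0]
            n = (n.detach() - inner_alpha * g.sign()).clamp(-inner_eps, inner_eps)
        # Outer step: ascend the loss at the worst-case neighbor.
        x_in = (x_adv + n).requires_grad_(True)
        loss = F.cross_entropy(model(x_in), y)
        g = torch.autograd.grad(loss, x_in)[0]
        x_adv = x_adv.detach() + alpha * g.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv
```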
arXiv Detail & Related papers (2022-10-12T07:17:33Z)
- Enhancing the Self-Universality for Transferable Targeted Attacks [88.6081640779354]
Our new attack method builds on the observation that highly universal adversarial perturbations tend to be more transferable for targeted attacks.
Instead of optimizing perturbations on different images, optimizing on different regions of a single image to achieve self-universality removes the need for extra data.
With the feature similarity loss, our method makes the features from adversarial perturbations more dominant than those of benign images.
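A hedged sketch of such an objective: combine a targeted classification loss on the full adversarial image with a feature-similarity term between the image and a random resized crop of itself. Here feature_fn (a mid-layer feature extractor) and the crop policy are assumptions.

```python
# Hedged sketch: targeted loss on the full adversarial image plus a feature-
# similarity term with a random resized crop of itself. feature_fn (a mid-layer
# extractor) and the crop policy are assumptions, not the paper's exact setup.
import torch
import torch.nn.functional as F

def self_universal_loss(model, feature_fn, x_adv, y_target, lam=1.0):
    B, C, H, W = x_adv.shape
    # Take a random local region and resize it back to input resolution.
    h0 = torch.randint(0, H // 2, (1,)).item()
    w0 = torch.randint(0, W // 2, (1,)).item()
    region = x_adv[:, :, h0:h0 + H // 2, w0:w0 + W // 2]
    region = F.interpolate(region, size=(H, W), mode="bilinear", align_corners=False)
    # Targeted classification loss on the full image.
    cls_loss = F.cross_entropy(model(x_adv), y_target)
    # Make global and local features agree (self-universality).
    f_global, f_local = feature_fn(x_adv), feature_fn(region)
    sim = F.cosine_similarity(f_global.flatten(1), f_local.flatten(1)).mean()
    return cls_loss - lam * sim  # minimize: hit the target class, align features
```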
arXiv Detail & Related papers (2022-09-08T11:21:26Z)
- What Does the Gradient Tell When Attacking the Graph Structure [44.44204591087092]
We present a theoretical demonstration revealing that attackers tend to increase inter-class edges due to the message passing mechanism of GNNs.
By connecting dissimilar nodes, attackers can more effectively corrupt node features, making such attacks more advantageous.
We propose an innovative attack loss that balances attack effectiveness and imperceptibility, sacrificing some attack effectiveness to attain greater imperceptibility.
arXiv Detail & Related papers (2022-08-26T15:45:20Z)
- Learning to Learn Transferable Attack [77.67399621530052]
A transfer adversarial attack is a non-trivial black-box attack that crafts adversarial perturbations on a surrogate model and then applies them to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on the widely-used dataset demonstrate the effectiveness of our method, achieving a 12.85% higher transfer attack success rate than state-of-the-art methods.
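In the same spirit (a generic sketch, not LLTA itself), attack gradients can be averaged over random input transforms (data augmentation) and surrogate variants (model augmentation) so the perturbation does not overfit one view or one model; the transforms are assumed to be differentiable and shape-preserving.

```python
# Generic sketch (not LLTA itself): average the attack gradient over random
# differentiable, shape-preserving input transforms and surrogate variants.
import random
import torch
import torch.nn.functional as F

def augmented_gradient(model_variants, transforms, x_adv, y, n_draws=4):
    x = x_adv.clone().detach().requires_grad_(True)
    total = torch.zeros_like(x)
    for _ in range(n_draws):
        t = random.choice(transforms)      # data augmentation
        m = random.choice(model_variants)  # model augmentation
        loss = F.cross_entropy(m(t(x)), y)
        total += torch.autograd.grad(loss, x)[0]
    return total / n_draws                 # plug into any sign-based update
```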
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative adversarial attacks can overcome this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Local Black-box Adversarial Attacks: A Query Efficient Approach [64.98246858117476]
Adversarial attacks have threatened the application of deep neural networks in security-sensitive scenarios.
We propose a novel framework to perturb the discriminative areas of clean examples only within limited queries in black-box attacks.
We conduct extensive experiments to show that our framework can significantly improve the query efficiency during black-box perturbing with a high attack success rate.
arXiv Detail & Related papers (2021-01-04T15:32:16Z)