Benchmarking Transferable Adversarial Attacks
- URL: http://arxiv.org/abs/2402.00418v3
- Date: Fri, 16 Feb 2024 08:06:42 GMT
- Title: Benchmarking Transferable Adversarial Attacks
- Authors: Zhibo Jin, Jiayu Zhang, Zhiyu Zhu, Huaming Chen
- Abstract summary: The robustness of deep learning models against adversarial attacks remains a pivotal concern.
This study systematically categorizes and critically evaluates various methodologies developed to augment the transferability of adversarial attacks.
- Score: 6.898135768312255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The robustness of deep learning models against adversarial attacks remains a
pivotal concern. This study presents, for the first time, an exhaustive review
of the transferability aspect of adversarial attacks. It systematically
categorizes and critically evaluates various methodologies developed to augment
the transferability of adversarial attacks. This study encompasses a spectrum
of techniques, including Generative Structure, Semantic Similarity, Gradient
Editing, Target Modification, and Ensemble Approach. Concurrently, this paper
introduces a benchmark framework TAA-Bench, integrating ten leading
methodologies for adversarial attack transferability, thereby providing a
standardized and systematic platform for comparative analysis across diverse
model architectures. Through comprehensive scrutiny, we delineate the efficacy
and constraints of each method, shedding light on their underlying operational
principles and practical utility. This review endeavors to be a quintessential
resource for both scholars and practitioners in the field, charting the complex
terrain of adversarial transferability and setting a foundation for future
explorations in this vital sector. The associated codebase is accessible at:
https://github.com/KxPlaug/TAA-Bench
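The transfer setting the abstract describes can be illustrated with a minimal NumPy sketch (toy linear classifiers invented here for illustration, not part of TAA-Bench): an FGSM-style perturbation is crafted against a surrogate model's gradient and then applied, unchanged, to a separate target model.

```python
import numpy as np

rng = np.random.default_rng(0)

def fgsm_on_surrogate(x, y, w, eps):
    """One FGSM step against a logistic-regression surrogate.

    Loss: L = -log(sigma(y * w.x)) with y in {-1, +1}; its input
    gradient is -y * sigma(-y * w.x) * w, so the attack adds
    eps * sign(gradient) to the input.
    """
    margin = y * (w @ x)
    grad = -y * (1.0 / (1.0 + np.exp(margin))) * w
    return x + eps * np.sign(grad)

# Surrogate and target share most of their weights, mimicking
# architecturally similar models trained on related data.
w_surrogate = np.array([1.0, -2.0, 0.5, 1.5])
w_target = w_surrogate + rng.normal(scale=0.1, size=4)

x = np.array([0.3, -0.1, 0.2, 0.4])
y = 1  # true label

x_adv = fgsm_on_surrogate(x, y, w_surrogate, eps=0.5)

# Although the perturbation was computed only from the surrogate's
# gradient, it also flips the target model's decision: this is the
# transferability phenomenon the benchmarked methods try to amplify.
clean_sign = np.sign(w_target @ x)      # positive: correct
adv_sign = np.sign(w_target @ x_adv)    # negative: fooled
```

The benchmarked method families (generative structure, semantic similarity, gradient editing, target modification, ensembles) can all be read as different strategies for making such a surrogate-crafted perturbation survive the switch to an unseen target.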
Related papers
- Toward Realistic Adversarial Attacks in IDS: A Novel Feasibility Metric for Transferability [0.0]
Transferability-based adversarial attacks exploit the ability of adversarial examples to deceive a specific source Intrusion Detection System (IDS) model.
These attacks exploit common vulnerabilities in machine learning models to bypass security measures and compromise systems.
This paper analyzes the core factors that contribute to transferability, including feature alignment, model architectural similarity, and overlap in the data distributions that each IDS examines.
arXiv Detail & Related papers (2025-04-11T12:15:03Z) - On Transfer-based Universal Attacks in Pure Black-box Setting [94.92884394009288]
We study the role of prior knowledge of the target model data and number of classes in attack performance.
We also provide several interesting insights based on our analysis, and demonstrate that priors cause overestimation in transferability scores.
arXiv Detail & Related papers (2025-04-11T10:41:20Z) - A Survey of Adversarial Defenses in Vision-based Systems: Categorization, Methods and Challenges [4.716918459551686]
Adversarial attacks have emerged as a major challenge to the trustworthy deployment of machine learning models.
We present a comprehensive systematization of knowledge on adversarial defenses, focusing on two key computer vision tasks.
We map these defenses to the types of adversarial attacks and datasets where they are most effective.
arXiv Detail & Related papers (2025-03-01T07:17:18Z) - Adversarial Training for Defense Against Label Poisoning Attacks [53.893792844055106]
Label poisoning attacks pose significant risks to machine learning models.
We propose a novel adversarial training defense strategy based on support vector machines (SVMs) to counter these threats.
Our approach accommodates various model architectures and employs a projected gradient descent algorithm with kernel SVMs for adversarial training.
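The projected-gradient-descent component can be sketched generically (a toy linear loss stands in for the kernel-SVM objective, which the summary does not specify): take signed gradient ascent steps on the attack loss and project each iterate back into the epsilon-ball around the clean input.

```python
import numpy as np

def pgd_linf(x0, grad_fn, eps, step, n_steps):
    """Projected gradient descent in the L-infinity ball around x0.

    grad_fn returns the gradient of the attack loss w.r.t. the input;
    after each ascent step the iterate is clipped back into
    [x0 - eps, x0 + eps], which is the L-infinity projection.
    """
    x = x0.copy()
    for _ in range(n_steps):
        x = x + step * np.sign(grad_fn(x))   # ascent step
        x = np.clip(x, x0 - eps, x0 + eps)   # projection onto the ball
    return x

# Toy stand-in loss L(x) = w.x, whose input gradient is the constant w.
w = np.array([1.0, -1.0, 2.0])
x0 = np.zeros(3)
x_adv = pgd_linf(x0, lambda x: w, eps=0.1, step=0.04, n_steps=10)
```

With a constant gradient the iterates simply saturate at the ball's boundary, `[0.1, -0.1, 0.1]`; adversarial training then refits the model on such worst-case points.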
arXiv Detail & Related papers (2025-02-24T13:03:19Z) - Addressing Key Challenges of Adversarial Attacks and Defenses in the Tabular Domain: A Methodological Framework for Coherence and Consistency [26.645723217188323]
In this paper, we propose new evaluation criteria tailored for adversarial attacks in the tabular domain.
We also introduce a novel technique for perturbing dependent features while maintaining coherence and feature consistency within the sample.
The findings provide valuable insights on the strengths, limitations, and trade-offs of various adversarial attacks in the tabular domain.
arXiv Detail & Related papers (2024-12-10T09:17:09Z) - FEDLAD: Federated Evaluation of Deep Leakage Attacks and Defenses [50.921333548391345]
Federated Learning is a privacy-preserving, decentralized machine learning paradigm.
Recent research has revealed that private ground truth data can be recovered through a gradient technique known as Deep Leakage.
This paper introduces the FEDLAD Framework (Federated Evaluation of Deep Leakage Attacks and Defenses), a comprehensive benchmark for evaluating Deep Leakage attacks and defenses.
arXiv Detail & Related papers (2024-11-05T11:42:26Z) - MIBench: A Comprehensive Benchmark for Model Inversion Attack and Defense [43.71365087852274]
Model Inversion (MI) attacks aim at leveraging the output information of target models to reconstruct privacy-sensitive training data.
The lack of a comprehensive, aligned, and reliable benchmark has emerged as a formidable challenge.
We introduce the first practical benchmark for model inversion attacks and defenses, named MIBench, to address this critical gap.
arXiv Detail & Related papers (2024-10-07T16:13:49Z) - COT: A Generative Approach for Hate Speech Counter-Narratives via Contrastive Optimal Transport [25.73474734479759]
This research paper introduces a novel framework based on contrastive optimal transport.
It effectively addresses the challenges of maintaining target interaction and promoting diversification in generating counter-narratives.
Our proposed model significantly outperforms current methods evaluated by metrics from multiple aspects.
arXiv Detail & Related papers (2024-06-18T06:24:26Z) - Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z) - Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A
Contemporary Survey [114.17568992164303]
Adversarial attacks and defenses in machine learning and deep neural networks have been gaining significant attention.
This survey provides a comprehensive overview of the recent advancements in the field of adversarial attack and defense techniques.
New avenues of attack are also explored, including search-based, decision-based, drop-based, and physical-world attacks.
arXiv Detail & Related papers (2023-03-11T04:19:31Z) - Resisting Adversarial Attacks in Deep Neural Networks using Diverse
Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experimentations using standard image classification datasets, namely MNIST, CIFAR-10 and CIFAR-100 against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - An Intermediate-level Attack Framework on The Basis of Linear Regression [89.85593878754571]
This paper substantially extends our work published at ECCV, in which an intermediate-level attack was proposed to improve the transferability of some baseline adversarial examples.
We advocate to establish a direct linear mapping from the intermediate-level discrepancies (between adversarial features and benign features) to classification prediction loss of the adversarial example.
We show that 1) a variety of linear regression models can all be considered in order to establish the mapping, 2) the magnitude of the finally obtained intermediate-level discrepancy is linearly correlated with adversarial transferability, and 3) further boost of the performance can be achieved by performing multiple runs of the baseline attack with
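The paper's linear-mapping idea can be sketched with synthetic data (the feature discrepancies below are random stand-ins for actual intermediate-layer activations): fit ordinary least squares from intermediate-level discrepancies to the attack's classification loss, then use the fitted direction to score how transferable a candidate perturbation is.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "intermediate-level discrepancies": adversarial minus
# benign features, collected over several baseline-attack runs.
n_runs, feat_dim = 50, 8
discrepancies = rng.normal(size=(n_runs, feat_dim))

# Assume the classification loss is approximately linear in the
# discrepancy, as the paper advocates: loss = discrepancies @ beta + noise.
beta_true = rng.normal(size=feat_dim)
loss = discrepancies @ beta_true + 0.01 * rng.normal(size=n_runs)

# Fit the direct linear mapping with ordinary least squares.
beta_hat, *_ = np.linalg.lstsq(discrepancies, loss, rcond=None)

# A larger projection onto beta_hat predicts a larger loss, i.e. a
# discrepancy direction associated with higher transferability.
scores = discrepancies @ beta_hat
```

This mirrors the paper's finding that discrepancy magnitude along the learned direction correlates linearly with transferability, and that any of several linear regressors suffices to estimate the mapping.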
arXiv Detail & Related papers (2022-03-21T03:54:53Z) - PARL: Enhancing Diversity of Ensemble Networks to Resist Adversarial
Attacks via Pairwise Adversarially Robust Loss Function [13.417003144007156]
Adversarial attacks tend to rely on the principle of transferability.
Ensemble methods against adversarial attacks demonstrate that an adversarial example is less likely to mislead multiple classifiers.
Recent ensemble methods have either been shown to be vulnerable to stronger adversaries or shown to lack an end-to-end evaluation.
arXiv Detail & Related papers (2021-12-09T14:26:13Z) - Towards A Conceptually Simple Defensive Approach for Few-shot
classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.