Defenses in Adversarial Machine Learning: A Survey
- URL: http://arxiv.org/abs/2312.08890v1
- Date: Wed, 13 Dec 2023 15:42:55 GMT
- Title: Defenses in Adversarial Machine Learning: A Survey
- Authors: Baoyuan Wu, Shaokui Wei, Mingli Zhu, Meixi Zheng, Zihao Zhu, Mingda
Zhang, Hongrui Chen, Danni Yuan, Li Liu, Qingshan Liu
- Abstract summary: The adversarial phenomenon has been widely observed in machine learning (ML) systems, especially in those using deep neural networks.
Several advanced attack paradigms have been developed to explore it, mainly including backdoor attacks, weight attacks, and adversarial examples.
For each attack paradigm, various defense paradigms have been developed to improve model robustness against it.
This survey aims to build a systematic review of all existing defense paradigms from a unified perspective.
- Score: 46.41995115842852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The adversarial phenomenon has been widely observed in machine learning (ML)
systems, especially in those using deep neural networks: in particular cases, ML
systems may produce predictions that are inconsistent with and incomprehensible to
humans. This phenomenon poses a serious security threat to the practical
application of ML systems, and several advanced attack paradigms have been
developed to explore it, mainly including backdoor attacks, weight attacks, and
adversarial examples. For each individual attack paradigm, various defense
paradigms have been developed to improve the model robustness against the
corresponding attack paradigm. However, due to the independence and diversity of
these defense paradigms, it is difficult to examine the overall robustness of an
ML system against different kinds of attacks. This survey aims to build a
systematic review of all existing defense paradigms from a unified perspective.
Specifically, from the life-cycle perspective, we factorize a complete machine
learning system into five stages: pre-training, training, post-training,
deployment, and inference. Then, we present a clear taxonomy to categorize and
review representative defense methods at each individual stage. The unified
perspective and presented taxonomies not only facilitate the analysis of the
mechanism of each defense paradigm but also help us understand the connections
and differences among different defense paradigms, which may inspire future
research to develop more advanced, comprehensive defenses.
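To make the life-cycle view above concrete, here is a minimal illustrative sketch in Python that maps each of the five stages named in the abstract to a few representative defense families. The stage names come from the abstract itself; the per-stage defense examples and the helper function are assumptions chosen for illustration, not the survey's actual taxonomy.

```python
# Illustrative sketch only: the five stages come from the survey's abstract;
# the per-stage defense examples are common families from the literature and
# are NOT the survey's complete taxonomy.
from enum import Enum


class Stage(Enum):
    PRE_TRAINING = "pre-training"      # data collection and curation
    TRAINING = "training"              # model optimization
    POST_TRAINING = "post-training"    # model repair before release
    DEPLOYMENT = "deployment"          # packaging and serving the model
    INFERENCE = "inference"            # handling individual queries


# Representative (assumed, non-exhaustive) defense families per stage.
DEFENSES = {
    Stage.PRE_TRAINING: ["training-data filtering", "poisoned-sample detection"],
    Stage.TRAINING: ["adversarial training", "robust loss / regularization"],
    Stage.POST_TRAINING: ["backdoor removal via fine-tuning or pruning"],
    Stage.DEPLOYMENT: ["model inspection", "weight integrity checking"],
    Stage.INFERENCE: ["input purification", "adversarial-example detection"],
}


def defenses_for(stage: Stage) -> list[str]:
    """Return the illustrative defense families associated with a stage."""
    return DEFENSES[stage]


if __name__ == "__main__":
    for stage in Stage:
        print(f"{stage.value}: {', '.join(defenses_for(stage))}")
```

Laying the stages out this way makes it straightforward to audit, stage by stage, which of the backdoor, weight-attack, and adversarial-example threats a given ML pipeline has any defense coverage for.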
Related papers
- Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey [50.031628043029244]
Multimodal generative models are susceptible to jailbreak attacks, which can bypass built-in safety mechanisms and induce the production of potentially harmful content.
This survey reviews jailbreak attacks and defenses in multimodal generative models.
arXiv Detail & Related papers (2024-11-14T07:51:51Z)
- Inference Attacks: A Taxonomy, Survey, and Promising Directions [44.290208239143126]
This survey provides an in-depth and comprehensive review of inference attacks and corresponding countermeasures in ML-as-a-service.
We first propose the 3MP taxonomy based on the community research status, trying to normalize the confusing naming system of inference attacks.
Also, we analyze the pros and cons of each type of inference attack, their workflow, countermeasure, and how they interact with other attacks.
arXiv Detail & Related papers (2024-06-04T07:06:06Z)
- Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective [69.25513235556635]
Adversarial machine learning (AML) studies the adversarial phenomenon of machine learning, whereby models may make predictions that are inconsistent with or unexpected by humans.
Some paradigms have been recently developed to explore this adversarial phenomenon occurring at different stages of a machine learning system.
We propose a unified mathematical framework to cover existing attack paradigms.
arXiv Detail & Related papers (2023-02-19T02:12:21Z)
- I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences [0.1031296820074812]
We study model stealing attacks, assessing their performance and exploring corresponding defence techniques in different settings.
We propose a taxonomy for attack and defence approaches, and provide guidelines on how to select the right attack or defence based on the goal and available resources.
arXiv Detail & Related papers (2022-06-16T21:16:41Z)
- A Tutorial on Adversarial Learning Attacks and Countermeasures [0.0]
A machine learning model is capable of making highly accurate predictions without being explicitly programmed to do so.
Adversarial learning attacks pose a serious security threat that greatly undermines the further adoption of such systems.
This paper provides a detailed tutorial on the principles of adversarial learning, explains the different attack scenarios, and gives an in-depth insight into the state-of-the-art defense mechanisms against this rising threat.
arXiv Detail & Related papers (2022-02-21T17:14:45Z)
- Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on modular, re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analyses on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm (a generic, illustrative sketch of iterative adversarial-example generation follows this list).
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
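The last entry above concerns generating sequences of adversarial examples. As a generic illustration of that attack paradigm only, and not of the paper's HMCAM or CAT algorithms, the sketch below performs plain momentum-based iterative perturbation under an L-infinity budget. It assumes a differentiable PyTorch image classifier `model`, a batch of inputs `x` in [0, 1] with shape (N, C, H, W), and integer labels `y`; all of these names and the default hyperparameters are hypothetical placeholders.

```python
# Generic momentum-based iterative adversarial-example sketch (MI-FGSM style).
# This is NOT the HMCAM/CAT algorithm from the cited paper; it only illustrates
# the general attack paradigm that the surveyed defenses aim to counter.
import torch
import torch.nn.functional as F


def momentum_attack(model, x, y, eps=8 / 255, steps=10, decay=1.0):
    """Craft L-infinity bounded adversarial examples for inputs x with labels y."""
    alpha = eps / steps                      # per-step size
    momentum = torch.zeros_like(x)
    x_adv = x.clone().detach()

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]

        # Accumulate a normalized gradient into the momentum term.
        grad = grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True).clamp_min(1e-12)
        momentum = decay * momentum + grad

        # Take a signed step, then project back into the epsilon-ball around x
        # and into the valid pixel range.
        x_adv = x_adv.detach() + alpha * momentum.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)
        x_adv = x_adv.clamp(0.0, 1.0)

    return x_adv.detach()
```

An inference-stage defense of the kind reviewed in the survey would typically be evaluated by how much classification accuracy it recovers on examples crafted this way, relative to accuracy on clean inputs.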