Defenses in Adversarial Machine Learning: A Survey
- URL: http://arxiv.org/abs/2312.08890v1
- Date: Wed, 13 Dec 2023 15:42:55 GMT
- Title: Defenses in Adversarial Machine Learning: A Survey
- Authors: Baoyuan Wu, Shaokui Wei, Mingli Zhu, Meixi Zheng, Zihao Zhu, Mingda
Zhang, Hongrui Chen, Danni Yuan, Li Liu, Qingshan Liu
- Abstract summary: The adversarial phenomenon has been widely observed in machine learning (ML) systems, especially in those using deep neural networks.
Several advanced attack paradigms have been developed to explore it, mainly including backdoor attacks, weight attacks, and adversarial examples.
For each attack paradigm, various defense paradigms have been developed to improve model robustness against it.
This survey aims to build a systematic review of all existing defense paradigms from a unified perspective.
- Score: 46.41995115842852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The adversarial phenomenon has been widely observed in machine learning (ML)
systems, especially in those using deep neural networks: in particular cases, ML
systems may produce predictions that are inconsistent with and incomprehensible to
humans. This phenomenon poses a serious security threat to the practical
application of ML systems, and several advanced attack paradigms have been
developed to explore it, mainly including backdoor attacks, weight attacks, and
adversarial examples. For each individual attack paradigm, various defense
paradigms have been developed to improve the model robustness against the
corresponding attack paradigm. However, due to the independence and diversity of
these defense paradigms, it is difficult to examine the overall robustness of an
ML system against different kinds of attacks. This survey aims to build a
systematic review of all existing defense paradigms from a unified perspective.
Specifically, from the life-cycle perspective, we factorize a complete machine
learning system into five stages: pre-training, training, post-training,
deployment, and inference. Then, we present a clear taxonomy to categorize and
review representative defense methods at each individual stage. The unified
perspective and presented taxonomies not only facilitate the analysis of the
mechanism of each defense paradigm but also help us understand the connections
and differences among different defense paradigms, which may inspire future
research to develop more advanced, comprehensive defenses.
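To make the life-cycle view above concrete, here is a minimal illustrative sketch in Python that maps each of the five stages named in the abstract to a few representative defense families. The stage names come from the abstract itself; the per-stage defense examples and the helper function are assumptions chosen for illustration, not the survey's actual taxonomy.

```python
# Illustrative sketch only: the five stages come from the survey's abstract;
# the per-stage defense examples are common families from the literature and
# are NOT the survey's complete taxonomy.
from enum import Enum


class Stage(Enum):
    PRE_TRAINING = "pre-training"      # data collection and curation
    TRAINING = "training"              # model optimization
    POST_TRAINING = "post-training"    # model repair before release
    DEPLOYMENT = "deployment"          # packaging and serving the model
    INFERENCE = "inference"            # handling individual queries


# Representative (assumed, non-exhaustive) defense families per stage.
DEFENSES = {
    Stage.PRE_TRAINING: ["training-data filtering", "poisoned-sample detection"],
    Stage.TRAINING: ["adversarial training", "robust loss / regularization"],
    Stage.POST_TRAINING: ["backdoor removal via fine-tuning or pruning"],
    Stage.DEPLOYMENT: ["model inspection", "weight integrity checking"],
    Stage.INFERENCE: ["input purification", "adversarial-example detection"],
}


def defenses_for(stage: Stage) -> list[str]:
    """Return the illustrative defense families associated with a stage."""
    return DEFENSES[stage]


if __name__ == "__main__":
    for stage in Stage:
        print(f"{stage.value}: {', '.join(defenses_for(stage))}")
```

Laying the stages out this way makes it straightforward to audit, stage by stage, which of the backdoor, weight-attack, and adversarial-example threats a given ML pipeline has any defense coverage for.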
Related papers
- Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey [50.031628043029244]
Multimodal generative models are susceptible to jailbreak attacks, which can bypass built-in safety mechanisms and induce the production of potentially harmful content.
This survey reviews jailbreak attacks and defenses in multimodal generative models.
arXiv Detail & Related papers (2024-11-14T07:51:51Z)
- Inference Attacks: A Taxonomy, Survey, and Promising Directions [44.290208239143126]
This survey provides an in-depth and comprehensive review of inference attacks and corresponding countermeasures in ML-as-a-service.
We first propose the 3MP taxonomy based on the community research status, trying to normalize the confusing naming system of inference attacks.
Also, we analyze the pros and cons of each type of inference attack, their workflow, countermeasure, and how they interact with other attacks.
arXiv Detail & Related papers (2024-06-04T07:06:06Z)
- Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective [69.25513235556635]
Adversarial machine learning (AML) studies the adversarial phenomenon of machine learning, whereby models may make predictions that are inconsistent with or unexpected by humans.
Some paradigms have been recently developed to explore this adversarial phenomenon occurring at different stages of a machine learning system.
We propose a unified mathematical framework to cover existing attack paradigms.
arXiv Detail & Related papers (2023-02-19T02:12:21Z)
- I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences [0.1031296820074812]
We study model stealing attacks, assessing their performance and exploring corresponding defence techniques in different settings.
We propose a taxonomy for attack and defence approaches, and provide guidelines on how to select the right attack or defence based on the goal and available resources.
arXiv Detail & Related papers (2022-06-16T21:16:41Z)
- A Tutorial on Adversarial Learning Attacks and Countermeasures [0.0]
A machine learning model is capable of making highly accurate predictions without being explicitly programmed to do so.
Adversarial learning attacks pose a serious security threat that greatly undermines the further adoption of such systems.
This paper provides a detailed tutorial on the principles of adversarial learning, explains the different attack scenarios, and gives an in-depth insight into the state-of-the-art defense mechanisms against this rising threat.
arXiv Detail & Related papers (2022-02-21T17:14:45Z)
- Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on modular, re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analyses on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm (a generic, illustrative sketch of iterative adversarial-example generation follows this list).
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
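The last entry above concerns generating sequences of adversarial examples. As a generic illustration of that attack paradigm only, and not of the paper's HMCAM or CAT algorithms, the sketch below performs plain momentum-based iterative perturbation under an L-infinity budget. It assumes a differentiable PyTorch image classifier `model`, a batch of inputs `x` in [0, 1] with shape (N, C, H, W), and integer labels `y`; all of these names and the default hyperparameters are hypothetical placeholders.

```python
# Generic momentum-based iterative adversarial-example sketch (MI-FGSM style).
# This is NOT the HMCAM/CAT algorithm from the cited paper; it only illustrates
# the general attack paradigm that the surveyed defenses aim to counter.
import torch
import torch.nn.functional as F


def momentum_attack(model, x, y, eps=8 / 255, steps=10, decay=1.0):
    """Craft L-infinity bounded adversarial examples for inputs x with labels y."""
    alpha = eps / steps                      # per-step size
    momentum = torch.zeros_like(x)
    x_adv = x.clone().detach()

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]

        # Accumulate a normalized gradient into the momentum term.
        grad = grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True).clamp_min(1e-12)
        momentum = decay * momentum + grad

        # Take a signed step, then project back into the epsilon-ball around x
        # and into the valid pixel range.
        x_adv = x_adv.detach() + alpha * momentum.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)
        x_adv = x_adv.clamp(0.0, 1.0)

    return x_adv.detach()
```

An inference-stage defense of the kind reviewed in the survey would typically be evaluated by how much classification accuracy it recovers on examples crafted this way, relative to accuracy on clean inputs.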