A Review of Adversarial Attack and Defense for Classification Methods
- URL: http://arxiv.org/abs/2111.09961v1
- Date: Thu, 18 Nov 2021 22:13:43 GMT
- Title: A Review of Adversarial Attack and Defense for Classification Methods
- Authors: Yao Li, Minhao Cheng, Cho-Jui Hsieh, Thomas C. M. Lee
- Abstract summary: This paper focuses on the generation and guarding of adversarial examples.
It is the hope of the authors that this paper will encourage more statisticians to work on this important and exciting field of generating and defending against adversarial examples.
- Score: 78.50824774203495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the efficiency and scalability of machine learning systems, recent
studies have demonstrated that many classification methods, especially deep
neural networks (DNNs), are vulnerable to adversarial examples; i.e., examples
that are carefully crafted to fool a well-trained classification model while
being indistinguishable from natural data to humans. This makes it potentially
unsafe to apply DNNs or related methods in security-critical areas. Since this
issue was first identified by Biggio et al. (2013) and Szegedy et al. (2014),
much work has been done in this field, including the development of attack
methods to generate adversarial examples and the construction of defense
techniques to guard against such examples. This paper aims to introduce this
topic and its latest developments to the statistical community, primarily
focusing on the generation and guarding of adversarial examples. Computing
code (in Python and R) used in the numerical experiments is publicly
available for readers to explore the surveyed methods. It is the hope of the
authors that this paper will encourage more statisticians to work on this
important and exciting field of generating and defending against adversarial
examples.
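As a concrete illustration of the attack side covered by the survey, the sketch below implements the Fast Gradient Sign Method (FGSM), one canonical way to generate adversarial examples by perturbing an input in the direction of the loss gradient. This is a minimal PyTorch sketch under assumed names (model, x, y, epsilon); it is not the Python/R code released with the paper.

    # Minimal FGSM sketch (illustrative; not the authors' released code).
    import torch
    import torch.nn as nn

    def fgsm_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                    epsilon: float = 0.03) -> torch.Tensor:
        """Return x' = clip(x + epsilon * sign(grad_x loss(model(x), y)), 0, 1)."""
        x_adv = x.clone().detach().requires_grad_(True)
        loss = nn.CrossEntropyLoss()(model(x_adv), y)  # assumes model outputs logits
        loss.backward()
        # Step in the direction that increases the loss, then keep pixels in [0, 1].
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        return x_adv.clamp(0.0, 1.0).detach()

Defenses surveyed in the paper, such as adversarial training, typically retrain the classifier on perturbed examples like the x_adv produced above.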
Related papers
- Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly [79.07074710460012]
The adversarial vulnerability of deep neural networks (DNNs) has drawn great attention.
An increasing number of transfer-based methods have been developed to fool black-box DNN models.
We establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods.
arXiv Detail & Related papers (2023-11-02T15:35:58Z)
- It Is All About Data: A Survey on the Effects of Data on Adversarial Robustness [4.1310970179750015]
Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to confuse the model into making a mistake.
To address this problem, the area of adversarial robustness investigates mechanisms behind adversarial attacks and defenses against these attacks.
arXiv Detail & Related papers (2023-03-17T04:18:03Z)
- Poisoning Attacks and Defenses on Artificial Intelligence: A Survey [3.706481388415728]
Data poisoning attacks tamper with the data samples fed to the model during the training phase, leading to a degradation in the model's accuracy during the inference phase.
This work compiles the most relevant insights and findings from the latest existing literature addressing this type of attack.
A thorough assessment is performed on the reviewed works, comparing the effects of data poisoning on a wide range of ML models in real-world conditions.
arXiv Detail & Related papers (2022-02-21T14:43:38Z)
- TREATED: Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z)
- Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z)
- Adversarial Examples for Unsupervised Machine Learning Models [71.81480647638529]
Adversarial examples causing evasive predictions are widely used to evaluate and improve the robustness of machine learning models.
We propose a framework of generating adversarial examples for unsupervised models and demonstrate novel applications to data augmentation.
arXiv Detail & Related papers (2021-03-02T17:47:58Z)
- Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model's robustness against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
- Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective [1.933681537640272]
Adversarial examples are images containing subtle perturbations generated by malicious optimization algorithms.
Deep Learning algorithms have been used in security-critical applications, such as biometric recognition systems and self-driving cars.
arXiv Detail & Related papers (2020-09-08T13:21:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.