Opportunities and Challenges in Deep Learning Adversarial Robustness: A
Survey
- URL: http://arxiv.org/abs/2007.00753v2
- Date: Fri, 3 Jul 2020 20:10:20 GMT
- Title: Opportunities and Challenges in Deep Learning Adversarial Robustness: A
Survey
- Authors: Samuel Henrique Silva and Peyman Najafirad
- Abstract summary: This paper studies strategies for implementing adversarially robust training algorithms aimed at guaranteeing safety in machine learning systems.
We provide a taxonomy to classify adversarial attacks and defenses, formulate the Robust Optimization problem in a min-max setting, and divide it into 3 subcategories, namely: Adversarial (re)Training, Regularization Approach, and Certified Defenses.
- Score: 1.8782750537161614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As we seek to deploy machine learning models beyond virtual and controlled
domains, it is critical to analyze not only their accuracy, or the fact that
they work most of the time, but whether such models are truly robust and
reliable. This paper studies strategies for implementing adversarially robust
training algorithms aimed at guaranteeing safety in machine learning systems.
We provide a taxonomy to classify adversarial attacks and defenses, formulate
the Robust Optimization problem in a min-max setting, and divide it into 3
subcategories, namely: Adversarial (re)Training, Regularization Approach, and
Certified Defenses. We survey the most recent and important results in
adversarial example generation and in defense mechanisms that use adversarial
(re)Training as their main defense against perturbations. We also survey
methods that add regularization terms that change the behavior of the
gradient, making it harder for attackers to achieve their objective.
Alternatively, we survey methods that formally derive certificates of
robustness by exactly solving the optimization problem or by approximations
using upper or lower bounds. In addition, we discuss the challenges faced by
most of the recent algorithms, presenting future research perspectives.
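The min-max formulation referenced above is typically written as
min_θ E_{(x,y)} [ max_{||δ||_∞ ≤ ε} L(f_θ(x + δ), y) ]: an inner attacker maximizes the loss within an ε-ball around each input, and the outer training loop minimizes that worst-case loss. A minimal PyTorch sketch of adversarial (re)training under this formulation (the model, data loader, and hyperparameters are illustrative assumptions, not code from the paper):

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Inner maximization: PGD under an L-infinity budget eps."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()        # ascent step
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)   # project back to eps-ball
        x_adv = x_adv.clamp(0, 1)                           # stay in valid pixel range
    return x_adv.detach()

def adversarial_training_epoch(model, loader, optimizer):
    """Outer minimization: train on the worst-case examples found by PGD."""
    model.train()
    for x, y in loader:
        x_adv = pgd_attack(model, x, y)
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```

The sign step and L-infinity projection follow the common PGD recipe; other threat models swap in different projections.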
Related papers
- Robust Image Classification: Defensive Strategies against FGSM and PGD Adversarial Attacks [0.0]
Adversarial attacks pose significant threats to the robustness of deep learning models in image classification.
This paper explores and refines defense mechanisms against these attacks to enhance the resilience of neural networks.
arXiv Detail & Related papers (2024-08-20T02:00:02Z)
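For context on the entry above, FGSM is a single signed-gradient step within the ε-budget; PGD (sketched earlier) iterates the same step with projection. A minimal sketch, assuming a differentiable classifier and inputs scaled to [0, 1]:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8/255):
    """Fast Gradient Sign Method: one step of size eps along sign(grad)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```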
- Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z)
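The continuous attacks mentioned above perturb token embeddings rather than discrete tokens, which keeps the inner maximization differentiable. A minimal sketch of that idea, assuming a Hugging Face-style causal LM that accepts inputs_embeds and labels (the bound, step size, and step count are illustrative; the paper's C-AdvUL/C-AdvIPO losses are not reproduced here):

```python
import torch

def embedding_space_attack(model, embed, input_ids, labels,
                           eps=0.05, alpha=0.01, steps=5):
    """PGD in embedding space: perturb continuous token embeddings,
    not discrete tokens, so gradients flow end to end."""
    e = embed(input_ids).detach()
    delta = torch.zeros_like(e, requires_grad=True)
    for _ in range(steps):
        loss = model(inputs_embeds=e + delta, labels=labels).loss
        grad = torch.autograd.grad(loss, delta)[0]
        delta = ((delta + alpha * grad.sign())
                 .clamp(-eps, eps).detach().requires_grad_(True))
    return e + delta.detach()
```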
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
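A KL-divergence-regularized reweighting over the probability simplex has a closed-form softmax solution, which is one way to read the weighting step in the entry above. A minimal sketch under that assumption (the temperature tau and the use of per-example adversarial losses are illustrative, not the paper's exact algorithm):

```python
import torch
import torch.nn.functional as F

def instance_weights(per_example_loss, tau=1.0):
    """max_w sum_i w_i * l_i - tau * KL(w || uniform), over the simplex,
    has the closed form w_i ∝ exp(l_i / tau): hard examples get more weight."""
    return F.softmax(per_example_loss.detach() / tau, dim=0)

def reweighted_adv_loss(model, x_adv, y, tau=1.0):
    losses = F.cross_entropy(model(x_adv), y, reduction="none")  # per-example robust loss
    w = instance_weights(losses, tau)
    return (w * losses).sum()
```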
- Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey [114.17568992164303]
Adversarial attacks and defenses in machine learning and deep neural networks have been gaining significant attention.
This survey provides a comprehensive overview of the recent advancements in the field of adversarial attack and defense techniques.
New avenues of attack are also explored, including search-based, decision-based, drop-based, and physical-world attacks.
arXiv Detail & Related papers (2023-03-11T04:19:31Z)
- Probabilistic Categorical Adversarial Attack & Adversarial Training [45.458028977108256]
The existence of adversarial examples raises serious concerns about applying Deep Neural Networks (DNNs) to safety-critical tasks.
How to generate adversarial examples with categorical data is an important problem, but one that lacks extensive exploration.
We propose Probabilistic Categorical Adversarial Attack (PCAA), which transfers the discrete optimization problem to a continuous problem that can be solved efficiently by Projected Gradient Descent.
arXiv Detail & Related papers (2022-10-17T19:04:16Z)
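One way to read the relaxation in the entry above: treat each categorical feature as a probability vector, run gradient ascent on its logits, then map back to discrete categories. A minimal sketch (the softmax parameterization stands in for the paper's simplex projection and is an assumption, as is a model that accepts probability-vector inputs):

```python
import torch
import torch.nn.functional as F

def categorical_attack(model, x_onehot, y, alpha=0.5, steps=20):
    """Relax each categorical feature to a probability vector, run gradient
    ascent on its logits, then map back to discrete categories.
    x_onehot: (batch, n_features, n_categories) one-hot input."""
    logits = torch.log(x_onehot.clamp_min(1e-6)).detach().requires_grad_(True)
    for _ in range(steps):
        probs = F.softmax(logits, dim=-1)            # continuous surrogate input
        loss = F.cross_entropy(model(probs), y)
        grad = torch.autograd.grad(loss, logits)[0]
        logits = (logits + alpha * grad).detach().requires_grad_(True)
    return F.one_hot(F.softmax(logits, dim=-1).argmax(dim=-1),
                     x_onehot.shape[-1]).float()     # back to discrete
```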
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
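A learned attack optimizer of the kind described above replaces a hand-designed update rule such as sign(grad) with a recurrent network that maps gradients to perturbation updates. A minimal sketch (the LSTM architecture, dimensions, and the omitted meta-training loop are illustrative assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedAttackOptimizer(nn.Module):
    """Maps the current gradient to an update step, coordinate-wise, via an
    LSTM cell; meta-trained (outer loop not shown) across defenses."""
    def __init__(self, hidden=20):
        super().__init__()
        self.rnn = nn.LSTMCell(1, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, grad, state):
        g = grad.reshape(-1, 1)                  # treat coordinates independently
        h, c = self.rnn(g, state)
        return self.out(h).reshape(grad.shape), (h, c)

def learned_attack(model, opt_rnn, x, y, eps=8/255, steps=10):
    delta = torch.zeros_like(x, requires_grad=True)
    state = None
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        step, state = opt_rnn(grad, state)       # learned update, not sign(grad)
        delta = (delta + step).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).detach()
```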
- Automated Decision-based Adversarial Attacks [48.01183253407982]
We consider the practical and challenging decision-based black-box adversarial setting.
Under this setting, the attacker can only acquire the final classification labels by querying the target model.
We propose to automatically discover decision-based adversarial attack algorithms.
arXiv Detail & Related papers (2021-05-09T13:15:10Z)
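In the decision-based setting above, the attacker sees only hard labels, so attacks proceed by query-based search rather than gradients. A minimal random-search sketch in the spirit of boundary attacks (the query budget, proposal scale, and starting point are illustrative assumptions):

```python
import torch

def decision_based_attack(query_label, x, y_true, x_start,
                          queries=1000, step=0.1):
    """Start from an already-misclassified point x_start and walk toward x,
    keeping only proposals the target model still misclassifies.
    query_label(x) -> int is the only access to the model (hard labels)."""
    x_adv = x_start.clone()
    for _ in range(queries):
        # propose a point closer to x, with small random exploration
        proposal = (x_adv + step * (x - x_adv)
                    + 0.01 * torch.randn_like(x)).clamp(0, 1)
        if query_label(proposal) != y_true:      # still adversarial: accept
            x_adv = proposal
    return x_adv
```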
- Learning to Learn from Mistakes: Robust Optimization for Adversarial Noise [1.976652238476722]
We train a meta-optimizer which learns to robustly optimize a model using adversarial examples and is able to transfer the knowledge learned to new models.
Experimental results show the meta-optimizer is consistent across different architectures and data sets, suggesting it is possible to automatically patch adversarial vulnerabilities.
arXiv Detail & Related papers (2020-08-12T11:44:01Z)
- Robust Deep Learning as Optimal Control: Insights and Convergence Guarantees [19.28405674700399]
Including adversarial examples during training is a popular defense mechanism against adversarial attacks.
By interpreting the min-max problem as an optimal control problem, it has been shown that one can exploit the compositional structure of neural networks.
We provide the first convergence analysis of this adversarial training algorithm by combining techniques from robust optimal control and inexact methods in optimization.
arXiv Detail & Related papers (2020-05-01T21:26:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.