The Feasibility and Inevitability of Stealth Attacks
- URL: http://arxiv.org/abs/2106.13997v1
- Date: Sat, 26 Jun 2021 10:50:07 GMT
- Title: The Feasibility and Inevitability of Stealth Attacks
- Authors: Ivan Y. Tyukin, Desmond J. Higham, Eliyas Woldegeorgis, Alexander N.
Gorban
- Abstract summary: We study new adversarial perturbations that enable an attacker to gain control over decisions in generic Artificial Intelligence systems.
In contrast to adversarial data modification, the attack mechanism we consider here involves alterations to the AI system itself.
- Score: 63.14766152741211
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop and study new adversarial perturbations that enable an attacker to
gain control over decisions in generic Artificial Intelligence (AI) systems
including deep learning neural networks. In contrast to adversarial data
modification, the attack mechanism we consider here involves alterations to the
AI system itself. Such a stealth attack could be conducted by a mischievous,
corrupt or disgruntled member of a software development team. It could also be
made by those wishing to exploit a "democratization of AI" agenda, where
network architectures and trained parameter sets are shared publicly. Building
on work by [Tyukin et al., International Joint Conference on Neural Networks,
2020], we develop a range of new implementable attack strategies with
accompanying analysis, showing that with high probability a stealth attack can
be made transparent, in the sense that system performance is unchanged on a
fixed validation set which is unknown to the attacker, while evoking any
desired output on a trigger input of interest. The attacker only needs to have
estimates of the size of the validation set and the spread of the AI's relevant
latent space. In the case of deep learning neural networks, we show that a one
neuron attack is possible - a modification to the weights and bias associated
with a single neuron - revealing a vulnerability arising from
over-parameterization. We illustrate these concepts in a realistic setting.
Guided by the theory and computational results, we also propose strategies to
guard against stealth attacks.
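To make the one-neuron attack concrete, the sketch below illustrates the idea on a toy two-layer ReLU network: the incoming weights and bias of a single hidden unit are redirected so that the unit stays silent on ordinary inputs (its threshold chosen from a rough estimate of the spread of the latent space, mirroring the abstract's assumption) but fires strongly on a chosen trigger input, and its outgoing weight routes that activation to the desired class. This is an illustrative reconstruction under assumed values, not the paper's exact construction; the toy network and all names and constants (W1, b1, W2, gain, spread_estimate) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "pretrained" network: x -> ReLU(W1 @ x + b1) -> W2 @ h + b2 -> logits.
d_in, d_hidden, d_out = 20, 50, 3
W1 = rng.normal(scale=0.3, size=(d_hidden, d_in))
b1 = np.zeros(d_hidden)
W2 = rng.normal(scale=0.3, size=(d_out, d_hidden))
b2 = np.zeros(d_out)

# The attacker repurposes a single hidden unit k. To keep "unchanged on clean
# data" exact in this toy, unit k starts out inactive, standing in for the
# redundant capacity an over-parameterized network tends to provide.
k = 0
b1[k] = -1e3

def forward(x):
    h = np.maximum(W1 @ x + b1, 0.0)   # hidden (latent) activations
    return W2 @ h + b2                 # class logits

trigger = rng.normal(size=d_in)        # attacker's chosen trigger input
target_class = 2                       # desired decision on the trigger

# Point unit k's incoming weights along the trigger and set its bias from a
# rough estimate of how far ordinary inputs spread along that direction, so
# the unit stays silent on typical data but fires strongly on the trigger.
direction = trigger / np.linalg.norm(trigger)
spread_estimate = 3.0                  # assumed bound on |direction . x| for clean inputs
gain = 10.0
W1[k, :] = gain * direction
b1[k] = -gain * spread_estimate

# Route the unit's (now trigger-specific) activation to the target logit.
W2[:, k] = 0.0
W2[target_class, k] = 5.0

print("decision on trigger:", int(np.argmax(forward(trigger))))   # -> 2
clean = rng.normal(size=(10, d_in))
print("edited unit fires on clean inputs:",
      bool(np.any(clean @ W1[k] + b1[k] > 0)))                    # False with high probability
```

In an over-parameterized network many hidden units contribute little to clean predictions; that redundancy is exactly the slack a single-neuron edit of this kind can exploit, which is the vulnerability the abstract attributes to over-parameterization.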
Related papers
- Revealing Vulnerabilities of Neural Networks in Parameter Learning and Defense Against Explanation-Aware Backdoors [2.1165011830664673]
Blinding attacks can drastically alter a machine learning algorithm's prediction and explanation.
We leverage statistical analysis to highlight the changes in a CNN's weights following blinding attacks.
We introduce a method specifically designed to limit the effectiveness of such attacks during the evaluation phase.
arXiv Detail & Related papers (2024-03-25T09:36:10Z)
- Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the Age of AI-NIDS [70.60975663021952]
We study blackbox adversarial attacks on network classifiers.
We argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions.
We show that a continual learning approach is required to study attacker-defender dynamics.
arXiv Detail & Related papers (2021-11-23T23:42:16Z)
- TnT Attacks! Universal Naturalistic Adversarial Patches Against Deep Neural Network Systems [15.982408142401072]
Deep neural networks are vulnerable to attacks from adversarial inputs and, more recently, to Trojans that misguide or hijack the model's decision.
A TnT is universal because any input image captured with a TnT in the scene will either: i) misguide a network (untargeted attack); or ii) force the network to make a malicious decision (targeted attack).
We show a generalization of the attack to create patches achieving higher attack success rates than existing state-of-the-art methods.
arXiv Detail & Related papers (2021-11-19T01:35:10Z)
- Automating Privilege Escalation with Deep Reinforcement Learning [71.87228372303453]
In this work, we exemplify the potential threat of malicious actors using deep reinforcement learning to train automated agents.
We present an agent that uses a state-of-the-art reinforcement learning algorithm to perform local privilege escalation.
Our agent is usable for generating realistic attack sensor data for training and evaluating intrusion detection systems.
arXiv Detail & Related papers (2021-10-04T12:20:46Z)
- Adversarial Attack Attribution: Discovering Attributable Signals in Adversarial ML Attacks [0.7883722807601676]
Even production systems, such as self-driving cars and ML-as-a-service offerings, are susceptible to adversarial inputs.
Can perturbed inputs be attributed to the methods used to generate the attack?
We introduce the concept of adversarial attack attribution and create a simple supervised learning experimental framework to examine the feasibility of discovering attributable signals in adversarial attacks.
arXiv Detail & Related papers (2021-01-08T08:16:41Z)
- An Empirical Review of Adversarial Defenses [0.913755431537592]
Deep neural networks, which form the basis of such systems, are highly susceptible to a specific class of attacks called adversarial attacks.
Even with minimal computation, a hacker can generate adversarial examples (images or data points from another class that consistently fool the model into classifying them as genuine) and undermine the reliability of such algorithms.
We present two effective techniques, namely Dropout and Denoising Autoencoders, and show their success in preventing such attacks from fooling the model.
arXiv Detail & Related papers (2020-12-10T09:34:41Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
- On Adversarial Examples and Stealth Attacks in Artificial Intelligence Systems [62.997667081978825]
We present a formal framework for assessing and analyzing two classes of malevolent action towards generic Artificial Intelligence (AI) systems.
The first class involves adversarial examples and concerns the introduction of small perturbations of the input data that cause misclassification.
The second class, introduced here for the first time and named stealth attacks, involves small perturbations to the AI system itself.
arXiv Detail & Related papers (2020-04-09T10:56:53Z)
- Adversarial vs behavioural-based defensive AI with joint, continual and active learning: automated evaluation of robustness to deception, poisoning and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to behavioural analysis (User and Entity Behaviour Analytics, UEBA) for cyber-security.
In this paper, we present a solution to effectively mitigate such attacks by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)