Evil from Within: Machine Learning Backdoors through Hardware Trojans
- URL: http://arxiv.org/abs/2304.08411v2
- Date: Tue, 18 Apr 2023 07:25:23 GMT
- Title: Evil from Within: Machine Learning Backdoors through Hardware Trojans
- Authors: Alexander Warnecke, Julian Speith, Jan-Niklas Möller, Konrad Rieck, Christof Paar
- Abstract summary: Backdoors pose a serious threat to machine learning, as they can compromise the integrity of security-critical systems, such as self-driving cars.
We introduce a backdoor attack that completely resides within a common hardware accelerator for machine learning.
We demonstrate the practical feasibility of our attack by implanting our hardware trojan into the Xilinx Vitis AI DPU.
- Score: 72.99519529521919
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Backdoors pose a serious threat to machine learning, as they can compromise
the integrity of security-critical systems, such as self-driving cars. While
different defenses have been proposed to address this threat, they all rely on
the assumption that the hardware on which the learning models are executed
during inference is trusted. In this paper, we challenge this assumption and
introduce a backdoor attack that completely resides within a common hardware
accelerator for machine learning. Outside of the accelerator, neither the
learning model nor the software is manipulated, so that current defenses fail.
To make this attack practical, we overcome two challenges: First, as memory on
a hardware accelerator is severely limited, we introduce the concept of a
minimal backdoor that deviates as little as possible from the original model
and is activated by replacing a few model parameters only. Second, we develop a
configurable hardware trojan that can be provisioned with the backdoor and
performs a replacement only when the specific target model is processed. We
demonstrate the practical feasibility of our attack by implanting our hardware
trojan into the Xilinx Vitis AI DPU, a commercial machine-learning accelerator.
We configure the trojan with a minimal backdoor for a traffic-sign recognition
system. The backdoor replaces only 30 (0.069%) model parameters, yet it
reliably manipulates the recognition once the input contains a backdoor
trigger. Our attack expands the hardware circuit of the accelerator by 0.24%
and induces no run-time overhead, rendering detection hardly possible. Given
the complex and highly distributed manufacturing process of current hardware,
our work points to a new threat in machine learning that is inaccessible to
current security mechanisms and calls for hardware to be manufactured only in
fully trusted environments.
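To make the two ideas concrete, the sketch below gives a software analogue of a minimal backdoor provisioned into a trojan: a handful of (index, value) parameter patches that are applied only when a fingerprint of the loaded model matches the intended target, so every other model passes through unmodified. This is an illustration under assumptions of our own, not the authors' implementation or the DPU internals; the hash-based fingerprint, the function names, the patch values, and the model size are all placeholders, and the real attack performs the replacement inside the accelerator hardware rather than in software.

```python
# Illustrative sketch only: hypothetical names, hash-based fingerprint chosen for
# illustration. The paper's attack applies the patches inside the accelerator.
import hashlib
import numpy as np

def fingerprint(weights: np.ndarray) -> str:
    """Identify the target model, e.g. by hashing its weight buffer."""
    return hashlib.sha256(weights.tobytes()).hexdigest()

def apply_minimal_backdoor(weights: np.ndarray,
                           patches: list[tuple[int, float]],
                           target_hash: str) -> np.ndarray:
    """Replace a few parameters, but only if the loaded model is the target."""
    if fingerprint(weights) != target_hash:
        return weights                       # any other model passes through untouched
    patched = weights.copy()
    for idx, value in patches:               # e.g. ~30 patched weights in total
        patched.flat[idx] = value
    return patched

# Toy usage: a random "model" and a 30-entry patch set standing in for a
# minimal backdoor derived offline by the attacker.
rng = np.random.default_rng(0)
model = rng.standard_normal(43_000).astype(np.float32)
patch_set = [(int(i), 0.5) for i in rng.choice(model.size, size=30, replace=False)]
backdoored = apply_minimal_backdoor(model, patch_set, fingerprint(model))
print(np.count_nonzero(backdoored != model))  # reports how many weights changed (30 here)
```

The gating on a model fingerprint mirrors the abstract's point that the trojan fires only when the specific target model is processed, which is what keeps the circuit small and the behavior invisible on all other workloads.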
Related papers
- TrojFM: Resource-efficient Backdoor Attacks against Very Large Foundation Models [69.37990698561299]
TrojFM is a novel backdoor attack tailored for very large foundation models.
Our approach injects backdoors by fine-tuning only a very small proportion of model parameters.
We demonstrate that TrojFM can launch effective backdoor attacks against widely used large GPT-style models.
arXiv Detail & Related papers (2024-05-27T03:10:57Z)
- Towards Practical Fabrication Stage Attacks Using Interrupt-Resilient Hardware Trojans [4.549209593575401]
We introduce a new class of hardware trojans called interrupt-resilient trojans (IRTs).
IRTs can successfully address the problem of non-deterministic triggering in CPUs.
We show that our design allows for seamless integration during fabrication stage attacks.
arXiv Detail & Related papers (2024-03-15T19:55:23Z)
- Fine-Tuning Is All You Need to Mitigate Backdoor Attacks [10.88508085229675]
We show that fine-tuning can effectively remove backdoors from machine learning models while maintaining high model utility.
We coin a new term, namely backdoor sequela, to measure the changes in model vulnerabilities to other attacks before and after the backdoor has been removed.
arXiv Detail & Related papers (2022-12-18T11:30:59Z)
- ImpNet: Imperceptible and blackbox-undetectable backdoors in compiled neural networks [18.337267366258818]
We show that backdoors can be added during compilation, circumventing safeguards in the data preparation and model training stages.
The attacker can insert not only existing weight-based backdoors during compilation, but also a new class of weight-independent backdoors, such as ImpNet.
Some backdoors, including ImpNet, can only be reliably detected at the stage where they are inserted and removing them anywhere else presents a significant challenge.
arXiv Detail & Related papers (2022-09-30T21:59:24Z)
- Architectural Backdoors in Neural Networks [27.315196801989032]
We introduce a new class of backdoor attacks that hide inside model architectures.
These backdoors are simple to implement, for instance by publishing open-source code for a backdoored model architecture.
We demonstrate that model architectural backdoors represent a real threat and, unlike other approaches, can survive a complete re-training from scratch.
arXiv Detail & Related papers (2022-06-15T22:44:03Z)
- Neurotoxin: Durable Backdoors in Federated Learning [73.82725064553827]
Federated learning systems have an inherent vulnerability to adversarial backdoor attacks during training.
We propose Neurotoxin, a simple one-line modification to existing backdoor attacks that acts by attacking parameters that are changed less in magnitude during training (a sketch of this selection rule follows after this list).
arXiv Detail & Related papers (2022-06-12T16:52:52Z)
- Trojan Horse Training for Breaking Defenses against Backdoor Attacks in Deep Learning [7.3007220721129364]
ML models that contain a backdoor are called Trojan models.
Current single-target backdoor attacks require one trigger per target class.
We introduce a new, more general attack that will enable a single trigger to result in misclassification to more than one target class.
arXiv Detail & Related papers (2022-03-25T02:54:27Z)
- Few-Shot Backdoor Attacks on Visual Object Tracking [80.13936562708426]
Visual object tracking (VOT) has been widely adopted in mission-critical applications, such as autonomous driving and intelligent surveillance systems.
We show that an adversary can easily implant hidden backdoors into VOT models by tampering with the training process.
We show that our attack is resistant to potential defenses, highlighting the vulnerability of VOT models to potential backdoor attacks.
arXiv Detail & Related papers (2022-01-31T12:38:58Z)
- Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks.
We also show two possible defences that succeed against frequency-based backdoor attacks and possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z)
- Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering based approach whose computational complexity does not scale with the number of labels, and is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, which is an improvement over the current state-of-the-art method.
arXiv Detail & Related papers (2020-06-10T04:12:53Z)
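As a concrete reading of the Neurotoxin entry above, the following sketch restricts a malicious update to the coordinates whose benign updates are smallest in magnitude, i.e. the parameters that change least during ordinary training and are therefore less likely to be overwritten. It is an illustrative assumption rather than the paper's code; the top-k fraction, the function name, and the toy vectors are placeholders.

```python
# Illustrative sketch of the "attack the rarely updated parameters" idea:
# zero the attacker's update on the most heavily updated benign coordinates.
import numpy as np

def mask_to_rarely_updated(malicious_update: np.ndarray,
                           benign_update: np.ndarray,
                           top_k_fraction: float = 0.1) -> np.ndarray:
    """Keep the malicious update only where benign updates are small in magnitude."""
    k = int(top_k_fraction * benign_update.size)
    top_k = np.argsort(np.abs(benign_update))[-k:]   # heavily updated coordinates
    masked = malicious_update.copy()
    masked[top_k] = 0.0                              # restrict the attack to the rest
    return masked

# Toy usage with random vectors standing in for flattened model updates.
rng = np.random.default_rng(1)
benign = rng.standard_normal(1_000)
malicious = rng.standard_normal(1_000)
print(np.count_nonzero(mask_to_rarely_updated(malicious, benign)))  # ~900 entries survive
```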
This list is automatically generated from the titles and abstracts of the papers on this site.