TrojanForge: Generating Adversarial Hardware Trojan Examples with Reinforcement Learning
- URL: http://arxiv.org/abs/2405.15184v2
- Date: Mon, 21 Oct 2024 01:09:29 GMT
- Authors: Amin Sarihi, Peter Jamieson, Ahmad Patooghy, Abdel-Hameed A. Badawy
- Abstract summary: The Hardware Trojan (HT) problem can be thought of as a continuous game between attackers and defenders.
Machine Learning has recently played a key role in advancing HT research.
TrojanForge generates adversarial examples that defeat HT detectors.
- Abstract: The Hardware Trojan (HT) problem can be thought of as a continuous game between attackers and defenders, each striving to outsmart the other by leveraging any available means for an advantage. Machine Learning (ML) has recently played a key role in advancing HT research. Novel techniques such as Reinforcement Learning (RL) and Graph Neural Networks (GNNs) have demonstrated capabilities in both HT insertion and HT detection. HT insertion with ML techniques, specifically, has seen a spike in research activity due to the shortcomings of conventional HT benchmarks and the inherent human design bias that occurs when we create them. This work continues this innovation by presenting a tool called TrojanForge, capable of generating HT adversarial examples that defeat HT detectors, demonstrating the capabilities of GAN-like adversarial tools for automatic HT insertion. We introduce an RL environment where the insertion agent interacts with HT detectors in an insertion-detection loop, collecting rewards based on its success in bypassing them. Our results show that this process helps inserted HTs evade various HT detectors, achieving high attack success rates. This tool also provides insight into why HT insertion fails in some instances and how we can leverage this knowledge in defense.
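To make the insertion-detection loop concrete, below is a minimal sketch of how such a GAN-like reward loop could be wired together. Everything in it is illustrative: the toy netlist environment, the stub detector, and the tabular epsilon-greedy agent are hypothetical stand-ins, not TrojanForge's actual components.

```python
import random

# Illustrative sketch only: environment, detector, and agent are
# hypothetical stand-ins, not TrojanForge's actual components.

class InsertionEnv:
    """Toy netlist environment: the agent picks nets to form an HT trigger."""
    def __init__(self, candidate_nets):
        self.candidate_nets = candidate_nets
        self.trigger = []

    def reset(self):
        self.trigger = []
        return tuple(self.trigger)

    def step(self, net):
        self.trigger.append(net)
        done = len(self.trigger) >= 4  # fixed trigger width for this sketch
        return tuple(self.trigger), done

def detector(trigger):
    """Stub black-box HT detector: flags triggers built on 'rare' nets."""
    return any(net.startswith("rare") for net in trigger)

env = InsertionEnv([f"net{i}" for i in range(20)] + ["rare0", "rare1"])
q = {}  # tabular action values keyed by (partial trigger, next net)
for episode in range(500):
    state, done = env.reset(), False
    while not done:
        if random.random() < 0.1:  # epsilon-greedy exploration
            action = random.choice(env.candidate_nets)
        else:
            action = max(env.candidate_nets, key=lambda a: q.get((state, a), 0.0))
        state, done = env.step(action)
    # GAN-like adversarial reward: positive only if the HT evades detection.
    reward = 0.0 if detector(env.trigger) else 1.0
    for i, net in enumerate(env.trigger):
        key = (tuple(env.trigger[:i]), net)
        q[key] = q.get(key, 0.0) + 0.5 * (reward - q.get(key, 0.0))
```

In this toy setup the detector penalizes triggers built on conspicuous "rare" nets, so the agent learns trigger placements that evade it; the real tool plays the same game against actual HT detectors on gate-level netlists.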
Related papers
- SENTAUR: Security EnhaNced Trojan Assessment Using LLMs Against Undesirable Revisions [17.21926121783922]
A Hardware Trojan (HT) can introduce stealthy behavior, prevent an IC from working as intended, or leak sensitive data via side channels.
To counter HTs, the ability to rapidly examine HT scenarios is a key requirement.
We propose a large language model (LLM) framework to generate a suite of legitimate HTs for a Register Transfer Level (RTL) design.
arXiv Detail & Related papers (2024-07-17T07:13:06Z)
- Trojan Playground: A Reinforcement Learning Framework for Hardware Trojan Insertion and Detection [0.0]
Current Hardware Trojan (HT) detection techniques are mostly developed based on a limited set of HT benchmarks.
We introduce the first automated Reinforcement Learning (RL) HT insertion and detection framework to address these shortcomings.
arXiv Detail & Related papers (2023-05-16T16:42:07Z)
- Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
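A minimal sketch of what suppressing a shortcut can look like, assuming a generic residual block and a standard ASR measurement; neither is the paper's exact model or evaluation protocol.

```python
import torch
import torch.nn as nn

# Sketch: scale the skip (shortcut) path of a residual block by gamma < 1
# and re-measure the backdoor attack success rate (ASR). Generic block and
# metric, not the paper's specific architecture or setup.

class ScaledResidualBlock(nn.Module):
    def __init__(self, channels, gamma=1.0):
        super().__init__()
        self.gamma = gamma  # shortcut scale; gamma=1.0 leaves the block unmodified
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # Suppressing the skip path weakens the shortcut that backdoor
        # features may ride on; the paper reports this lowers ASR sharply.
        return torch.relu(self.gamma * x + self.body(x))

def attack_success_rate(model, triggered_loader, target_label):
    """Fraction of trigger-stamped inputs classified as the attacker's target."""
    model.eval()
    hits = total = 0
    with torch.no_grad():
        for x, _ in triggered_loader:
            hits += (model(x).argmax(dim=1) == target_label).sum().item()
            total += x.size(0)
    return hits / max(total, 1)
```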
arXiv Detail & Related papers (2022-11-02T15:39:19Z)
- ATTRITION: Attacking Static Hardware Trojan Detection Techniques Using Reinforcement Learning [6.87143729255904]
We develop an automated, scalable, and practical attack framework, ATTRITION, using reinforcement learning (RL).
ATTRITION evades eight detection techniques across two HT detection categories, showcasing its detector-agnostic behavior.
We demonstrate ATTRITION's ability to evade detection techniques by evaluating designs ranging from the widely-used academic suites to larger designs such as the open-source MIPS and mor1kx processors to AES and a GPS module.
arXiv Detail & Related papers (2022-08-26T23:47:47Z)
- DETERRENT: Detecting Trojans using Reinforcement Learning [8.9149615294509]
Hardware Trojans (HTs) are a pernicious threat to integrated circuits.
In this work, we design a reinforcement learning (RL) agent that circumvents the exponential search space and returns a minimal set of patterns that is most likely to detect HTs.
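The selection objective can be illustrated with a greedy set-cover stand-in for the paper's RL agent: at each step, pick the test pattern that newly covers the most suspicious nets. All nets, patterns, and coverage data below are synthetic.

```python
import random

# Greedy proxy for pattern-set minimization: choose few test patterns that
# together activate as many rare (suspicious) nets as possible. This stands
# in for the paper's RL agent; the coverage data is synthetic.

random.seed(0)
NUM_NETS, NUM_PATTERNS = 40, 200
# coverage[p] = set of rare nets that test pattern p toggles
coverage = {p: {random.randrange(NUM_NETS) for _ in range(random.randint(1, 5))}
            for p in range(NUM_PATTERNS)}

def minimal_pattern_set(coverage, num_nets):
    remaining, chosen = set(range(num_nets)), []
    while remaining:
        best = max(coverage, key=lambda p: len(coverage[p] & remaining))
        gained = coverage[best] & remaining
        if not gained:  # some nets are not coverable by any pattern
            break
        chosen.append(best)
        remaining -= gained
    return chosen, remaining

patterns, uncovered = minimal_pattern_set(coverage, NUM_NETS)
print(f"{len(patterns)} patterns cover {NUM_NETS - len(uncovered)}/{NUM_NETS} rare nets")
```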
arXiv Detail & Related papers (2022-08-26T22:09:47Z)
- Attention Hijacking in Trojan Transformers [68.04317938014067]
Trojan attacks pose a severe threat to AI systems.
Transformer models have recently gained explosive popularity.
Can we reveal the Trojans through attention mechanisms in BERTs and ViTs?
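One way to probe that question is to compare how much attention mass falls on trigger positions for clean versus triggered inputs. The sketch below does this with a single randomly initialized attention layer as a stand-in for a full BERT or ViT; in practice one would hook the attention weights of every layer of the Trojaned model.

```python
import torch
import torch.nn as nn

# Probe sketch: do attention heads shift focus toward trigger tokens?
# A single attention layer stands in for a full Transformer here.

torch.manual_seed(0)
attn = nn.MultiheadAttention(embed_dim=16, num_heads=4, batch_first=True)
seq = torch.randn(1, 10, 16)      # a "clean" token sequence
triggered = seq.clone()
trigger_pos = 3
triggered[0, trigger_pos] += 5.0  # crude stand-in for a trigger token

for name, x in [("clean", seq), ("triggered", triggered)]:
    _, weights = attn(x, x, x, need_weights=True)  # (batch, tgt, src)
    mass = weights[0, :, trigger_pos].mean().item()
    print(f"{name}: mean attention mass on position {trigger_pos} = {mass:.3f}")
```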
arXiv Detail & Related papers (2022-08-09T04:05:04Z)
- Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free [126.15842954405929]
Trojan attacks threaten deep neural networks (DNNs) by poisoning them to behave normally on most samples, yet produce manipulated results for inputs stamped with a trigger.
We propose a novel Trojan network detection regime: first locating a "winning Trojan lottery ticket" which preserves nearly full Trojan information yet only chance-level performance on clean inputs; then recovering the trigger embedded in this already isolated subnetwork.
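The second step, recovering the trigger from the isolated subnetwork, is commonly posed as a mask-and-pattern optimization. A hedged sketch under assumed 32x32 RGB inputs and illustrative hyperparameters, not the paper's exact procedure:

```python
import torch
import torch.nn as nn

# Sketch of trigger recovery on an isolated Trojan subnetwork: optimize a
# small mask-and-pattern pair so patched clean inputs flip to the target
# label. Shapes and hyperparameters are illustrative assumptions.

def recover_trigger(subnet, clean_batch, target_label, steps=200, lam=1e-3):
    mask = torch.zeros(1, 1, 32, 32, requires_grad=True)     # where the trigger sits
    pattern = torch.zeros(1, 3, 32, 32, requires_grad=True)  # what the trigger looks like
    opt = torch.optim.Adam([mask, pattern], lr=0.1)
    for _ in range(steps):
        m = torch.sigmoid(mask)
        patched = (1 - m) * clean_batch + m * torch.tanh(pattern)
        logits = subnet(patched)
        target = torch.full((clean_batch.size(0),), target_label, dtype=torch.long)
        # Flip to the target label while keeping the trigger mask sparse.
        loss = nn.functional.cross_entropy(logits, target) + lam * m.abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask).detach(), torch.tanh(pattern).detach()
```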
arXiv Detail & Related papers (2022-05-24T06:33:31Z)
- Hardware Trojan Insertion Using Reinforcement Learning [0.0]
This paper utilizes Reinforcement Learning (RL) as a means to automate the Hardware Trojan (HT) insertion process.
An RL agent explores the design space and finds circuit locations that are best for keeping inserted HTs hidden.
Our toolset can insert combinational HTs into the ISCAS-85 benchmark suite with variations in HT size and triggering conditions.
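A common proxy for "locations that keep HTs hidden" in this line of work is signal rarity: nets that seldom evaluate to a given value under random stimuli make stealthy trigger inputs, since the trigger almost never fires during testing. A toy illustration on a hypothetical four-gate netlist (not an ISCAS-85 circuit):

```python
import random

# Estimate signal probabilities by random simulation; low-probability nets
# are candidate trigger inputs for a stealthy HT. Netlist is illustrative.

random.seed(1)
GATES = {  # net -> (op, input nets); primary inputs are a, b, c, d
    "n1": ("AND", ["a", "b"]),
    "n2": ("AND", ["c", "d"]),
    "n3": ("AND", ["n1", "n2"]),  # n3=1 only when a=b=c=d=1: a rare net
    "out": ("OR", ["n3", "a"]),
}

def simulate(inputs):
    values = dict(inputs)
    for net, (op, ins) in GATES.items():  # insertion order is topological here
        bits = [values[i] for i in ins]
        values[net] = all(bits) if op == "AND" else any(bits)
    return values

TRIALS = 10_000
counts = {net: 0 for net in GATES}
for _ in range(TRIALS):
    values = simulate({p: random.random() < 0.5 for p in "abcd"})
    for net in GATES:
        counts[net] += values[net]

for net, c in sorted(counts.items(), key=lambda kv: kv[1]):
    print(f"{net}: P(1) ~= {c / TRIALS:.3f}")
```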
arXiv Detail & Related papers (2022-04-09T01:50:03Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
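The paper derives its adversarial loss from formal robustness bounds; as a simpler stand-in for the general idea of training against perturbed observations, the sketch below adds an FGSM-style term to a Q-learning loss. Function names and hyperparameters are illustrative, not RADIAL-RL's actual loss.

```python
import torch
import torch.nn as nn

# Generic sketch: penalize a Q-network's sensitivity to small adversarial
# perturbations of its observations. Stand-in, not the paper's method.

def adversarial_q_loss(q_net, obs, actions, td_targets, eps=0.01):
    obs = obs.clone().requires_grad_(True)
    q = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    clean_loss = nn.functional.mse_loss(q, td_targets)
    # FGSM step: nudge observations in the direction that worsens the loss.
    grad, = torch.autograd.grad(clean_loss, obs, retain_graph=True)
    adv_obs = (obs + eps * grad.sign()).detach()
    q_adv = q_net(adv_obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    adv_loss = nn.functional.mse_loss(q_adv, td_targets)
    return clean_loss + adv_loss
```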
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
- Odyssey: Creation, Analysis and Detection of Trojan Models [91.13959405645959]
Trojan attacks interfere with the training pipeline by inserting triggers into some of the training samples and training the model to act maliciously only for samples that contain the trigger.
Existing Trojan detectors make strong assumptions about the types of triggers and attacks.
We propose a detector based on the analysis of intrinsic properties of the model that are affected by the Trojaning process.
arXiv Detail & Related papers (2020-07-16T06:55:00Z)
- Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering based approach whose computational complexity does not scale with the number of labels, and is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, an improvement over the current state-of-the-art method.
arXiv Detail & Related papers (2020-06-10T04:12:53Z)