DETERRENT: Detecting Trojans using Reinforcement Learning
- URL: http://arxiv.org/abs/2208.12878v1
- Date: Fri, 26 Aug 2022 22:09:47 GMT
- Title: DETERRENT: Detecting Trojans using Reinforcement Learning
- Authors: Vasudev Gohil, Satwik Patnaik, Hao Guo, Dileep Kalathil, Jeyavijayan (JV) Rajendran
- Abstract summary: Hardware Trojans (HTs) are a pernicious threat to integrated circuits.
In this work, we design a reinforcement learning (RL) agent that circumvents the exponential search space and returns a minimal set of patterns that is most likely to detect HTs.
- Score: 8.9149615294509
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Insertion of hardware Trojans (HTs) in integrated circuits is a pernicious
threat. Since HTs are activated under rare trigger conditions, detecting them
using random logic simulations is infeasible. In this work, we design a
reinforcement learning (RL) agent that circumvents the exponential search space
and returns a minimal set of patterns that is most likely to detect HTs.
Experimental results on a variety of benchmarks demonstrate the efficacy and
scalability of our RL agent, which obtains a significant reduction
($169\times$) in the number of test patterns required while maintaining or
improving coverage ($95.75\%$) compared to the state-of-the-art techniques.
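The paper's agent and benchmarks are not reproduced in this listing. As a rough, self-contained sketch of the objective it optimizes, the toy Python below selects a small set of test patterns that maximize coverage of rare nodes (stand-ins for suspected HT trigger conditions) using a greedy marginal-reward heuristic. The circuit model, reward, and all names here are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a toy version of the test-pattern selection
# objective (cover rare trigger conditions with few patterns). The real
# DETERRENT RL agent, state encoding, and reward are not reproduced here.
import random

random.seed(0)
N_INPUTS = 16          # toy circuit with 16 primary inputs
N_RARE = 40            # hypothetical rare nodes (candidate HT triggers)
CANDIDATES = 200       # pool of candidate test patterns

# Each rare node fires only when a small random subset of inputs takes
# specific values -- a stand-in for rare trigger conditions.
rare_nodes = []
for _ in range(N_RARE):
    bits = random.sample(range(N_INPUTS), 3)
    vals = [random.randint(0, 1) for _ in bits]
    rare_nodes.append(list(zip(bits, vals)))

def covered(pattern, node):
    """A pattern covers a rare node if it satisfies the node's condition."""
    return all(pattern[b] == v for b, v in node)

pool = [[random.randint(0, 1) for _ in range(N_INPUTS)]
        for _ in range(CANDIDATES)]

# Greedy selection: repeatedly add the candidate pattern whose marginal
# reward (newly covered rare nodes, minus a per-pattern cost) is highest.
PATTERN_COST = 0.5
selected, hit = [], set()
while True:
    best, best_gain = None, 0.0
    for p in pool:
        gain = sum(1 for i, n in enumerate(rare_nodes)
                   if i not in hit and covered(p, n)) - PATTERN_COST
        if gain > best_gain:
            best, best_gain = p, gain
    if best is None:
        break
    selected.append(best)
    hit |= {i for i, n in enumerate(rare_nodes) if covered(best, n)}

print(f"{len(selected)} patterns cover {len(hit)}/{N_RARE} rare nodes")
```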
Related papers
- T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models [70.03122709795122]
We propose a comprehensive defense method named T2IShield to detect, localize, and mitigate backdoor attacks.
We find the "Assimilation Phenomenon" on the cross-attention maps caused by the backdoor trigger.
For backdoor sample detection, T2IShield achieves a detection F1 score of 88.9% with low computational cost.
arXiv Detail & Related papers (2024-07-05T01:53:21Z)
- TrojanForge: Generating Adversarial Hardware Trojan Examples with Reinforcement Learning [0.0]
The Hardware Trojan problem can be thought of as a continuous game between attackers and defenders.
Machine Learning has recently played a key role in advancing HT research.
TrojanForge generates adversarial examples that defeat HT detectors.
arXiv Detail & Related papers (2024-05-24T03:37:32Z)
- Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal.
Unlike existing methods that design a backdoor for the input/output space of diffusion models, our method embeds the backdoor into the feature space of sampled subpaths.
Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z)
- The Seeker's Dilemma: Realistic Formulation and Benchmarking for Hardware Trojan Detection [0.0]
This work focuses on advancing security research in the hardware design space by formally defining the realistic problem of Hardware Trojan (HT) detection.
The goal is to model HT detection more closely to the real world, i.e., to describe the problem as "The Seeker's Dilemma".
We create a benchmark that consists of a mixture of HT-free and HT-infected restructured circuits.
arXiv Detail & Related papers (2024-02-27T22:14:01Z)
- Trojan Playground: A Reinforcement Learning Framework for Hardware Trojan Insertion and Detection [0.0]
Current Hardware Trojan (HT) detection techniques are mostly developed based on a limited set of HT benchmarks.
We introduce the first automated Reinforcement Learning (RL) HT insertion and detection framework to address these shortcomings.
arXiv Detail & Related papers (2023-05-16T16:42:07Z)
- Multi-criteria Hardware Trojan Detection: A Reinforcement Learning Approach [0.0]
Hardware Trojans (HTs) can severely alter the security and functionality of digital integrated circuits.
This paper proposes a multi-criteria reinforcement learning (RL) HT detection tool that features a tunable reward function for different HT detection scenarios.
Our preliminary results show an average of 84.2% successful HT detection on the ISCAS-85 benchmark.
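The listing gives no code for the tunable reward; the sketch below shows one plausible way a weighted, multi-criteria RL reward could be composed so that retuning the weights adapts the detector to a different scenario. The criteria names and weights are assumptions, not the paper's formulation.

```python
# Generic sketch of a tunable multi-criteria RL reward, in the spirit of
# the weighted reward this summary describes. Criteria names and default
# weights are illustrative assumptions, not the paper's actual reward.
from dataclasses import dataclass

@dataclass
class RewardWeights:
    coverage: float = 1.0      # reward for covering suspected trigger nodes
    switching: float = 0.5     # reward for rare-node switching activity
    pattern_cost: float = 0.1  # penalty per applied test pattern

def reward(newly_covered: int, switch_activity: float,
           n_patterns: int, w: RewardWeights) -> float:
    """Weighted sum; retuning the weights targets a different
    HT-detection scenario without changing the agent itself."""
    return (w.coverage * newly_covered
            + w.switching * switch_activity
            - w.pattern_cost * n_patterns)

# Example: a scenario that prioritizes switching activity over coverage.
print(reward(newly_covered=3, switch_activity=7.2, n_patterns=10,
             w=RewardWeights(coverage=0.2, switching=1.0, pattern_cost=0.05)))
```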
arXiv Detail & Related papers (2023-04-26T01:40:55Z)
- Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
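As a hedged illustration of the mechanism this summary describes, the PyTorch sketch below adds a suppression factor `gamma` to a generic residual block's skip connection; setting `gamma` below 1 reduces the shortcut's output, the kind of intervention under which the paper reports the ASR dropping. The block and factor are illustrative stand-ins, not the paper's models.

```python
# Illustrative sketch: scale down a skip connection's contribution and
# observe how a backdoored model behaves. Generic residual unit only;
# not the architectures studied in the paper.
import torch
import torch.nn as nn

class ScaledResidualBlock(nn.Module):
    def __init__(self, channels: int, gamma: float = 1.0):
        super().__init__()
        self.gamma = gamma  # 1.0 = normal skip; < 1.0 suppresses the shortcut
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Suppressing the shortcut (gamma < 1) forces computation through
        # the residual body instead of the skip path.
        return torch.relu(self.body(x) + self.gamma * x)

block = ScaledResidualBlock(16, gamma=0.5)
print(block(torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 16, 8, 8])
```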
arXiv Detail & Related papers (2022-11-02T15:39:19Z)
- Hardware Trojan Insertion Using Reinforcement Learning [0.0]
This paper utilizes Reinforcement Learning (RL) as a means to automate the Hardware Trojan (HT) insertion process.
An RL agent explores the design space and finds circuit locations that are best for keeping inserted HTs hidden.
Our toolset can insert combinational HTs into the ISCAS-85 benchmark suite with variations in HT size and triggering conditions.
arXiv Detail & Related papers (2022-04-09T01:50:03Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
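RADIAL-RL's actual adversarial loss is not given in this listing; as a minimal sketch of the threat model being defended against, the Python below applies an FGSM-style perturbation to a toy policy's observation under an L-infinity budget. The policy, observation dimensions, and epsilon are all assumptions for illustration.

```python
# Minimal FGSM-style sketch of the attack this line describes: a small
# L-infinity perturbation of the agent's observation that pushes the
# policy away from its original action. This is the threat model, not
# RADIAL-RL's training loss itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

policy = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))
obs = torch.randn(1, 4, requires_grad=True)   # toy 4-dim observation

logits = policy(obs)
action = logits.argmax(dim=1)                 # agent's chosen action

# Ascend the loss of the chosen action to push the policy away from it.
loss = F.cross_entropy(logits, action)
loss.backward()
eps = 0.1                                     # L-infinity perturbation budget
adv_obs = obs + eps * obs.grad.sign()

print("original action:", action.item(),
      "| under attack:", policy(adv_obs).argmax(dim=1).item())
```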
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
- Odyssey: Creation, Analysis and Detection of Trojan Models [91.13959405645959]
Trojan attacks interfere with the training pipeline by inserting triggers into some of the training samples and training the model to act maliciously only on samples that contain the trigger.
Existing Trojan detectors make strong assumptions about the types of triggers and attacks.
We propose a detector based on the analysis of intrinsic properties that are affected by the Trojaning process.
arXiv Detail & Related papers (2020-07-16T06:55:00Z)
- Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering based approach whose computational complexity does not scale with the number of labels, and is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, an improvement over the current state-of-the-art method.
arXiv Detail & Related papers (2020-06-10T04:12:53Z)