DETERRENT: Detecting Trojans using Reinforcement Learning
- URL: http://arxiv.org/abs/2208.12878v1
- Date: Fri, 26 Aug 2022 22:09:47 GMT
- Title: DETERRENT: Detecting Trojans using Reinforcement Learning
- Authors: Vasudev Gohil, Satwik Patnaik, Hao Guo, Dileep Kalathil, Jeyavijayan (JV) Rajendran
- Abstract summary: Hardware Trojans (HTs) are a pernicious threat to integrated circuits.
In this work, we design a reinforcement learning (RL) agent that circumvents the exponential search space and returns a minimal set of patterns that is most likely to detect HTs.
- Score: 8.9149615294509
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Insertion of hardware Trojans (HTs) in integrated circuits is a pernicious
threat. Since HTs are activated under rare trigger conditions, detecting them
using random logic simulations is infeasible. In this work, we design a
reinforcement learning (RL) agent that circumvents the exponential search space
and returns a minimal set of patterns that is most likely to detect HTs.
Experimental results on a variety of benchmarks demonstrate the efficacy and
scalability of our RL agent, which obtains a significant reduction
($169\times$) in the number of test patterns required while maintaining or
improving coverage ($95.75\%$) compared to the state-of-the-art techniques.
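The paper's agent and benchmarks are not reproduced in this listing. As a rough, self-contained sketch of the objective it optimizes, the toy Python below selects a small set of test patterns that maximize coverage of rare nodes (stand-ins for suspected HT trigger conditions) using a greedy marginal-reward heuristic. The circuit model, reward, and all names here are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a toy version of the test-pattern selection
# objective (cover rare trigger conditions with few patterns). The real
# DETERRENT RL agent, state encoding, and reward are not reproduced here.
import random

random.seed(0)
N_INPUTS = 16          # toy circuit with 16 primary inputs
N_RARE = 40            # hypothetical rare nodes (candidate HT triggers)
CANDIDATES = 200       # pool of candidate test patterns

# Each rare node fires only when a small random subset of inputs takes
# specific values -- a stand-in for rare trigger conditions.
rare_nodes = []
for _ in range(N_RARE):
    bits = random.sample(range(N_INPUTS), 3)
    vals = [random.randint(0, 1) for _ in bits]
    rare_nodes.append(list(zip(bits, vals)))

def covered(pattern, node):
    """A pattern covers a rare node if it satisfies the node's condition."""
    return all(pattern[b] == v for b, v in node)

pool = [[random.randint(0, 1) for _ in range(N_INPUTS)]
        for _ in range(CANDIDATES)]

# Greedy selection: repeatedly add the candidate pattern whose marginal
# reward (newly covered rare nodes, minus a per-pattern cost) is highest.
PATTERN_COST = 0.5
selected, hit = [], set()
while True:
    best, best_gain = None, 0.0
    for p in pool:
        gain = sum(1 for i, n in enumerate(rare_nodes)
                   if i not in hit and covered(p, n)) - PATTERN_COST
        if gain > best_gain:
            best, best_gain = p, gain
    if best is None:
        break
    selected.append(best)
    hit |= {i for i, n in enumerate(rare_nodes) if covered(best, n)}

print(f"{len(selected)} patterns cover {len(hit)}/{N_RARE} rare nodes")
```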
Related papers
- T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models [70.03122709795122]
We propose a comprehensive defense method named T2IShield to detect, localize, and mitigate backdoor attacks.
We find the "Assimilation Phenomenon" on the cross-attention maps caused by the backdoor trigger.
For backdoor sample detection, T2IShield achieves a detection F1 score of 88.9% with low computational cost.
arXiv Detail & Related papers (2024-07-05T01:53:21Z)
- TrojanForge: Generating Adversarial Hardware Trojan Examples with Reinforcement Learning [0.0]
The Hardware Trojan problem can be thought of as a continuous game between attackers and defenders.
Machine Learning has recently played a key role in advancing HT research.
TrojanForge generates adversarial examples that defeat HT detectors.
arXiv Detail & Related papers (2024-05-24T03:37:32Z)
- Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal.
Unlike existing methods that design a backdoor for the input/output space of diffusion models, our method embeds the backdoor into the feature space of sampled subpaths.
Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z)
- The Seeker's Dilemma: Realistic Formulation and Benchmarking for Hardware Trojan Detection [0.0]
This work focuses on advancing security research in the hardware design space by formally defining the realistic problem of Hardware Trojan (HT) detection.
The goal is to model HT detection more closely to the real world, i.e., to describe the problem as "The Seeker's Dilemma".
We create a benchmark that consists of a mixture of HT-free and HT-infected restructured circuits.
arXiv Detail & Related papers (2024-02-27T22:14:01Z)
- Trojan Playground: A Reinforcement Learning Framework for Hardware Trojan Insertion and Detection [0.0]
Current Hardware Trojan (HT) detection techniques are mostly developed based on a limited set of HT benchmarks.
We introduce the first automated Reinforcement Learning (RL) HT insertion and detection framework to address these shortcomings.
arXiv Detail & Related papers (2023-05-16T16:42:07Z)
- Multi-criteria Hardware Trojan Detection: A Reinforcement Learning Approach [0.0]
Hardware Trojans (HTs) can severely alter the security and functionality of digital integrated circuits.
This paper proposes a multi-criteria reinforcement learning (RL) HT detection tool that features a tunable reward function for different HT detection scenarios.
Our preliminary results show an average of 84.2% successful HT detection on the ISCAS-85 benchmark.
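The listing gives no code for the tunable reward; the sketch below shows one plausible way a weighted, multi-criteria RL reward could be composed so that retuning the weights adapts the detector to a different scenario. The criteria names and weights are assumptions, not the paper's formulation.

```python
# Generic sketch of a tunable multi-criteria RL reward, in the spirit of
# the weighted reward this summary describes. Criteria names and default
# weights are illustrative assumptions, not the paper's actual reward.
from dataclasses import dataclass

@dataclass
class RewardWeights:
    coverage: float = 1.0      # reward for covering suspected trigger nodes
    switching: float = 0.5     # reward for rare-node switching activity
    pattern_cost: float = 0.1  # penalty per applied test pattern

def reward(newly_covered: int, switch_activity: float,
           n_patterns: int, w: RewardWeights) -> float:
    """Weighted sum; retuning the weights targets a different
    HT-detection scenario without changing the agent itself."""
    return (w.coverage * newly_covered
            + w.switching * switch_activity
            - w.pattern_cost * n_patterns)

# Example: a scenario that prioritizes switching activity over coverage.
print(reward(newly_covered=3, switch_activity=7.2, n_patterns=10,
             w=RewardWeights(coverage=0.2, switching=1.0, pattern_cost=0.05)))
```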
arXiv Detail & Related papers (2023-04-26T01:40:55Z)
- Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
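As a hedged illustration of the mechanism this summary describes, the PyTorch sketch below adds a suppression factor `gamma` to a generic residual block's skip connection; setting `gamma` below 1 reduces the shortcut's output, the kind of intervention under which the paper reports the ASR dropping. The block and factor are illustrative stand-ins, not the paper's models.

```python
# Illustrative sketch: scale down a skip connection's contribution and
# observe how a backdoored model behaves. Generic residual unit only;
# not the architectures studied in the paper.
import torch
import torch.nn as nn

class ScaledResidualBlock(nn.Module):
    def __init__(self, channels: int, gamma: float = 1.0):
        super().__init__()
        self.gamma = gamma  # 1.0 = normal skip; < 1.0 suppresses the shortcut
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Suppressing the shortcut (gamma < 1) forces computation through
        # the residual body instead of the skip path.
        return torch.relu(self.body(x) + self.gamma * x)

block = ScaledResidualBlock(16, gamma=0.5)
print(block(torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 16, 8, 8])
```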
arXiv Detail & Related papers (2022-11-02T15:39:19Z)
- Hardware Trojan Insertion Using Reinforcement Learning [0.0]
This paper utilizes Reinforcement Learning (RL) as a means to automate the Hardware Trojan (HT) insertion process.
An RL agent explores the design space and finds circuit locations that are best for keeping inserted HTs hidden.
Our toolset can insert combinational HTs into the ISCAS-85 benchmark suite with variations in HT size and triggering conditions.
arXiv Detail & Related papers (2022-04-09T01:50:03Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
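RADIAL-RL's actual adversarial loss is not given in this listing; as a minimal sketch of the threat model being defended against, the Python below applies an FGSM-style perturbation to a toy policy's observation under an L-infinity budget. The policy, observation dimensions, and epsilon are all assumptions for illustration.

```python
# Minimal FGSM-style sketch of the attack this line describes: a small
# L-infinity perturbation of the agent's observation that pushes the
# policy away from its original action. This is the threat model, not
# RADIAL-RL's training loss itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

policy = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))
obs = torch.randn(1, 4, requires_grad=True)   # toy 4-dim observation

logits = policy(obs)
action = logits.argmax(dim=1)                 # agent's chosen action

# Ascend the loss of the chosen action to push the policy away from it.
loss = F.cross_entropy(logits, action)
loss.backward()
eps = 0.1                                     # L-infinity perturbation budget
adv_obs = obs + eps * obs.grad.sign()

print("original action:", action.item(),
      "| under attack:", policy(adv_obs).argmax(dim=1).item())
```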
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
- Odyssey: Creation, Analysis and Detection of Trojan Models [91.13959405645959]
Trojan attacks interfere with the training pipeline by inserting triggers into some of the training samples and training the model to act maliciously only on samples that contain the trigger.
Existing Trojan detectors make strong assumptions about the types of triggers and attacks.
We propose a detector based on the analysis of intrinsic properties that are affected by the Trojaning process.
arXiv Detail & Related papers (2020-07-16T06:55:00Z)
- Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering based approach whose computational complexity does not scale with the number of labels, and is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, an improvement over the current state-of-the-art method.
arXiv Detail & Related papers (2020-06-10T04:12:53Z)