TrojanGYM: A Detector-in-the-Loop LLM for Adaptive RTL Hardware Trojan Insertion
- URL: http://arxiv.org/abs/2601.17178v1
- Date: Fri, 23 Jan 2026 21:11:44 GMT
- Title: TrojanGYM: A Detector-in-the-Loop LLM for Adaptive RTL Hardware Trojan Insertion
- Authors: Saideep Sreekumar, Zeng Wang, Akashdeep Saha, Weihua Xiao, Minghao Shao, Muhammad Shafique, Ozgur Sinanoglu, Ramesh Karri, Johann Knechtel
- Abstract summary: Hardware Trojans (HTs) remain a critical threat because learning-based detectors often overfit to trigger/payload patterns and small, stylized benchmarks. We introduce TrojanGYM, a framework that automatically curates HT insertions to expose detector blind spots. We also propose Robust-GNN4TJ, a new implementation of GNN4TJ with improved graph extraction, training, and prediction reliability.
- Score: 14.250356355764389
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hardware Trojans (HTs) remain a critical threat because learning-based detectors often overfit to narrow trigger/payload patterns and small, stylized benchmarks. We introduce TrojanGYM, an agentic, LLM-driven framework that automatically curates HT insertions to expose detector blind spots while preserving design correctness. Given high-level HT specifications, a suite of cooperating LLM agents (instantiated with GPT-4, LLaMA-3.3-70B, and Gemini-2.5Pro) proposes and refines RTL modifications that realize diverse triggers and payloads without impacting normal functionality. TrojanGYM implements a feedback-driven benchmark generation loop co-designed with HT detectors, in which constraint-aware syntactic checking and GNN-based HT detectors provide feedback that iteratively refines HT specifications and insertion strategies to better surface detector blind spots. We further propose Robust-GNN4TJ, a new implementation of the GNN4TJ with improved graph extraction, training robustness, and prediction reliability, especially on LLM-generated HT designs. On the most challenging TrojanGYM-generated benchmarks, Robust-GNN4TJ raises HT detection rates from 0% to 60% relative to a prior GNN-based detector. We instantiate TrojanGYM on SRAM, AES-128, and UART designs at RTL level, and show that it systematically produces diverse, functionally correct HTs that reach up to 83.33% evasion rates against modern GNN-based detectors, revealing robustness gaps that are not apparent when these detectors are evaluated solely on existing TrustHub-style benchmarks. Post peer-review, we will release all codes and artifacts.
Related papers
- RADAR: Retrieval-Augmented Detector with Adversarial Refinement for Robust Fake News Detection [50.073924438848316]
We present RADAR, a retrieval-augmented detector with adversarial refinement for robust fake news detection. Our approach employs a generator that rewrites real articles with factual perturbations, paired with a lightweight detector that verifies claims using dense passage retrieval.
arXiv Detail & Related papers (2026-01-07T14:52:15Z)
- Advancing Machine-Generated Text Detection from an Easy to Hard Supervision Perspective [108.30620357325559]
Existing machine-generated text (MGT) detection methods implicitly assume labels as the "golden standard". We propose an easy-to-hard enhancement framework to provide reliable supervision under such inexact conditions.
arXiv Detail & Related papers (2025-11-02T15:59:31Z)
- DetectAnyLLM: Towards Generalizable and Robust Detection of Machine-Generated Text Across Domains and Models [60.713908578319256]
We propose Direct Discrepancy Learning (DDL) to optimize the detector with task-oriented knowledge. Built upon this, we introduce DetectAnyLLM, a unified detection framework that achieves state-of-the-art MGTD performance. MIRAGE samples human-written texts from 10 corpora across 5 text-domains, which are then re-generated or revised using 17 cutting-edge LLMs.
arXiv Detail & Related papers (2025-09-15T10:59:57Z)
- TrojanWhisper: Evaluating Pre-trained LLMs to Detect and Localize Hardware Trojans [0.0]
Existing Hardware Trojan (HT) detection methods face several critical limitations. The emergence of Large Language Models (LLMs) offers a promising new direction for HT detection. This paper explores the potential of general-purpose LLMs in detecting various HTs inserted in Register Transfer Level (RTL) designs.
arXiv Detail & Related papers (2024-12-10T16:16:22Z)
- Unleashing GHOST: An LLM-Powered Framework for Automated Hardware Trojan Design [0.0]
GHOST is an automated attack framework that leverages Large Language Models (LLMs) for rapid HT generation and insertion. The study shows that 100% of GHOST-generated synthesizable HTs evaded detection by an ML-generated HT detection tool.
arXiv Detail & Related papers (2024-12-03T20:33:29Z)
- SENTAUR: Security EnhaNced Trojan Assessment Using LLMs Against Undesirable Revisions [17.21926121783922]
A Hardware Trojan (HT) can introduce stealthy behavior, prevent an IC from working as intended, or leak sensitive data via side channels.
To counter HTs, rapidly examining HT scenarios is a key requirement.
We propose a large language model (LLM) framework to generate a suite of legitimate HTs for a Register Transfer Level (RTL) design.
arXiv Detail & Related papers (2024-07-17T07:13:06Z)
- Evasive Hardware Trojan through Adversarial Power Trace [6.949268510101616]
We introduce an HT obfuscation (HTO) approach to allow HTs to bypass detection methods.
HTO can be implemented with only a single transistor for ASICs and FPGAs.
We show that an adaptive attacker can still design evasive HTOs by constraining the design with a spectral noise budget.
arXiv Detail & Related papers (2024-01-04T16:28:15Z)
- Rank-DETR for High Quality Object Detection [52.82810762221516]
A highly performant object detector requires accurate ranking for the bounding box predictions.
In this work, we introduce a simple and highly performant DETR-based object detector by proposing a series of rank-oriented designs.
arXiv Detail & Related papers (2023-10-13T04:48:32Z)
- Trojan Playground: A Reinforcement Learning Framework for Hardware Trojan Insertion and Detection [0.0]
Current Hardware Trojan (HT) detection techniques are mostly developed based on a limited set of HT benchmarks.
We introduce the first automated Reinforcement Learning (RL) HT insertion and detection framework to address these shortcomings.
arXiv Detail & Related papers (2023-05-16T16:42:07Z)
- Efficient Decoder-free Object Detection with Transformers [75.00499377197475]
Vision transformers (ViTs) are changing the landscape of object detection approaches.
We propose a decoder-free fully transformer-based (DFFT) object detector.
DFFT_SMALL achieves high efficiency in both training and inference stages.
arXiv Detail & Related papers (2022-06-14T13:22:19Z)
- SADet: Learning An Efficient and Accurate Pedestrian Detector [68.66857832440897]
This paper proposes a series of systematic optimization strategies for the detection pipeline of one-stage detectors.
Together they form a single-shot anchor-based detector (SADet) for efficient and accurate pedestrian detection.
Though structurally simple, it achieves state-of-the-art results and a real-time speed of 20 FPS for VGA-resolution images.
arXiv Detail & Related papers (2020-07-26T12:32:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.