TrojanWhisper: Evaluating Pre-trained LLMs to Detect and Localize Hardware Trojans
- URL: http://arxiv.org/abs/2412.07636v1
- Date: Tue, 10 Dec 2024 16:16:22 GMT
- Title: TrojanWhisper: Evaluating Pre-trained LLMs to Detect and Localize Hardware Trojans
- Authors: Md Omar Faruque, Peter Jamieson, Ahmad Patooghy, Abdel-Hameed A. Badawy,
- Abstract summary: Existing Hardware Trojans (HT) detection methods face several critical limitations.<n>The emergence of Large Language Models (LLMs) offers a promising new direction for HT detection.<n>This paper explores the potential of general-purpose LLMs in detecting various HTs inserted in Register Transfer Level (RTL) designs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing Hardware Trojans (HT) detection methods face several critical limitations: logic testing struggles with scalability and coverage for large designs, side-channel analysis requires golden reference chips, and formal verification methods suffer from state-space explosion. The emergence of Large Language Models (LLMs) offers a promising new direction for HT detection by leveraging their natural language understanding and reasoning capabilities. For the first time, this paper explores the potential of general-purpose LLMs in detecting various HTs inserted in Register Transfer Level (RTL) designs, including SRAM, AES, and UART modules. We propose a novel tool for this goal that systematically assesses state-of-the-art LLMs (GPT-4o, Gemini 1.5 pro, and Llama 3.1) in detecting HTs without prior fine-tuning. To address potential training data bias, the tool implements perturbation techniques, i.e., variable name obfuscation, and design restructuring, that make the cases more sophisticated for the used LLMs. Our experimental evaluation demonstrates perfect detection rates by GPT-4o and Gemini 1.5 pro in baseline scenarios (100%/100% precision/recall), with both models achieving better trigger line coverage (TLC: 0.82-0.98) than payload line coverage (PLC: 0.32-0.46). Under code perturbation, while Gemini 1.5 pro maintains perfect detection performance (100%/100%), GPT-4o (100%/85.7%) and Llama 3.1 (66.7%/85.7%) show some degradation in detection rates, and all models experience decreased accuracy in localizing both triggers and payloads. This paper validates the potential of LLM approaches for hardware security applications, highlighting areas for future improvement.
Related papers
- TrojanGYM: A Detector-in-the-Loop LLM for Adaptive RTL Hardware Trojan Insertion [14.250356355764389]
Hardware Trojans (HTs) remain a critical threat because learning detectors often overfit to trigger/payload patterns and small, stylized benchmarks.<n>We introduce TrojanGYM, a framework that automatically curates HT insertions to expose detector blind spots.<n>We also propose Robust-GNN4TJ, a new implementation of the GNN4TJ with improved graph extraction, training, and prediction reliability.
arXiv Detail & Related papers (2026-01-23T21:11:44Z) - Lightweight LLMs for Network Attack Detection in IoT Networks [0.7879756662633696]
Internet of Things (IoT) devices have increased the scale and diversity of cyberattacks, exposing limitations in traditional intrusion detection systems.<n>This study investigates lightweight decoder-only Large Language Models (LLMs) for IoT attack detection by integrating structured-to-text conversion, Quantized Low-Rank Adaptation (QLoRA) fine-tuning, and Retrieval-Augmented Generation (RAG)
arXiv Detail & Related papers (2026-01-21T18:52:26Z) - From Trace to Line: LLM Agent for Real-World OSS Vulnerability Localization [14.474705451897691]
We present T2L-Agent, a project-level, end-to-end framework that plans its own analysis and narrows scope from modules to exact vulnerable lines.<n>We benchmark T2L-ARVO, a diverse, expert-verified 50-case benchmark spanning five crash families and real-world projects.<n>On T2L-ARVO, T2L-Agent achieves up to 58.0% detection and 54.8% line-level localization, substantially outperforming baselines.
arXiv Detail & Related papers (2025-09-30T22:27:18Z) - DetectAnyLLM: Towards Generalizable and Robust Detection of Machine-Generated Text Across Domains and Models [60.713908578319256]
We propose Direct Discrepancy Learning (DDL) to optimize the detector with task-oriented knowledge.<n>Built upon this, we introduce DetectAnyLLM, a unified detection framework that achieves state-of-the-art MGTD performance.<n>MIRAGE samples human-written texts from 10 corpora across 5 text-domains, which are then re-generated or revised using 17 cutting-edge LLMs.
arXiv Detail & Related papers (2025-09-15T10:59:57Z) - MCP-Guard: A Defense Framework for Model Context Protocol Integrity in Large Language Model Applications [21.70488724213541]
integration of Large Language Models with external tools introduces critical security vulnerabilities.<n>We propose MCP-Guard, a robust, layered defense architecture designed for LLM--tool interactions.<n>We also introduce MCP-AttackBench, a benchmark of over 70,000 samples.
arXiv Detail & Related papers (2025-08-14T18:00:25Z) - Defending against Indirect Prompt Injection by Instruction Detection [109.30156975159561]
InstructDetector is a novel detection-based approach that leverages the behavioral states of LLMs to identify potential IPI attacks.<n>InstructDetector achieves a detection accuracy of 99.60% in the in-domain setting and 96.90% in the out-of-domain setting, and reduces the attack success rate to just 0.03% on the BIPIA benchmark.
arXiv Detail & Related papers (2025-05-08T13:04:45Z) - Enhancing Smart Contract Vulnerability Detection in DApps Leveraging Fine-Tuned LLM [0.7018579932647147]
Decentralized applications (DApps) face significant security risks due to vulnerabilities in smart contracts.
This paper proposes a novel approach leveraging fine-tuned Large Language Models (LLMs) to enhance smart contract vulnerability detection.
arXiv Detail & Related papers (2025-04-07T12:32:14Z) - Utilizing Precise and Complete Code Context to Guide LLM in Automatic False Positive Mitigation [2.787944528438214]
Static Application Security Testing (SAST) tools are critical to software quality, identifying potential code issues early in development.<n>They often produce false positive warnings that require manual review, slowing down development.<n>We propose LLM4FPM, a lightweight and efficient false positive mitigation framework.
arXiv Detail & Related papers (2024-11-05T13:24:56Z) - Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities [63.603861880022954]
We introduce ADV-LLM, an iterative self-tuning process that crafts adversarial LLMs with enhanced jailbreak ability.
Our framework significantly reduces the computational cost of generating adversarial suffixes while achieving nearly 100% ASR on various open-source LLMs.
It exhibits strong attack transferability to closed-source models, achieving 99% ASR on GPT-3.5 and 49% ASR on GPT-4, despite being optimized solely on Llama3.
arXiv Detail & Related papers (2024-10-24T06:36:12Z) - Intent Detection in the Age of LLMs [3.755082744150185]
Intent detection is a critical component of task-oriented dialogue systems (TODS)
Traditional approaches relied on computationally efficient supervised sentence transformer encoder models.
The emergence of generative large language models (LLMs) with intrinsic world knowledge presents new opportunities to address these challenges.
arXiv Detail & Related papers (2024-10-02T15:01:55Z) - DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM [81.75988648572347]
We present DetToolChain, a novel prompting paradigm to unleash the zero-shot object detection ability of multimodal large language models (MLLMs)
Our approach consists of a detection prompting toolkit inspired by high-precision detection priors and a new Chain-of-Thought to implement these prompts.
We show that GPT-4V with our DetToolChain improves state-of-the-art object detectors by +21.5% AP50 on MS Novel class set for open-vocabulary detection.
arXiv Detail & Related papers (2024-03-19T06:54:33Z) - Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes [61.916827858666906]
Large Language Models (LLMs) are becoming a prominent generative AI tool, where the user enters a query and the LLM generates an answer.
To reduce harm and misuse, efforts have been made to align these LLMs to human values using advanced training techniques such as Reinforcement Learning from Human Feedback.
Recent studies have highlighted the vulnerability of LLMs to adversarial jailbreak attempts aiming at subverting the embedded safety guardrails.
This paper proposes a method called Gradient Cuff to detect jailbreak attempts.
arXiv Detail & Related papers (2024-03-01T03:29:54Z) - Game of Trojans: Adaptive Adversaries Against Output-based
Trojaned-Model Detectors [11.825974900783844]
We analyze an adaptive adversary that can retrain a Trojaned DNN and is also aware of SOTA output-based Trojaned model detectors.
We show that such an adversary can ensure (1) high accuracy on both trigger-embedded and clean samples and (2) bypass detection.
arXiv Detail & Related papers (2024-02-12T20:14:46Z) - Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities [12.82645410161464]
We evaluate the effectiveness of 16 pre-trained Large Language Models on 5,000 code samples from five diverse security datasets.
Overall, LLMs show modest effectiveness in detecting vulnerabilities, obtaining an average accuracy of 62.8% and F1 score of 0.71 across datasets.
We find that advanced prompting strategies that involve step-by-step analysis significantly improve performance of LLMs on real-world datasets in terms of F1 score (by upto 0.18 on average)
arXiv Detail & Related papers (2023-11-16T13:17:20Z) - Zero-Shot Detection of Machine-Generated Codes [83.0342513054389]
This work proposes a training-free approach for the detection of LLMs-generated codes.
We find that existing training-based or zero-shot text detectors are ineffective in detecting code.
Our method exhibits robustness against revision attacks and generalizes well to Java codes.
arXiv Detail & Related papers (2023-10-08T10:08:21Z) - Large Language Models for Test-Free Fault Localization [11.080712737595174]
We propose a language model based fault localization approach that locates buggy lines of code without any test coverage information.
We fine-tune language models with 350 million, 6 billion, and 16 billion parameters on small, manually curated corpora of buggy programs.
Our empirical evaluation shows that LLMAO improves the Top-1 results over the state-of-the-art machine learning fault localization (MLFL) baselines by 2.3%-54.4%, and Top-5 results by 14.4%-35.6%.
arXiv Detail & Related papers (2023-10-03T01:26:39Z) - DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability
Curvature [143.5381108333212]
We show that text sampled from an large language model tends to occupy negative curvature regions of the model's log probability function.
We then define a new curvature-based criterion for judging if a passage is generated from a given LLM.
We find DetectGPT is more discriminative than existing zero-shot methods for model sample detection.
arXiv Detail & Related papers (2023-01-26T18:44:06Z) - Efficient Decoder-free Object Detection with Transformers [75.00499377197475]
Vision transformers (ViTs) are changing the landscape of object detection approaches.
We propose a decoder-free fully transformer-based (DFFT) object detector.
DFFT_SMALL achieves high efficiency in both training and inference stages.
arXiv Detail & Related papers (2022-06-14T13:22:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.