I Know What You Said: Unveiling Hardware Cache Side-Channels in Local Large Language Model Inference
- URL: http://arxiv.org/abs/2505.06738v3
- Date: Sun, 15 Jun 2025 08:41:09 GMT
- Title: I Know What You Said: Unveiling Hardware Cache Side-Channels in Local Large Language Model Inference
- Authors: Zibo Gao, Junjie Hu, Feng Guo, Yixin Zhang, Yinglong Han, Siyuan Liu, Haiyang Li, Zhiqiang Lv,
- Abstract summary: Large Language Models (LLMs) that can be deployed locally have recently gained popularity for privacy-sensitive tasks. We unveil novel side-channel vulnerabilities in local LLM inference, which can expose both the victim's input and output text. We design a novel eavesdropping attack framework targeting both open-source and proprietary LLM inference systems.
- Score: 19.466754645346175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) that can be deployed locally have recently gained popularity for privacy-sensitive tasks, with companies such as Meta, Google, and Intel playing significant roles in their development. However, the security of local LLMs through the lens of hardware cache side-channels remains unexplored. In this paper, we unveil novel side-channel vulnerabilities in local LLM inference: token value and token position leakage, which can expose both the victim's input and output text, thereby compromising user privacy. Specifically, we found that adversaries can infer the token values from the cache access patterns of the token embedding operation, and deduce the token positions from the timing of autoregressive decoding phases. To demonstrate the potential of these leaks, we design a novel eavesdropping attack framework targeting both open-source and proprietary LLM inference systems. The attack framework does not directly interact with the victim's LLM and can be executed without privilege. We evaluate the attack on a range of practical local LLM deployments (e.g., Llama, Falcon, and Gemma), and the results show that our attack achieves promising accuracy. The restored output and input text have an average edit distance of 5.2% and 17.3% to the ground truth, respectively. Furthermore, the reconstructed texts achieve average cosine similarity scores of 98.7% (input) and 98.0% (output).
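The abstract reports reconstruction quality as an edit distance (in percent) and a cosine similarity between restored and ground-truth text. The following is a minimal sketch of how such metrics could be computed, not the authors' evaluation code. It assumes the edit distance is character-level Levenshtein distance normalized by the ground-truth length, and it uses a bag-of-words cosine as a stand-in for whatever text representation the paper actually uses; the function names are hypothetical.

```python
from collections import Counter
import math


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(restored: str, truth: str) -> float:
    """Edit distance divided by ground-truth length (assumed normalization)."""
    if not truth:
        return 0.0 if not restored else 1.0
    return levenshtein(restored, truth) / len(truth)


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity over word counts (assumed stand-in representation)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0
```

Under this normalization, a value of 0.052 would correspond to the 5.2% figure quoted in the abstract: roughly one character edit for every twenty characters of ground truth.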
Related papers
- Paper Summary Attack: Jailbreaking LLMs through LLM Safety Papers [61.57691030102618]
We propose a novel jailbreaking method, Paper Summary Attack (PSA). It synthesizes content from either attack-focused or defense-focused LLM safety papers to construct an adversarial prompt template. Experiments show significant vulnerabilities not only in base LLMs, but also in state-of-the-art reasoning models like DeepSeek-R1.
arXiv Detail & Related papers (2025-07-17T18:33:50Z)
- Defending against Indirect Prompt Injection by Instruction Detection [81.98614607987793]
We propose a novel approach that takes external data as input and leverages the behavioral state of LLMs during both forward and backward propagation to detect potential IPI attacks. Our approach achieves a detection accuracy of 99.60% in the in-domain setting and 96.90% in the out-of-domain setting, while reducing the attack success rate to just 0.12% on the BIPIA benchmark.
arXiv Detail & Related papers (2025-05-08T13:04:45Z)
- Spill The Beans: Exploiting CPU Cache Side-Channels to Leak Tokens from Large Language Models [4.5987419425784966]
We introduce Spill The Beans, a novel application of cache side-channels to leak tokens generated by Large Language Models (LLMs). A significant challenge is the massive size of LLMs, which, by nature of their compute-intensive operation, quickly evicts embedding vectors from the cache. Monitoring more tokens increases potential vocabulary leakage but raises the chance of missing cache hits due to eviction. Our findings reveal a new vulnerability in LLM deployments, highlighting that even sophisticated models are susceptible to traditional side-channel attacks.
arXiv Detail & Related papers (2025-05-01T19:18:56Z)
- Time Will Tell: Timing Side Channels via Output Token Count in Large Language Models [7.686540586889241]
This paper demonstrates a new side-channel that enables an adversary to extract sensitive information about inference inputs in large language models (LLMs). We construct attacks using this side-channel in two common LLM tasks: recovering the target language in machine translation tasks and recovering the output class in classification tasks. Our experiments show that an adversary can learn the output language in translation tasks with more than 75% precision across three different models.
arXiv Detail & Related papers (2024-12-19T22:29:58Z)
- Pathway to Secure and Trustworthy ZSM for LLMs: Attacks, Defense, and Opportunities [11.511012020557326]
This paper explores the security vulnerabilities associated with fine-tuning large language models (LLMs) in ZSM networks. We show that membership inference attacks are effective for any downstream task, which can lead to a personal data breach when using an LLM as a service.
arXiv Detail & Related papers (2024-08-01T17:15:13Z)
- Human-Interpretable Adversarial Prompt Attack on Large Language Models with Situational Context [49.13497493053742]
This research explores converting a nonsensical suffix attack into a sensible prompt via a situation-driven contextual re-writing.
We combine an independent, meaningful adversarial insertion and situations derived from movies to check if this can trick an LLM.
Our approach demonstrates that a successful situation-driven attack can be executed on both open-source and proprietary LLMs.
arXiv Detail & Related papers (2024-07-19T19:47:26Z)
- Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models [79.76293901420146]
Large Language Models (LLMs) are employed across various high-stakes domains, where the reliability of their outputs is crucial.
Our research investigates the fragility of uncertainty estimation and explores potential attacks.
We demonstrate that an attacker can embed a backdoor in LLMs, which, when activated by a specific trigger in the input, manipulates the model's uncertainty without affecting the final output.
arXiv Detail & Related papers (2024-07-15T23:41:11Z)
- SPOT: Text Source Prediction from Originality Score Thresholding [6.790905400046194]
Countermeasures aimed at detecting misinformation usually involve domain-specific models trained to recognize the relevance of any information.
Instead of evaluating the validity of the information, we propose to investigate LLM generated text from the perspective of trust.
arXiv Detail & Related papers (2024-05-30T21:51:01Z)
- Hidden in Plain Sight: Exploring Chat History Tampering in Interactive Language Models [12.920884182101142]
Large Language Models (LLMs) have become prevalent in real-world applications, exhibiting impressive text generation performance.
To behave interactively, LLM-based chat systems must integrate prior chat history as context into their inputs, following a pre-defined structure.
This paper introduces a systematic methodology to inject user-supplied history into LLM conversations without any prior knowledge of the target model.
arXiv Detail & Related papers (2024-05-30T16:36:47Z)
- Defending Against Indirect Prompt Injection Attacks With Spotlighting [11.127479817618692]
In common applications, multiple inputs can be processed by concatenating them together into a single stream of text.
Indirect prompt injection attacks take advantage of this vulnerability by embedding adversarial instructions into untrusted data being processed alongside user commands.
We introduce spotlighting, a family of prompt engineering techniques that can be used to improve LLMs' ability to distinguish among multiple sources of input.
arXiv Detail & Related papers (2024-03-20T15:26:23Z)
- Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models [79.0183835295533]
We introduce the first benchmark for indirect prompt injection attacks, named BIPIA, to assess the risk of such vulnerabilities. Our analysis identifies two key factors contributing to their success: LLMs' inability to distinguish between informational context and actionable instructions, and their lack of awareness in avoiding the execution of instructions within external content. We propose two novel defense mechanisms, boundary awareness and explicit reminder, to address these vulnerabilities in both black-box and white-box settings.
arXiv Detail & Related papers (2023-12-21T01:08:39Z)
- SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks [99.23352758320945]
We propose SmoothLLM, the first algorithm designed to mitigate jailbreaking attacks on large language models (LLMs).
Based on our finding that adversarially-generated prompts are brittle to character-level changes, our defense first randomly perturbs multiple copies of a given input prompt, and then aggregates the corresponding predictions to detect adversarial inputs.
arXiv Detail & Related papers (2023-10-05T17:01:53Z)
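The perturb-then-aggregate idea in the SmoothLLM summary above can be sketched as follows. This is an illustrative sketch only, not the SmoothLLM implementation: the function names, the random character-replacement scheme, the default parameters, and the strict-majority vote are all assumptions, and `is_adversarial` is a hypothetical callable standing in for whatever per-copy prediction a real system would produce.

```python
import random
import string
from typing import Callable


def perturb(prompt: str, rate: float, rng: random.Random) -> str:
    """Replace a fraction `rate` of characters with random printable ones."""
    chars = list(prompt)
    k = int(len(chars) * rate)
    for i in rng.sample(range(len(chars)), k):
        chars[i] = rng.choice(string.printable)
    return "".join(chars)


def smoothed_flag(prompt: str,
                  is_adversarial: Callable[[str], bool],
                  copies: int = 10,
                  rate: float = 0.1,
                  seed: int = 0) -> bool:
    """Flag the prompt if a strict majority of perturbed copies are flagged."""
    rng = random.Random(seed)
    votes = [is_adversarial(perturb(prompt, rate, rng)) for _ in range(copies)]
    return sum(votes) > copies / 2
```

The intuition, as stated in the summary, is that adversarially generated prompts are brittle to character-level changes, so predictions on perturbed copies tend to differ from the prediction on the unperturbed prompt.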