Related papers: Whisper Leak: a side-channel attack on Large Language Models

URL: http://arxiv.org/abs/2511.03675v1
Date: Wed, 05 Nov 2025 17:47:46 GMT
Title: Whisper Leak: a side-channel attack on Large Language Models
Authors: Geoff McDonald, Jonathan Bar Or,
Abstract summary: This paper introduces Whisper Leak, a side-channel attack that infers user prompt topics from encrypted LLM traffic. Despite TLS encryption protecting content, packet size and timing metadata leak sufficient information to enable topic classification. For many models, we achieve 100% precision in identifying sensitive topics like "money laundering" while recovering 5-20% of target conversations.
Score: 0.2291770711277359
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Large Language Models (LLMs) are increasingly deployed in sensitive domains including healthcare, legal services, and confidential communications, where privacy is paramount. This paper introduces Whisper Leak, a side-channel attack that infers user prompt topics from encrypted LLM traffic by analyzing packet size and timing patterns in streaming responses. Despite TLS encryption protecting content, these metadata patterns leak sufficient information to enable topic classification. We demonstrate the attack across 28 popular LLMs from major providers, achieving near-perfect classification (often >98% AUPRC) and high precision even at extreme class imbalance (10,000:1 noise-to-target ratio). For many models, we achieve 100% precision in identifying sensitive topics like "money laundering" while recovering 5-20% of target conversations. This industry-wide vulnerability poses significant risks for users under network surveillance by ISPs, governments, or local adversaries. We evaluate three mitigation strategies - random padding, token batching, and packet injection - finding that while each reduces attack effectiveness, none provides complete protection. Through responsible disclosure, we have collaborated with providers to implement initial countermeasures. Our findings underscore the need for LLM providers to address metadata leakage as AI systems handle increasingly sensitive information.
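To make the leak and two of the mitigations named in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes, for illustration only, that each streamed token is sent in its own packet and that ciphertext size tracks plaintext size up to a constant overhead. The function names (`packet_sizes`, `pad_randomly`, `batch_tokens`) and the sample tokens are hypothetical. Packet injection and timing are left out to keep the sketch short.

```python
import random
from typing import List


def packet_sizes(tokens: List[str]) -> List[int]:
    """Byte length of each token, i.e. what an observer could see per packet
    under the one-token-per-packet assumption stated above."""
    return [len(t.encode("utf-8")) for t in tokens]


def pad_randomly(sizes: List[int], max_pad: int, rng: random.Random) -> List[int]:
    """Random padding: add between 0 and max_pad extra bytes to every packet."""
    return [s + rng.randint(0, max_pad) for s in sizes]


def batch_tokens(sizes: List[int], batch: int) -> List[int]:
    """Token batching: merge every `batch` consecutive tokens into one packet."""
    return [sum(sizes[i:i + batch]) for i in range(0, len(sizes), batch)]


# Hypothetical streamed response, split into tokens.
tokens = ["Money", " laundering", " is", " the", " process", " of", " hiding", " funds"]

raw = packet_sizes(tokens)                                    # per-token size sequence
padded = pad_randomly(raw, max_pad=16, rng=random.Random(0))  # noisier sizes
batched = batch_tokens(raw, batch=4)                          # fewer, coarser packets

print("raw:    ", raw)
print("padded: ", padded)
print("batched:", batched)
```

The raw sequence mirrors token lengths one to one, which is the kind of signal a traffic classifier could learn from. Padding blurs each size and batching coarsens the sequence, but the total response length and overall shape remain partly visible, which is consistent with the abstract's finding that no single mitigation gives complete protection.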

Related papers

NeuroFilter: Privacy Guardrails for Conversational LLM Agents [50.75206727081996]
This work addresses the computational challenge of enforcing privacy for agentic Large Language Models (LLMs). NeuroFilter is a guardrail framework that operationalizes contextual integrity by mapping norm violations to simple directions in the model's activation space. A comprehensive evaluation of more than 150,000 interactions, covering models from 7B to 70B parameters, illustrates the strong performance of NeuroFilter.
arXiv Detail & Related papers (2026-01-21T05:16:50Z)
Friend or Foe: How LLMs' Safety Mind Gets Fooled by Intent Shift Attack [53.34204977366491]
Large language models (LLMs) remain vulnerable to jailbreaking attacks despite their impressive capabilities. In this paper, we introduce ISA (Intent Shift Attack), which obscures the intent of an attack from the LLM. Our approach needs only minimal edits to the original request and yields natural, human-readable, and seemingly harmless prompts.
arXiv Detail & Related papers (2025-11-01T13:44:42Z)
NetEcho: From Real-World Streaming Side-Channels to Full LLM Conversation Recovery [21.94698636997114]
NetEcho is designed to recover entire conversations directly from encrypted network traffic. It can recover ~70% of the information in each conversation, demonstrating a critical limitation in current defense mechanisms.
arXiv Detail & Related papers (2025-10-29T12:47:36Z)
SASER: Stego attacks on open-source LLMs [14.7664610166861]
SASER is the first stego attack on open-source large language models (LLMs). It operates by identifying targeted parameters, embedding payloads, injecting triggers, and executing payloads in sequence. Experiments on LlaMA2-7B and ChatGLM3-6B, without quantization, show that SASER outperforms existing stego attacks by up to 98.1%.
arXiv Detail & Related papers (2025-10-12T07:33:56Z)
The Sum Leaks More Than Its Parts: Compositional Privacy Risks and Mitigations in Multi-Agent Collaboration [72.33801123508145]
Large language models (LLMs) are integral to multi-agent systems. Privacy risks emerge that extend beyond memorization, direct inference, or single-turn evaluations. In particular, seemingly innocuous responses, when composed across interactions, can cumulatively enable adversaries to recover sensitive information.
arXiv Detail & Related papers (2025-09-16T16:57:25Z)
Multi-Stage Prompt Inference Attacks on Enterprise LLM Systems [18.039444159491733]
Large Language Models (LLMs) deployed in enterprise settings face novel security challenges. One critical threat is prompt inference attacks: adversaries chain together seemingly benign prompts to gradually extract confidential data. We present a comprehensive study of multi-stage prompt inference attacks in an enterprise LLM context.
arXiv Detail & Related papers (2025-07-21T13:38:12Z)
LiteLMGuard: Seamless and Lightweight On-Device Prompt Filtering for Safeguarding Small Language Models against Quantization-induced Risks and Vulnerabilities [1.460362586787935]
LiteLMGuard (LLMG) provides real-time, prompt-level defense for quantized SLMs. LLMG formalizes prompt filtering as a deep learning (DL)-based prompt answerability classification task. LLMG defends against over 87% of harmful prompts, including both direct instruction and jailbreak attack strategies.
arXiv Detail & Related papers (2025-05-08T19:58:41Z)
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks [88.84977282952602]
A high volume of recent ML security literature focuses on attacks against aligned large language models (LLMs). In this paper, we analyze security and privacy vulnerabilities that are unique to LLM agents. We conduct a series of illustrative attacks on popular open-source and commercial agents, demonstrating the immediate practical implications of their vulnerabilities.
arXiv Detail & Related papers (2025-02-12T17:19:36Z)
Prompt Leakage effect and defense strategies for multi-turn LLM interactions [95.33778028192593]
Leakage of system prompts may compromise intellectual property and act as adversarial reconnaissance for an attacker. We design a unique threat model which leverages the LLM sycophancy effect and elevates the average attack success rate (ASR) from 17.7% to 86.2% in a multi-turn setting. We measure the mitigation effect of 7 black-box defense strategies, along with finetuning an open-source model to defend against leakage attempts.
arXiv Detail & Related papers (2024-04-24T23:39:58Z)
A Survey on Detection of LLMs-Generated Content [97.87912800179531]
The ability to detect LLMs-generated content has become of paramount importance. We aim to provide a detailed overview of existing detection strategies and benchmarks. We also posit the necessity for a multi-faceted approach to defend against various attacks.
arXiv Detail & Related papers (2023-10-24T09:10:26Z)
Hiding in Plain Sight: Disguising Data Stealing Attacks in Federated Learning [1.9374282535132377]
We study client-side detectability of malicious server (MS) attacks for the first time. We propose SEER, a novel attack framework that satisfies these requirements. We show that SEER can steal user data from gradients of realistic networks, even for large batch sizes of up to 512.
arXiv Detail & Related papers (2023-06-05T16:29:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.