An Attack to Break Permutation-Based Private Third-Party Inference Schemes for LLMs
- URL: http://arxiv.org/abs/2505.18332v1
- Date: Fri, 23 May 2025 19:39:18 GMT
- Title: An Attack to Break Permutation-Based Private Third-Party Inference Schemes for LLMs
- Authors: Rahul Thomas, Louai Zahran, Erica Choi, Akilesh Potti, Micah Goldblum, Arka Pal
- Abstract summary: Recent advances in Large Language Models (LLMs) have led to the widespread adoption of third-party inference services. Existing methods of performing private third-party inference, such as Secure Multiparty Computation (SMPC), often rely on cryptographic methods. We introduce a novel reconstruction technique that can recover original prompts from hidden states with nearly perfect accuracy.
- Score: 31.561665382764076
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in Large Language Models (LLMs) have led to the widespread adoption of third-party inference services, raising critical privacy concerns. Existing methods of performing private third-party inference, such as Secure Multiparty Computation (SMPC), often rely on cryptographic methods. However, these methods are thousands of times slower than standard unencrypted inference, and fail to scale to large modern LLMs. Therefore, recent lines of work have explored the replacement of expensive encrypted nonlinear computations in SMPC with statistical obfuscation methods - in particular, revealing permuted hidden states to the third parties, with accompanying strong claims of the difficulty of reversal into the unpermuted states. In this work, we begin by introducing a novel reconstruction technique that can recover original prompts from hidden states with nearly perfect accuracy across multiple state-of-the-art LLMs. We then show that extensions of our attack are nearly perfectly effective in reversing permuted hidden states of LLMs, demonstrating the insecurity of three recently proposed privacy schemes. We further dissect the shortcomings of prior theoretical 'proofs' of permutation security which allow our attack to succeed. Our findings highlight the importance of rigorous security analysis in privacy-preserving LLM inference.
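To make the flavor of such reconstruction attacks concrete, below is a minimal, heavily simplified sketch (not the paper's actual technique): it treats only first-layer hidden states, where each row depends on a single token and position, and uses random matrices as stand-ins for a model's public token-embedding and positional-embedding tables. Under those assumptions, nearest-neighbour matching recovers both the prompt and the row permutation; real LLMs and deeper layers, where each row mixes information from the whole prefix, require the far more sophisticated attack described in the paper.

```python
# Illustrative sketch only (not the paper's method): recover a prompt and the row
# permutation from permuted first-layer hidden states via nearest-neighbour matching
# against a public embedding table. All matrices are random stand-ins for model weights.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, D, SEQ = 1000, 64, 12            # toy vocabulary size, hidden dim, prompt length
E = rng.normal(size=(VOCAB, D))          # stand-in for the token-embedding table
P = 0.1 * rng.normal(size=(SEQ, D))      # stand-in for positional embeddings

# What a third party observes: one hidden-state row per position, with rows permuted.
prompt = rng.integers(0, VOCAB, size=SEQ)
hidden = E[prompt] + P                   # first-layer states, one row per position
perm = rng.permutation(SEQ)
observed = hidden[perm]

# Attack: for each observed row, score every (position, token) pair and keep the best match.
recovered_tokens = np.empty(SEQ, dtype=int)
recovered_perm = np.empty(SEQ, dtype=int)
for i, row in enumerate(observed):
    dists = np.linalg.norm(row - (E[None, :, :] + P[:, None, :]), axis=-1)  # (SEQ, VOCAB)
    p, t = np.unravel_index(dists.argmin(), dists.shape)
    recovered_perm[i] = p                # original position this observed row came from
    recovered_tokens[p] = t              # token that occupied that position

assert np.array_equal(recovered_tokens, prompt)
assert np.array_equal(recovered_perm, perm)
print("prompt and permutation recovered exactly in this toy setting")
```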
Related papers
- Depth Gives a False Sense of Privacy: LLM Internal States Inversion [17.639108495452785]
Large Language Models (LLMs) are increasingly integrated into daily routines, yet they raise significant privacy and safety concerns.
Recent research proposes collaborative inference, which outsources the early-layer inference to ensure data locality.
We propose four inversion attacks that significantly improve the semantic similarity and token matching rate of inverted inputs.
arXiv Detail & Related papers (2025-07-22T09:15:11Z)
- Cascade: Token-Sharded Private LLM Inference [31.561665382764076]
We propose a new multi-party inference protocol, Cascade, that avoids punitive costs by leveraging sharding in the sequence dimension to maintain privacy.
We demonstrate that Cascade is resistant to a generalization of a recent attack that is highly effective against other statistical privacy schemes.
arXiv Detail & Related papers (2025-07-07T17:37:16Z)
- SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks [17.77094760401298]
We study the vulnerability of fine-tuned large language models to membership inference attacks (MIAs).
We propose SOFT, a novel defense technique that mitigates privacy leakage by leveraging influential data selection with an adjustable parameter to balance utility preservation and privacy protection.
arXiv Detail & Related papers (2025-06-12T07:23:56Z)
- Saffron-1: Safety Inference Scaling [69.61130284742353]
SAFFRON is a novel inference scaling paradigm tailored explicitly for safety assurance.
Central to our approach is the introduction of a multifurcation reward model (MRM) that significantly reduces the required number of reward model evaluations.
We publicly release our trained multifurcation reward model (Saffron-1) and the accompanying token-level safety reward dataset (Safety4M).
arXiv Detail & Related papers (2025-06-06T18:05:45Z)
- Reshaping Representation Space to Balance the Safety and Over-rejection in Large Audio Language Models [50.89022445197919]
Large Audio Language Models (LALMs) have extended the capabilities of Large Language Models (LLMs).
Recent research has revealed that LALMs remain vulnerable to harmful queries due to insufficient safety-alignment.
arXiv Detail & Related papers (2025-05-26T08:25:25Z)
- Private Language Models via Truncated Laplacian Mechanism [18.77713904999236]
We propose a novel private embedding method called the high dimensional truncated Laplacian mechanism.
We show that our method has a lower variance compared to the previous private word embedding methods.
Remarkably, even in the high privacy regime, our approach only incurs a slight decrease in utility compared to the non-private scenario.
arXiv Detail & Related papers (2024-10-10T15:25:02Z)
- The Early Bird Catches the Leak: Unveiling Timing Side Channels in LLM Serving Systems [26.528288876732617]
A set of new timing side channels can be exploited to infer confidential system prompts and those issued by other users.
These vulnerabilities echo security challenges observed in traditional computing systems.
We propose a token-by-token search algorithm to efficiently recover shared prompt prefixes in the caches.
arXiv Detail & Related papers (2024-09-30T06:55:00Z)
- Convergent Differential Privacy Analysis for General Federated Learning: the $f$-DP Perspective [57.35402286842029]
Federated learning (FL) is an efficient collaborative training paradigm with a focus on local privacy.
Differential privacy (DP) is a classical approach to capture and ensure the reliability of private protections.
arXiv Detail & Related papers (2024-08-28T08:22:21Z)
- Jailbreaking Large Language Models Through Alignment Vulnerabilities in Out-of-Distribution Settings [57.136748215262884]
We introduce ObscurePrompt for jailbreaking LLMs, inspired by the observed fragile alignments in Out-of-Distribution (OOD) data.
We first formulate the decision boundary in the jailbreaking process and then explore how obscure text affects LLM's ethical decision boundary.
Our approach substantially improves upon previous methods in terms of attack effectiveness, maintaining efficacy against two prevalent defense mechanisms.
arXiv Detail & Related papers (2024-06-19T16:09:58Z)
- ASETF: A Novel Method for Jailbreak Attack on LLMs through Translate Suffix Embeddings [58.82536530615557]
We propose an Adversarial Suffix Embedding Translation Framework (ASETF) to transform continuous adversarial suffix embeddings into coherent and understandable text.
Our method significantly reduces the computation time of adversarial suffixes and achieves a much better attack success rate than existing techniques.
arXiv Detail & Related papers (2024-02-25T06:46:27Z)
- SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks [99.23352758320945]
We propose SmoothLLM, the first algorithm designed to mitigate jailbreaking attacks on large language models (LLMs).
Based on our finding that adversarially-generated prompts are brittle to character-level changes, our defense first randomly perturbs multiple copies of a given input prompt, and then aggregates the corresponding predictions to detect adversarial inputs (a minimal sketch of this perturb-and-aggregate idea appears after this list).
arXiv Detail & Related papers (2023-10-05T17:01:53Z)
- Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive Privacy Analysis and Beyond [57.10914865054868]
We consider vertical logistic regression (VLR) trained with mini-batch gradient descent.
We provide a comprehensive and rigorous privacy analysis of VLR in a class of open-source Federated Learning frameworks.
arXiv Detail & Related papers (2022-07-19T05:47:30Z)
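As noted in the SmoothLLM entry above, here is a minimal sketch of that perturb-and-aggregate defense. It is written against a generic query_model callable; the copy count, perturbation rate, and refusal-detection heuristic are illustrative assumptions, not the authors' reference implementation.

```python
# Toy sketch of a SmoothLLM-style defense (not the authors' reference code):
# perturb several copies of the prompt at the character level, query the model
# on each copy, and aggregate the per-copy judgements by majority vote.
import random
import string

def perturb(prompt: str, rate: float = 0.1) -> str:
    """Replace a random fraction of characters with random printable characters."""
    chars = list(prompt)
    k = max(1, int(rate * len(chars)))
    for i in random.sample(range(len(chars)), k):
        chars[i] = random.choice(string.printable)
    return "".join(chars)

def is_refusal(response: str) -> bool:
    """Placeholder judge: treat common refusal phrases as 'attack detected'."""
    return any(s in response.lower() for s in ("i cannot", "i can't", "sorry"))

def smoothllm_respond(prompt: str, query_model, n_copies: int = 8, rate: float = 0.1) -> str:
    """query_model is any callable str -> str standing in for the protected LLM."""
    responses = [query_model(perturb(prompt, rate)) for _ in range(n_copies)]
    if sum(is_refusal(r) for r in responses) > n_copies // 2:
        return "Request declined."       # majority of perturbed copies were refused
    return next(r for r in responses if not is_refusal(r))

if __name__ == "__main__":
    # Stand-in model that never refuses; a real deployment would call the protected LLM.
    print(smoothllm_respond("Tell me a joke about networks.", lambda p: "Sure: ..."))
```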