DP-Fusion: Token-Level Differentially Private Inference for Large Language Models
- URL: http://arxiv.org/abs/2507.04531v3
- Date: Sun, 09 Nov 2025 12:39:22 GMT
- Title: DP-Fusion: Token-Level Differentially Private Inference for Large Language Models
- Authors: Rushil Thareja, Preslav Nakov, Praneeth Vepakomma, Nils Lukas
- Abstract summary: Large language models (LLMs) do not preserve privacy at inference-time. DP-Fusion provably bounds the influence a set of tokens in the context can have on the LLM's output. We show that our method creates token-level provably privatized documents with substantially improved theoretical and empirical privacy.
- Score: 51.71591819896191
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) do not preserve privacy at inference-time. The LLM's outputs can inadvertently reveal information about the model's context, which presents a privacy challenge when the LLM is augmented via tools or databases containing sensitive information. Existing privacy-preserving methods at inference-time have significant limitations since they (i) lack provable guarantees or (ii) have a poor utility/privacy trade-off. We propose DP-Fusion, a Differentially Private Inference (DPI) mechanism for LLMs that provably bounds the influence a set of tokens in the context can have on the LLM's output. DP-Fusion works as follows: (1) label a subset of sensitive tokens, (2) infer the LLM without any sensitive tokens to obtain a baseline, (3) infer the LLM with the sensitive tokens, and (4) blend distributions so that the final output remains within a bounded distance of the baseline distribution. While this per-token influence bound also mitigates jailbreak-style prompt injection, we focus on document privatization, where the goal is to paraphrase a document containing sensitive tokens, e.g., personally identifiable information, so that no attacker can reliably infer them from the paraphrased document while preserving high text quality. The privacy/utility trade-off is controlled by $\epsilon$, where $\epsilon=0$ hides sensitive tokens entirely, while higher values trade off privacy for improved text quality. We show that our method creates token-level provably privatized documents with substantially improved theoretical and empirical privacy, achieving $6\times$ lower perplexity than related DPI methods.
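Step (4) of the abstract, the blending of the two distributions, can be illustrated with a minimal sketch. The abstract does not say how the blend is formed or which distance is bounded, so everything below is an assumption made for illustration: the blend is a linear interpolation with weight `lam`, the distance is the worst-case absolute log ratio between the blended and baseline next-token probabilities, and the names `blend`, `max_log_ratio`, and `largest_lambda` are hypothetical. It is not the authors' implementation.

```python
import math


def blend(p_pub, p_priv, lam):
    """Linear interpolation (assumed form of the blend): lam=0 gives the baseline."""
    return [(1.0 - lam) * a + lam * b for a, b in zip(p_pub, p_priv)]


def max_log_ratio(p, q):
    """Worst-case |log(p_i / q_i)| over tokens (assumed distance measure)."""
    worst = 0.0
    for a, b in zip(p, q):
        if a == 0.0 and b == 0.0:
            continue  # token impossible under both distributions
        if a == 0.0 or b == 0.0:
            return math.inf  # ratio is unbounded
        worst = max(worst, abs(math.log(a / b)))
    return worst


def largest_lambda(p_pub, p_priv, bound, iters=50):
    """Bisection for the largest lam whose blend stays within `bound` of p_pub.

    The distance grows monotonically with lam for a linear blend,
    so bisection on [0, 1] is valid.
    """
    if max_log_ratio(blend(p_pub, p_priv, 1.0), p_pub) <= bound:
        return 1.0  # the sensitive-context distribution is already close enough
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if max_log_ratio(blend(p_pub, p_priv, mid), p_pub) <= bound:
            lo = mid  # still within the bound, try a larger weight
        else:
            hi = mid
    return lo


# Toy next-token distributions over a 3-token vocabulary (made-up numbers).
p_pub = [0.5, 0.3, 0.2]   # baseline: LLM run without the sensitive tokens
p_priv = [0.1, 0.1, 0.8]  # LLM run with the sensitive tokens

lam = largest_lambda(p_pub, p_priv, bound=0.5)
p_out = blend(p_pub, p_priv, lam)
```

With `bound=0` the search returns `lam = 0`, so the output equals the baseline and the sensitive tokens have no influence, which mirrors the abstract's description of $\epsilon=0$; larger bounds let more of the sensitive-context distribution through.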
Related papers
- Leveraging Semantic Triples for Private Document Generation with Local Differential Privacy Guarantees [18.487751624471777]
We introduce DP-ST, which leverages semantic triples for neighborhood-aware private document generation under local DP guarantees. Our method allows for coherent text generation even at lower $\varepsilon$ values, while still balancing privacy and utility.
arXiv Detail & Related papers (2025-08-28T12:59:01Z) - The Double-edged Sword of LLM-based Data Reconstruction: Understanding and Mitigating Contextual Vulnerability in Word-level Differential Privacy Text Sanitization [53.51921540246166]
We show that Large Language Models (LLMs) can exploit the contextual vulnerability of DP-sanitized texts. Experiments uncover a double-edged sword effect of LLM reconstructions on privacy and utility. We propose recommendations for using data reconstruction as a post-processing step.
arXiv Detail & Related papers (2025-08-26T12:22:45Z) - InvisibleInk: High-Utility and Low-Cost Text Generation with Differential Privacy [7.006059299522521]
InvisibleInk is a scalable long-form text generation framework satisfying rigorous differential privacy guarantees. We reduce the privacy cost by isolating and clipping only the sensitive information in the model logits. We improve text quality by sampling from a small superset of the top-$k$ private tokens.
arXiv Detail & Related papers (2025-06-30T18:00:41Z) - Machine Learning with Privacy for Protected Attributes [56.44253915927481]
We refine the definition of differential privacy (DP) to create a more general and flexible framework that we call feature differential privacy (FDP). Our definition is simulation-based and allows for both addition/removal and replacement variants of privacy, and can handle arbitrary separation of protected and non-protected features. We apply our framework to various machine learning tasks and show that it can significantly improve the utility of DP-trained models when public features are available.
arXiv Detail & Related papers (2025-06-24T17:53:28Z) - Pr$\epsilon\epsilon$mpt: Sanitizing Sensitive Prompts for LLMs [49.84954577111077]
Pr$\epsilon\epsilon$mpt is a novel system that implements a prompt sanitizer. We show that Pr$\epsilon\epsilon$mpt is a practical method to achieve meaningful privacy guarantees.
arXiv Detail & Related papers (2025-04-07T14:52:40Z) - Differentially Private In-context Learning via Sampling Few-shot Mixed with Zero-shot Outputs [13.790550802100842]
In-context learning (ICL) can be improved by augmenting prompts with relevant input-output examples (demonstrations). ICL demonstrations can contain privacy-sensitive information, which can be leaked and/or regurgitated by the LLM output. We propose $\texttt{dps-mozo}$, a decoding framework that generates DP text by sampling from the product of multiple one-shot outputs mixed with a zero-shot output.
arXiv Detail & Related papers (2025-01-31T16:48:38Z) - Granularity is crucial when applying differential privacy to text: An investigation for neural machine translation [13.692397169805806]
Differential privacy (DP) is becoming increasingly popular in NLP.
The choice of granularity at which DP is applied is often neglected.
Our findings indicate that the document-level NMT system is more resistant to membership inference attacks.
arXiv Detail & Related papers (2024-07-26T14:52:37Z) - Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z) - Large Language Models as Carriers of Hidden Messages [0.0]
Simple fine-tuning can embed hidden text into large language models (LLMs), which is revealed only when triggered by a specific query. Our work demonstrates that embedding hidden text via fine-tuning, although seemingly secure due to the vast number of potential triggers, is vulnerable to extraction. We introduce an extraction attack called Unconditional Token Forcing (UTF), which iteratively feeds tokens from the LLM's vocabulary to reveal sequences with high token probabilities, indicating hidden text candidates.
arXiv Detail & Related papers (2024-06-04T16:49:06Z) - Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
arXiv Detail & Related papers (2024-03-07T21:22:07Z) - Differentially Private Synthetic Data via Foundation Model APIs 2: Text [56.13240830670327]
A lot of high-quality text data generated in the real world is private and cannot be shared or used freely due to privacy concerns.
We propose an augmented PE algorithm, named Aug-PE, that applies to the complex setting of text.
Our results demonstrate that Aug-PE produces DP synthetic text that yields competitive utility with the SOTA DP finetuning baselines.
arXiv Detail & Related papers (2024-03-04T05:57:50Z) - Conciliating Privacy and Utility in Data Releases via Individual Differential Privacy and Microaggregation [4.287502453001108]
$\epsilon$-Differential privacy (DP) is a well-known privacy model that offers strong privacy guarantees.
We propose $\epsilon$-individual differential privacy (iDP), which causes less data distortion while providing the same protection as DP to subjects.
We report on experiments that show how our approach can provide strong privacy (small $\epsilon$) while yielding protected data that do not significantly degrade the accuracy of secondary data analysis.
arXiv Detail & Related papers (2023-12-21T10:23:18Z) - Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z) - Production of Categorical Data Verifying Differential Privacy: Conception and Applications to Machine Learning [0.0]
Differential privacy is a formal definition that allows quantifying the privacy-utility trade-off.
With the local DP (LDP) model, users can sanitize their data locally before transmitting it to the server.
In all cases, we concluded that differentially private ML models achieve nearly the same utility metrics as non-private ones.
arXiv Detail & Related papers (2022-04-02T12:50:14Z) - Privacy Amplification via Shuffling for Linear Contextual Bandits [51.94904361874446]
We study the contextual linear bandit problem with differential privacy (DP).
We show that it is possible to achieve a privacy/utility trade-off between JDP and LDP by leveraging the shuffle model of privacy.
Our result shows that it is possible to obtain a tradeoff between JDP and LDP by leveraging the shuffle model while preserving local privacy.
arXiv Detail & Related papers (2021-12-11T15:23:28Z)