The Double-edged Sword of LLM-based Data Reconstruction: Understanding and Mitigating Contextual Vulnerability in Word-level Differential Privacy Text Sanitization
- URL: http://arxiv.org/abs/2508.18976v1
- Date: Tue, 26 Aug 2025 12:22:45 GMT
- Title: The Double-edged Sword of LLM-based Data Reconstruction: Understanding and Mitigating Contextual Vulnerability in Word-level Differential Privacy Text Sanitization
- Authors: Stephen Meisenbacher, Alexandra Klymenko, Andreea-Elena Bodea, Florian Matthes
- Abstract summary: We show that Large Language Models (LLMs) can exploit the contextual vulnerability of DP-sanitized texts. Experiments uncover a double-edged sword effect of LLM reconstructions on privacy and utility. We propose recommendations for using data reconstruction as a post-processing step.
- Score: 53.51921540246166
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differentially private text sanitization refers to the process of privatizing texts under the framework of Differential Privacy (DP), providing provable privacy guarantees while also empirically defending against adversaries seeking to harm privacy. Despite their simplicity, DP text sanitization methods operating at the word level exhibit a number of shortcomings, among them the tendency to leave contextual clues from the original texts due to randomization during sanitization; we refer to this as *contextual vulnerability*. Given the powerful contextual understanding and inference capabilities of Large Language Models (LLMs), we explore to what extent LLMs can be leveraged to exploit the contextual vulnerability of DP-sanitized texts. We expand on previous work not only in the use of advanced LLMs, but also in testing a broader range of sanitization mechanisms at various privacy levels. Our experiments uncover a double-edged sword effect of LLM-based data reconstruction attacks on privacy and utility: while LLMs can indeed infer original semantics and sometimes degrade empirical privacy protections, they can also be used for good, to improve the quality and privacy of DP-sanitized texts. Based on our findings, we propose recommendations for using LLM data reconstruction as a post-processing step, serving to increase privacy protection by thinking adversarially.
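To make the setting concrete, the following minimal sketch (not the authors' code; the toy embeddings, vocabulary, and function names are illustrative assumptions) shows a word-level metric-DP mechanism of the general kind evaluated in the paper: each word's embedding is perturbed independently and decoded back to the nearest vocabulary word, which is why surviving or weakly perturbed words can leave the contextual clues that an LLM reconstruction pass, adversarial or defensive, can exploit.

```python
import numpy as np

# Toy 2-d embedding table; a real mechanism would use pretrained word vectors.
# (All values here are hypothetical, for illustration only.)
VOCAB = {
    "doctor":   np.array([0.9, 0.1]),
    "nurse":    np.array([0.8, 0.2]),
    "hospital": np.array([0.7, 0.6]),
    "teacher":  np.array([0.1, 0.9]),
    "school":   np.array([0.2, 0.8]),
}

def sanitize_word(word: str, epsilon: float, rng: np.random.Generator) -> str:
    """Perturb the word's embedding with noise scaled by 1/epsilon, then decode
    to the nearest vocabulary word. Smaller epsilon means more noise."""
    vec = VOCAB[word]
    # Planar-Laplace-style noise: uniform random direction, Gamma-distributed radius.
    direction = rng.normal(size=vec.shape)
    direction /= np.linalg.norm(direction)
    radius = rng.gamma(shape=len(vec), scale=1.0 / epsilon)
    noisy = vec + radius * direction
    return min(VOCAB, key=lambda w: np.linalg.norm(VOCAB[w] - noisy))

def sanitize_text(text: str, epsilon: float, seed: int = 0) -> str:
    """Sanitize each in-vocabulary word independently (word-level DP)."""
    rng = np.random.default_rng(seed)
    return " ".join(
        sanitize_word(w, epsilon, rng) if w in VOCAB else w
        for w in text.split()
    )

if __name__ == "__main__":
    original = "the doctor walked to the hospital"
    print(sanitize_text(original, epsilon=1.0))
    # Because each word is perturbed independently, words such as "hospital"
    # may survive sanitization and leak context; this is the contextual
    # vulnerability that an LLM reconstruction pass (attack or defensive
    # post-processing) can exploit, e.g. by prompting:
    #   "Guess the original text behind: <sanitized text>"
```

Because every word is sanitized in isolation, a sentence like "the doctor walked to the hospital" may come back with "doctor" replaced but "hospital" intact, and an LLM prompted with the sanitized sentence can often guess the original semantics; the paper's recommendation is to turn that same capability around and run reconstruction as a privacy-improving post-processing step.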
Related papers
- When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing [61.80513991207956]
This work focuses on the challenge of restoring surrogate-driven protected data in diverse MLLM scenarios. We first bridge this research gap by contributing the SPPE (Surrogate Privacy Protected Editable) dataset. We introduce a unified approach that reliably reconstructs private content while preserving the fidelity of MLLM-generated edits.
arXiv Detail & Related papers (2025-12-08T04:59:03Z) - DP-Fusion: Token-Level Differentially Private Inference for Large Language Models [51.71591819896191]
Large language models (LLMs) do not preserve privacy at inference time. DP-Fusion provably bounds the influence that a set of tokens in the context can have on the LLM's output. We show that our method creates token-level provably privatized documents with substantially improved theoretical and empirical privacy.
arXiv Detail & Related papers (2025-07-06T20:49:39Z) - SoK: Semantic Privacy in Large Language Models [24.99241770349404]
This paper introduces a lifecycle-centric framework to analyze semantic privacy risks across the input processing, pretraining, fine-tuning, and alignment stages of Large Language Models (LLMs). We categorize key attack vectors and assess how current defenses, such as differential privacy, embedding encryption, edge computing, and unlearning, address these threats. We conclude by outlining open challenges, including quantifying semantic leakage, protecting multimodal inputs, balancing de-identification with generation quality, and ensuring transparency in privacy enforcement.
arXiv Detail & Related papers (2025-06-30T08:08:15Z) - Truthful Text Sanitization Guided by Inference Attacks [3.3802914883339557]
We introduce a novel text sanitization method based on generalizations that subsume the semantic content of the original text spans. The approach relies on the use of instruction-tuned large language models (LLMs) and is divided into two stages. Results on the Text Anonymization Benchmark show that the proposed approach, implemented with Mistral 7B Instruct, leads to enhanced utility.
arXiv Detail & Related papers (2024-12-17T14:07:01Z) - Preempting Text Sanitization Utility in Resource-Constrained Privacy-Preserving LLM Interactions [4.372695214012181]
We show that it is difficult to anticipate the performance of an LLM on such sanitized prompts. Poor performance has clear monetary consequences for LLM services charging on a pay-per-use model. We propose an architecture leveraging a Small Language Model to predict the utility of a given sanitized prompt before it is sent to the LLM.
arXiv Detail & Related papers (2024-11-18T12:31:22Z) - Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models [63.91178922306669]
We introduce Silent Guardian, a text protection mechanism against large language models (LLMs).
By carefully modifying the text to be protected, TPE can induce LLMs to first sample the end token, thus directly terminating the interaction.
We show that SG can effectively protect the target text under various configurations and achieve almost 100% protection success rate in some cases.
arXiv Detail & Related papers (2023-12-15T10:30:36Z) - Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models, GPT-4 and ChatGPT, reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z) - PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z) - Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection [6.201275002179716]
We introduce the HaS framework, where "H(ide)" and "S(eek)" represent its two core processes: hiding private entities for anonymization and seeking private entities for de-anonymization (a minimal illustrative sketch of this pattern appears after this list).
To quantitatively assess HaS's privacy protection performance, we propose both black-box and white-box adversarial models.
arXiv Detail & Related papers (2023-09-06T14:54:11Z)
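As a rough illustration of the hide/seek pattern described in the HaS entry above (not the authors' implementation; the placeholder scheme, entity list, and function names are assumptions made for this sketch), private entities are swapped for placeholders before a prompt leaves the client, and the saved mapping is replayed over the LLM's response afterwards:

```python
import re

def hide(text: str, private_entities: list[str]) -> tuple[str, dict[str, str]]:
    """'H(ide)': replace each private entity with a placeholder before the text
    is sent to a remote LLM; return the anonymized text plus the mapping needed
    to undo the substitution later."""
    mapping: dict[str, str] = {}
    anonymized = text
    for i, entity in enumerate(private_entities):
        placeholder = f"[ENTITY_{i}]"
        mapping[placeholder] = entity
        anonymized = re.sub(re.escape(entity), placeholder, anonymized)
    return anonymized, mapping

def seek(llm_output: str, mapping: dict[str, str]) -> str:
    """'S(eek)': restore the original entities in the LLM's response."""
    restored = llm_output
    for placeholder, entity in mapping.items():
        restored = restored.replace(placeholder, entity)
    return restored

if __name__ == "__main__":
    prompt = "Summarize the email Alice Smith sent from Munich."
    anonymized, mapping = hide(prompt, ["Alice Smith", "Munich"])
    print(anonymized)  # Summarize the email [ENTITY_0] sent from [ENTITY_1].
    # ... send `anonymized` to the LLM and receive `reply` ...
    reply = "The email from [ENTITY_0] in [ENTITY_1] requests a meeting."
    print(seek(reply, mapping))
```

A real deployment would detect private entities automatically (for example with a small local model) rather than receiving them as an explicit list; this sketch only fixes the substitute-and-restore round trip.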