Related papers: Semantically-Aware LLM Agent to Enhance Privacy in Conversational AI Services

Semantically-Aware LLM Agent to Enhance Privacy in Conversational AI Services

URL: http://arxiv.org/abs/2510.27016v1
Date: Thu, 30 Oct 2025 21:34:23 GMT
Title: Semantically-Aware LLM Agent to Enhance Privacy in Conversational AI Services
Authors: Jayden Serenari, Stephen Lee,
Abstract summary: We present a semantically-aware privacy agent designed to safeguard sensitive PII data when using remote Large Language Models (LLMs)<n>Unlike prior work that often degrade response quality, our approach dynamically replaces sensitive PII entities in user prompts with semantically consistent pseudonyms.<n>Our results show that LOPSIDED reduces semantic utility errors by a factor of 5 compared to baseline techniques.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: With the increasing use of conversational AI systems, there is growing concern over privacy leaks, especially when users share sensitive personal data in interactions with Large Language Models (LLMs). Conversations shared with these models may contain Personally Identifiable Information (PII), which, if exposed, could lead to security breaches or identity theft. To address this challenge, we present the Local Optimizations for Pseudonymization with Semantic Integrity Directed Entity Detection (LOPSIDED) framework, a semantically-aware privacy agent designed to safeguard sensitive PII data when using remote LLMs. Unlike prior work that often degrade response quality, our approach dynamically replaces sensitive PII entities in user prompts with semantically consistent pseudonyms, preserving the contextual integrity of conversations. Once the model generates its response, the pseudonyms are automatically depseudonymized, ensuring the user receives an accurate, privacy-preserving output. We evaluate our approach using real-world conversations sourced from ShareGPT, which we further augment and annotate to assess whether named entities are contextually relevant to the model's response. Our results show that LOPSIDED reduces semantic utility errors by a factor of 5 compared to baseline techniques, all while enhancing privacy.

Related papers

PrivacyPAD: A Reinforcement Learning Framework for Dynamic Privacy-Aware Delegation [33.37227619820212]
We introduce a novel reinforcement learning framework called PrivacyPAD to solve this problem.<n>Our framework trains an agent to dynamically route text chunks, learning a policy that optimally balances the trade-off between privacy leakage and task performance.<n>Our framework achieves a new state-of-the-art on the privacy-utility frontier.
arXiv Detail & Related papers (2025-10-16T19:38:36Z)
Zero-Shot Privacy-Aware Text Rewriting via Iterative Tree Search [60.197239728279534]
Large language models (LLMs) in cloud-based services have raised significant privacy concerns.<n>Existing text anonymization and de-identification techniques, such as rule-based redaction and scrubbing, often struggle to balance privacy preservation with text naturalness and utility.<n>We propose a zero-shot, tree-search-based iterative sentence rewriting algorithm that systematically obfuscates or deletes private information while preserving coherence, relevance, and naturalness.
arXiv Detail & Related papers (2025-09-25T07:23:52Z)
RL-Finetuned LLMs for Privacy-Preserving Synthetic Rewriting [17.294176570269]
We propose a reinforcement learning framework that fine-tunes a large language model (LLM) using a composite reward function.<n>The privacy reward combines semantic cues with structural patterns derived from a minimum spanning tree (MST) over latent representations.<n> Empirical results show that the proposed method significantly enhances author obfuscation and privacy metrics without degrading semantic quality.
arXiv Detail & Related papers (2025-08-25T04:38:19Z)
AgentStealth: Reinforcing Large Language Model for Anonymizing User-generated Text [8.758843436588297]
AgentStealth is a self-reinforcing language model for text anonymization.<n>We show that our method outperforms baselines in both anonymization effectiveness and utility.<n>Our lightweight design supports direct deployment on edge devices, avoiding cloud reliance and communication-based privacy risks.
arXiv Detail & Related papers (2025-06-26T02:48:16Z)
Self-Refining Language Model Anonymizers via Adversarial Distillation [48.280759014096354]
We introduce SElf-refining Anonymization with Language model (SEAL)<n>SEAL is a novel distillation framework for training small language models (SLMs) to perform effective anonymization without relying on external models at inference time.<n>Experiments on SynthPAI, a dataset of synthetic personal profiles and text comments, demonstrate that SLMs trained with SEAL achieve substantial improvements in anonymization capabilities.
arXiv Detail & Related papers (2025-06-02T08:21:27Z)
Automated Privacy Information Annotation in Large Language Model Interactions [40.87806981624453]
Users interacting with large language models (LLMs) under their real identifiers often unknowingly risk disclosing private information.<n>Existing privacy detection methods were designed for different objectives and application domains.<n>We construct a large-scale multilingual dataset with 249K user queries and 154K annotated privacy phrases.
arXiv Detail & Related papers (2025-05-27T09:00:12Z)
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents [33.26308626066122]
We characterize the notion of contextual privacy for user interactions with Conversational Agents (LCAs)<n>It aims to minimize privacy risks by ensuring that users (sender) disclose only information that is both relevant and necessary for achieving their intended goals.<n>We propose a locally deployable framework that operates between users and LCAs, identifying and reformulating out-of-context information in user prompts.
arXiv Detail & Related papers (2025-02-22T09:05:39Z)
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories.<n>We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds.<n>State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit. We study user-level DP motivated by applications where it necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human [56.46355425175232]
We suggest sanitizing sensitive text using two common strategies used by humans.<n>We curate the first corpus, coined NAP2, through both crowdsourcing and the use of large language models.<n>Compared to the prior works on anonymization, the human-inspired approaches result in more natural rewrites.
arXiv Detail & Related papers (2024-06-06T05:07:44Z)
PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind) Our work offers a theoretical analysis for model design and benchmarks various techniques. In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.