Urania: Differentially Private Insights into AI Use
- URL: http://arxiv.org/abs/2506.04681v2
- Date: Tue, 23 Sep 2025 23:45:44 GMT
- Title: Urania: Differentially Private Insights into AI Use
- Authors: Daogao Liu, Edith Cohen, Badih Ghazi, Peter Kairouz, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi, Adam Sealfon, Da Yu, Chiyuan Zhang
- Abstract summary: $Urania$ provides end-to-end privacy protection by leveraging DP tools such as clustering, partition selection, and histogram-based summarization. Results show the framework's ability to extract meaningful conversational insights while maintaining stringent user privacy.
- Score: 102.27238986985698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce $Urania$, a novel framework for generating insights about LLM chatbot interactions with rigorous differential privacy (DP) guarantees. The framework employs a private clustering mechanism and innovative keyword extraction methods, including frequency-based, TF-IDF-based, and LLM-guided approaches. By leveraging DP tools such as clustering, partition selection, and histogram-based summarization, $Urania$ provides end-to-end privacy protection. Our evaluation assesses lexical and semantic content preservation, pair similarity, and LLM-based metrics, benchmarking against a non-private Clio-inspired pipeline (Tamkin et al., 2024). Moreover, we develop a simple empirical privacy evaluation that demonstrates the enhanced robustness of our DP pipeline. The results show the framework's ability to extract meaningful conversational insights while maintaining stringent user privacy, effectively balancing data utility with privacy preservation.
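The abstract names histogram-based summarization with partition selection as one of the DP tools in the pipeline. As a generic illustration of that technique (a standard noise-and-threshold construction, not Urania's actual implementation), a private histogram over an unknown item domain can be sketched as:

```python
import numpy as np

def dp_histogram(items, epsilon, delta, rng=None):
    """Sketch of a DP histogram with noise-and-threshold partition
    selection: add Laplace noise to each item's count and release only
    bins whose noisy count clears a threshold chosen so that a bin
    contributed by a single user survives with probability at most
    delta. Assumes each user contributes at most one item."""
    rng = rng or np.random.default_rng()
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    scale = 1.0 / epsilon  # Laplace scale for sensitivity-1 counts
    # Standard threshold for Laplace-based partition selection.
    threshold = 1.0 + scale * np.log(1.0 / (2.0 * delta))
    released = {}
    for item, count in counts.items():
        noisy = count + rng.laplace(0.0, scale)
        if noisy >= threshold:
            released[item] = noisy
    return released
```

Only bins with enough support survive the threshold, which is what lets the histogram operate over an unbounded, data-dependent domain (e.g. conversation topics) without leaking rare, user-identifying items.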
Related papers
- Who Can See Through You? Adversarial Shielding Against VLM-Based Attribute Inference Attacks [13.326888254423901]
VLM-based attribute inference attacks have emerged as a serious privacy concern, enabling adversaries to infer private attributes from images shared on social media. We propose a novel protection method that jointly optimizes privacy suppression and utility preservation under a visual consistency constraint. Our method effectively reduces PAR below 25%, keeps NPAR above 88%, and generalizes well to unseen and paraphrased privacy questions.
arXiv Detail & Related papers (2025-12-20T08:08:50Z)
- Subgraph Federated Learning via Spectral Methods [52.40322201034717]
FedLap is a novel framework that captures inter-node dependencies while ensuring privacy and scalability. We provide a formal analysis showing that FedLap preserves privacy.
arXiv Detail & Related papers (2025-10-29T16:22:32Z)
- Privacy-Aware In-Context Learning for Large Language Models [12.605629953620495]
Large language models (LLMs) raise privacy concerns due to potential exposure of sensitive information. We introduce a novel private prediction framework for generating high-quality synthetic text with strong privacy guarantees.
arXiv Detail & Related papers (2025-09-17T01:50:32Z)
- DP-FedLoRA: Privacy-Enhanced Federated Fine-Tuning for On-Device Large Language Models [17.265217612125905]
DP-FedLoRA is a privacy-enhanced federated fine-tuning framework. It integrates LoRA-based adaptation with differential privacy in a communication-efficient setting. We show that DP-FedLoRA delivers competitive performance while offering strong privacy guarantees.
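The core mechanics of combining LoRA updates with differential privacy in federated settings is a clip-and-noise step on each client's update. The following is a hedged sketch of that generic step (hypothetical helper, not DP-FedLoRA's actual code):

```python
import numpy as np

def privatize_lora_update(delta, clip_norm, noise_multiplier, rng=None):
    """Generic DP step for a federated LoRA update: bound the client's
    (flattened) low-rank update in L2 norm, then add Gaussian noise
    scaled to the clipping bound before server-side aggregation."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(delta)
    # Scale the update down so its L2 norm never exceeds clip_norm.
    clipped = delta * min(1.0, clip_norm / max(norm, 1e-12))
    # Gaussian noise calibrated to the clipping bound (sensitivity).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=delta.shape)
    return clipped + noise
```

Clipping bounds each client's influence (the sensitivity), which is what lets the added Gaussian noise translate into a formal DP guarantee for the aggregated update.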
arXiv Detail & Related papers (2025-09-11T02:16:34Z)
- The Double-edged Sword of LLM-based Data Reconstruction: Understanding and Mitigating Contextual Vulnerability in Word-level Differential Privacy Text Sanitization [53.51921540246166]
We show that Large Language Models (LLMs) can exploit the contextual vulnerability of DP-sanitized texts. Experiments uncover a double-edged-sword effect of LLM reconstructions on privacy and utility. We propose recommendations for using data reconstruction as a post-processing step.
arXiv Detail & Related papers (2025-08-26T12:22:45Z)
- Communication-Efficient and Privacy-Adaptable Mechanism for Federated Learning [54.20871516148981]
We introduce the Communication-Efficient and Privacy-Adaptable Mechanism (CEPAM), which achieves communication efficiency and privacy protection simultaneously. We theoretically analyze the privacy guarantee of CEPAM and investigate the trade-off between user privacy and accuracy.
arXiv Detail & Related papers (2025-01-21T11:16:05Z)
- Privacy-Preserving Dynamic Assortment Selection [4.399892832075127]
This paper presents a novel framework for privacy-preserving dynamic assortment selection using the multinomial logit (MNL) bandits model.
Our approach integrates noise into user utility estimates to balance between exploration and exploitation while ensuring robust privacy protection.
arXiv Detail & Related papers (2024-10-29T19:28:01Z)
- Convergent Differential Privacy Analysis for General Federated Learning: the $f$-DP Perspective [57.35402286842029]
Federated learning (FL) is an efficient collaborative training paradigm with a focus on local privacy.
Differential privacy (DP) is a classical approach to capturing and ensuring the reliability of privacy protections.
arXiv Detail & Related papers (2024-08-28T08:22:21Z)
- A Framework for Managing Multifaceted Privacy Leakage While Optimizing Utility in Continuous LBS Interactions [0.0]
We present several novel contributions aimed at advancing the understanding and management of privacy leakage in LBS.
Our contributions provide a more comprehensive framework for analyzing privacy concerns across different facets of location-based interactions.
arXiv Detail & Related papers (2024-04-20T15:20:01Z)
- Safeguarding Data in Multimodal AI: A Differentially Private Approach to CLIP Training [15.928338716118697]
We introduce a differentially private adaptation of the Contrastive Language-Image Pretraining (CLIP) model.
Our proposed method, Dp-CLIP, is rigorously evaluated on benchmark datasets.
arXiv Detail & Related papers (2023-06-13T23:32:09Z)
- Theoretically Principled Federated Learning for Balancing Privacy and Utility [61.03993520243198]
We propose a general learning framework for the protection mechanisms that protects privacy via distorting model parameters.
It can achieve personalized utility-privacy trade-off for each model parameter, on each client, at each communication round in federated learning.
arXiv Detail & Related papers (2023-05-24T13:44:02Z)
- A Randomized Approach for Tight Privacy Accounting [63.67296945525791]
We propose a new differential privacy paradigm called estimate-verify-release (EVR). The EVR paradigm first estimates the privacy parameter of a mechanism, then verifies whether it meets this guarantee, and finally releases the query output.
Our empirical evaluation shows the newly proposed EVR paradigm improves the utility-privacy tradeoff for privacy-preserving machine learning.
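The three-step estimate-verify-release flow described above can be sketched as a generic wrapper. The mechanism and verifier below are toy stand-ins chosen for illustration, not the paper's actual randomized verifier:

```python
import random

def evr_release(mechanism, data, estimated_eps, verify):
    """Sketch of estimate-verify-release: certify an estimated privacy
    parameter before releasing the mechanism's output; abort if the
    verifier rejects the estimate."""
    if verify(estimated_eps):   # verify the estimated guarantee
        return mechanism(data)  # release only upon certification
    return None                 # abort: do not release the output

# Toy instantiation: a Laplace counting query calibrated for eps = 1.
def laplace_count(data, eps=1.0):
    # A difference of two exponentials is Laplace-distributed, scale 1/eps.
    return len(data) + random.expovariate(eps) - random.expovariate(eps)

def verify_at_least_one(eps):
    # Hypothetical verifier: accept only estimates no tighter than the
    # eps = 1 the mechanism was calibrated for.
    return eps >= 1.0
```

The point of the paradigm is that the estimate need not be a proven upper bound up front; the verify step catches estimates that are too optimistic before anything is released.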
arXiv Detail & Related papers (2023-04-17T00:38:01Z)
- Breaking the Communication-Privacy-Accuracy Tradeoff with $f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy (DP).
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
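For context, the standard definition of $f$-DP (from the $f$-DP literature, not specific to this paper) characterizes privacy by a trade-off function between hypothesis-testing errors:

```latex
% A mechanism M is f-DP if for all neighboring datasets S, S':
T\big(M(S),\, M(S')\big) \;\ge\; f,
% where the trade-off function of distributions P and Q is
T(P, Q)(\alpha) \;=\; \inf\{\beta_\phi : \alpha_\phi \le \alpha\},
% the infimum taken over rejection rules \phi, with \alpha_\phi and
% \beta_\phi the type I and type II errors of \phi.
```

Intuitively, the larger $f$ is, the harder it is for any test to distinguish whether a given user's data was used, which is why tight $f$-DP bounds for discrete mechanisms translate directly into sharper privacy accounting.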
arXiv Detail & Related papers (2023-02-19T16:58:53Z)
- Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, it is challenging to apply when sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z)
- Federated Deep Learning with Bayesian Privacy [28.99404058773532]
Federated learning (FL) aims to protect data privacy by cooperatively learning a model without sharing private data among users.
Homomorphic encryption (HE) based methods provide secure privacy protections but suffer from extremely high computational and communication overheads.
Deep learning with Differential Privacy (DP) was implemented as a practical learning algorithm at a manageable cost in complexity.
arXiv Detail & Related papers (2021-09-27T12:48:40Z)
- Antipodes of Label Differential Privacy: PATE and ALIBI [2.2761657094500682]
We consider the privacy-preserving machine learning (ML) setting where the trained model must satisfy differential privacy (DP).
We propose two novel approaches based on, respectively, the Laplace mechanism and the PATE framework.
We show how to achieve very strong privacy levels in some regimes, with our adaptation of the PATE framework.
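A common way to instantiate label DP with the Laplace mechanism is to add noise to the one-hot label vector; the sketch below illustrates that generic idea and is an illustrative stand-in, not the paper's exact construction:

```python
import numpy as np

def privatize_label(label, num_classes, epsilon, rng=None):
    """Sketch of label DP via the Laplace mechanism: encode the label
    one-hot and add Laplace noise calibrated to L1 sensitivity 2
    (changing one label moves the one-hot vector by 2 in L1). The
    model is then trained on the noisy soft label or its argmax."""
    rng = rng or np.random.default_rng()
    one_hot = np.zeros(num_classes)
    one_hot[label] = 1.0
    return one_hot + rng.laplace(0.0, 2.0 / epsilon, size=num_classes)
```

Because only labels (not features) are perturbed, this sits in the weaker label-DP threat model, which is what allows much stronger utility than full-example DP at comparable privacy levels.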
arXiv Detail & Related papers (2021-06-07T08:14:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.