The Hidden Cost of Correlation: Rethinking Privacy Leakage in Local Differential Privacy
- URL: http://arxiv.org/abs/2508.12539v1
- Date: Mon, 18 Aug 2025 00:34:04 GMT
- Title: The Hidden Cost of Correlation: Rethinking Privacy Leakage in Local Differential Privacy
- Authors: Sandaru Jayawardana, Sennur Ulukus, Ming Ding, Kanchana Thilakarathna
- Abstract summary: Local differential privacy (LDP) has emerged as a promising paradigm for privacy-preserving data collection in distributed systems. Recent work has highlighted that correlation-induced privacy leakage (CPL) plays a critical role in shaping the privacy-utility trade-off under LDP.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Local differential privacy (LDP) has emerged as a promising paradigm for privacy-preserving data collection in distributed systems, where users contribute multi-dimensional records with potentially correlated attributes. Recent work has highlighted that correlation-induced privacy leakage (CPL) plays a critical role in shaping the privacy-utility trade-off under LDP, especially when correlations exist among attributes. Nevertheless, it remains unclear to what extent the prevailing assumptions and proposed solutions are valid and how significant CPL is in real-world data. To address this gap, we first perform a comprehensive statistical analysis of five widely used LDP mechanisms -- GRR, RAPPOR, OUE, OLH and the Exponential mechanism -- to assess CPL across four real-world datasets. We identify that many primary assumptions and metrics in current approaches fall short of accurately characterising these leakages. Moreover, current studies have been limited to a set of pure LDP (i.e., δ = 0) mechanisms. In response, we develop the first algorithmic framework to theoretically quantify CPL for any general approximate LDP ((ε, δ)-LDP) mechanism. We validate our theoretical results against empirical statistical results and provide a theoretical explanation for the observed statistical patterns. Finally, we propose two novel benchmarks to validate correlation analysis algorithms and to evaluate the utility-vs-CPL trade-off of LDP mechanisms. Further, we demonstrate how these findings can be applied to achieve an efficient privacy-utility trade-off in real-world data governance.
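Of the five mechanisms analysed in the abstract, GRR (k-ary generalized randomized response) is the simplest to state. As an illustrative sketch only (function names and the toy parameters below are my own, not the paper's framework), a user reports their true value with probability p = e^ε / (e^ε + k − 1) and otherwise reports a uniformly random other value, and the collector debiases the resulting counts:

```python
import math
import random

def grr_perturb(value, domain, epsilon):
    """k-ary generalized randomized response (GRR): report the true
    value with probability p = e^eps / (e^eps + k - 1), otherwise
    report a uniformly random *other* value from the domain."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p:
        return value
    return random.choice([v for v in domain if v != value])

def grr_estimate_freq(reports, domain, epsilon):
    """Unbiased frequency estimates from perturbed reports."""
    k = len(domain)
    n = len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = 1.0 / (math.exp(epsilon) + k - 1)  # prob. of reporting any fixed wrong value
    return {v: (sum(1 for r in reports if r == v) / n - q) / (p - q)
            for v in domain}
```

Note that p shrinks as the domain size k grows, which is precisely why unary-encoding mechanisms such as OUE and hash-based ones such as OLH exist for large domains.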
Related papers
- Efficient Thought Space Exploration through Strategic Intervention [54.35208611253168]
We propose a novel Hint-Practice Reasoning (HPR) framework that operationalizes this insight through two synergistic components. The framework's core innovation lies in Distributional Inconsistency Reduction (DIR), which dynamically identifies intervention points. Experiments across arithmetic and commonsense reasoning benchmarks demonstrate HPR's state-of-the-art efficiency-accuracy tradeoffs.
arXiv Detail & Related papers (2025-11-13T07:26:01Z)
- Fundamental Limit of Discrete Distribution Estimation under Utility-Optimized Local Differential Privacy [14.980778567896593]
We study the problem of discrete distribution estimation under utility-optimized local differential privacy (ULDP). On the achievability side, we propose a class of utility-optimized block design (uBD) schemes, obtained as non-preserving modifications of the block design mechanism known to be optimal under standard LDP constraints. These results provide a tight characterization of the estimation accuracy achievable under ULDP and reveal new insights into the structure of optimal mechanisms for privacy-trivial statistical inference.
arXiv Detail & Related papers (2025-09-29T01:41:36Z)
- KV-Auditor: Auditing Local Differential Privacy for Correlated Key-Value Estimation [3.1960143210470973]
We propose KV-Auditor, a framework for auditing LDP-based key-value estimation mechanisms. We classify state-of-the-art LDP key-value mechanisms into interactive and non-interactive types. For interactive mechanisms, we design a segmentation strategy to capture incremental privacy leakage across iterations.
arXiv Detail & Related papers (2025-08-15T14:17:24Z)
- Quantifying Classifier Utility under Local Differential Privacy [5.90975025491779]
Local differential privacy (LDP) provides a quantifiable privacy guarantee for personal data by introducing perturbation at the data source. This paper presents a framework for theoretically quantifying classifier utility under LDP mechanisms.
arXiv Detail & Related papers (2025-07-03T15:42:10Z)
- Balancing Privacy and Utility in Correlated Data: A Study of Bayesian Differential Privacy [4.5885800765465135]
Privacy risks in differentially private (DP) systems increase significantly when data is correlated. Given the ubiquity of dependencies in real-world databases, this oversight poses a critical challenge for privacy protections. BDP extends DP to account for these correlations, yet current BDP mechanisms incur notable utility loss, limiting its adoption.
arXiv Detail & Related papers (2025-06-26T14:25:44Z)
- Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. We propose a method called Stratified Prediction-Powered Inference (StratPPI). We show that the basic PPI estimates can be considerably improved by employing simple data stratification strategies.
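For context, the basic (non-stratified) prediction-powered estimator that StratPPI builds on combines model predictions on a large unlabeled set with a bias-correcting "rectifier" estimated on the small labeled set. A minimal sketch for mean estimation (the function name and toy numbers are illustrative; this is the baseline estimator, not StratPPI itself):

```python
def ppi_mean(preds_unlabeled, preds_labeled, labels):
    """Basic prediction-powered mean estimate: the mean of model
    predictions on the large unlabeled set, plus a rectifier
    (mean prediction error) estimated from the small labeled set."""
    pred_mean = sum(preds_unlabeled) / len(preds_unlabeled)
    rectifier = sum(y - f for y, f in zip(labels, preds_labeled)) / len(labels)
    return pred_mean + rectifier
```

If the model is systematically biased (say it underestimates every label by 0.2), the rectifier cancels that bias exactly in expectation, which is what makes the estimate unbiased regardless of model quality.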
arXiv Detail & Related papers (2024-06-06T17:37:39Z)
- RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation [73.2390735383842]
We introduce the first sample-efficient algorithm for LMDPs without any additional structural assumptions.
We show how these can be used to derive near-optimal guarantees of an optimistic exploration algorithm.
These results can be valuable for a wide range of interactive learning problems beyond LMDPs, and especially, for partially observed environments.
arXiv Detail & Related papers (2024-06-03T14:51:27Z)
- A Simple and Practical Method for Reducing the Disparate Impact of Differential Privacy [21.098175634158043]
Differentially private (DP) mechanisms have been deployed in a variety of high-impact social settings.
The impact of DP on utility can vary significantly among different sub-populations.
A simple way to reduce this disparity is with stratification.
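One common reading of stratification in DP (an illustrative sketch under my own assumptions, not necessarily this paper's method) is to release a separately noised statistic per subgroup: because each record belongs to exactly one stratum, parallel composition applies and the full release still satisfies ε-DP, while small subgroups are no longer swamped by larger ones:

```python
import math
import random

def dp_count(values, epsilon):
    """Laplace-mechanism count (sensitivity 1), with the Laplace
    sample drawn by inverse-CDF transform of a uniform variate."""
    u = random.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return len(values) + noise

def stratified_dp_counts(records, group_of, epsilon):
    """Release one eps-DP count per stratum. Each record falls in
    exactly one stratum, so by parallel composition the combined
    release is still eps-DP overall."""
    groups = {}
    for r in records:
        groups.setdefault(group_of(r), []).append(r)
    return {g: dp_count(vals, epsilon) for g, vals in groups.items()}
```

The design point is that the per-stratum noise scale is the same for every subgroup, so the *relative* error no longer depends on how a subgroup compares to the majority in a single pooled query.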
arXiv Detail & Related papers (2023-12-18T21:19:35Z)
- FedLAP-DP: Federated Learning by Sharing Differentially Private Loss Approximations [53.268801169075836]
We propose FedLAP-DP, a novel privacy-preserving approach for federated learning.
A formal privacy analysis demonstrates that FedLAP-DP incurs the same privacy costs as typical gradient-sharing schemes.
Our approach presents a faster convergence speed compared to typical gradient-sharing methods.
arXiv Detail & Related papers (2023-02-02T12:56:46Z)
- False Correlation Reduction for Offline Reinforcement Learning [115.11954432080749]
We propose falSe COrrelation REduction (SCORE) for offline RL, a practically effective and theoretically provable algorithm.
We empirically show that SCORE achieves SoTA performance with 3.1x acceleration on various tasks in a standard benchmark (D4RL).
arXiv Detail & Related papers (2021-10-24T15:34:03Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.