KV-Auditor: Auditing Local Differential Privacy for Correlated Key-Value Estimation
- URL: http://arxiv.org/abs/2508.11495v1
- Date: Fri, 15 Aug 2025 14:17:24 GMT
- Title: KV-Auditor: Auditing Local Differential Privacy for Correlated Key-Value Estimation
- Authors: Jingnan Xu, Leixia Wang, Xiaofeng Meng
- Abstract summary: We propose KV-Auditor, a framework for auditing LDP-based key-value estimation mechanisms. We classify state-of-the-art LDP key-value mechanisms into interactive and non-interactive types. For interactive mechanisms, we design a segmentation strategy to capture incremental privacy leakage across iterations.
- Score: 3.1960143210470973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To protect privacy for data-collection-based services, local differential privacy (LDP) is widely adopted due to its rigorous theoretical bound on privacy loss. However, mistakes in complex theoretical analysis or subtle implementation errors may undermine its practical guarantee. To address this, auditing is crucial to confirm that LDP protocols truly protect user data. However, existing auditing methods mainly target machine learning and federated learning tasks based on centralized differential privacy (DP), with limited attention to LDP. Moreover, the few studies on LDP auditing focus solely on the simple frequency estimation task for discrete data, leaving correlated key-value data - which requires both discrete frequency estimation for keys and continuous mean estimation for values - unexplored. To bridge this gap, we propose KV-Auditor, a framework for auditing LDP-based key-value estimation mechanisms by estimating their empirical privacy lower bounds. Unlike traditional LDP auditing methods, which rely on binary output predictions, KV-Auditor estimates this lower bound by analyzing unbounded output distributions, supporting continuous data. Specifically, we classify state-of-the-art LDP key-value mechanisms into interactive and non-interactive types. For non-interactive mechanisms, we propose horizontal KV-Auditor for small domains with sufficient samples and vertical KV-Auditor for large domains with limited samples. For interactive mechanisms, we design a segmentation strategy to capture incremental privacy leakage across iterations. Finally, we perform extensive experiments to validate the effectiveness of our approach, offering insights for optimizing LDP-based key-value estimators.
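The core auditing idea - estimating an empirical privacy lower bound from a mechanism's output distribution - can be illustrated on the simplest discrete case. The sketch below is not KV-Auditor itself: it is a minimal, assumed illustration that audits a k-ary randomized response mechanism by Monte Carlo estimation of the largest observed log-ratio of output probabilities under two neighboring inputs (all function names and parameters are illustrative):

```python
import math
import random
from collections import Counter

def randomized_response(value, domain_size, epsilon):
    """k-ary randomized response: report the true value with probability
    e^eps / (e^eps + k - 1), otherwise a uniformly random other value."""
    p_true = math.exp(epsilon) / (math.exp(epsilon) + domain_size - 1)
    if random.random() < p_true:
        return value
    other = random.randrange(domain_size - 1)
    return other if other < value else other + 1

def empirical_epsilon(mechanism, x0, x1, trials=200_000):
    """Plug-in estimate of the privacy loss: the largest observed log-ratio
    of output probabilities under two neighboring inputs. This is a simple
    point estimate with no confidence correction, so it only suggests a
    lower bound on the true epsilon."""
    c0 = Counter(mechanism(x0) for _ in range(trials))
    c1 = Counter(mechanism(x1) for _ in range(trials))
    best = 0.0
    for o in set(c0) & set(c1):  # outputs seen under both inputs
        p0, p1 = c0[o] / trials, c1[o] / trials
        best = max(best, abs(math.log(p0 / p1)))
    return best

random.seed(0)
eps = 1.0
mech = lambda x: randomized_response(x, domain_size=4, epsilon=eps)
est = empirical_epsilon(mech, 0, 1)
print(f"claimed eps = {eps}, empirical estimate = {est:.3f}")
```

For k-ary randomized response, the maximum likelihood ratio between neighboring inputs is exactly e^eps, so the empirical estimate should land near the claimed epsilon; a large gap in either direction would flag an analysis or implementation error.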
Related papers
- Observational Auditing of Label Privacy [16.143689489883382]
Differential privacy (DP) auditing is essential for evaluating privacy guarantees in machine learning systems. Existing auditing methods require modifying the training dataset -- for instance, by injecting out-of-distribution canaries or removing samples from training. We introduce a novel observational auditing framework that leverages the inherent randomness of data distributions.
arXiv Detail & Related papers (2025-11-18T03:12:59Z)
- Fundamental Limit of Discrete Distribution Estimation under Utility-Optimized Local Differential Privacy [14.980778567896593]
We study the problem of discrete distribution estimation under utility-optimized local differential privacy (ULDP). For the achievability, we propose a class of utility-optimized block design (uBD) schemes, obtained as non-preserving modifications of the block design mechanism known to be optimal under standard LDP constraints. These results provide a tight characterization of the estimation accuracy achievable under ULDP and reveal new insights into the structure of optimal mechanisms for privacy-trivial statistical inference.
arXiv Detail & Related papers (2025-09-29T01:41:36Z)
- The Hidden Cost of Correlation: Rethinking Privacy Leakage in Local Differential Privacy [35.501140141067395]
Local differential privacy (LDP) has emerged as a promising paradigm for privacy-preserving data collection in distributed systems. Recent work has highlighted that correlation-induced privacy leakage (CPL) plays a critical role in shaping the privacy-utility trade-off under LDP.
arXiv Detail & Related papers (2025-08-18T00:34:04Z)
- Balancing Privacy and Utility in Correlated Data: A Study of Bayesian Differential Privacy [4.5885800765465135]
Privacy risks in differentially private (DP) systems increase significantly when data is correlated. Given the ubiquity of dependencies in real-world databases, this oversight poses a critical challenge for privacy protections. Bayesian differential privacy (BDP) extends DP to account for these correlations, yet current BDP mechanisms incur notable utility loss, limiting their adoption.
arXiv Detail & Related papers (2025-06-26T14:25:44Z)
- Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private stochastic convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications. Current methods, such as those based on differentially private stochastic gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility. We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
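The role of the trimmed mean in this line of work is to bound the influence any single user's contribution can have on the aggregate, which in turn bounds the noise needed for privacy. The following is a generic illustration of that aggregation primitive, not the paper's algorithm:

```python
def trimmed_mean(xs, trim_frac=0.1):
    """Drop the lowest and highest trim_frac fraction of samples, then
    average the rest. A single extreme contribution can shift the result
    by only a bounded amount, unlike the plain mean."""
    xs = sorted(xs)
    k = int(len(xs) * trim_frac)
    kept = xs[k:len(xs) - k] if k > 0 else xs
    return sum(kept) / len(kept)

data = [1.0, 1.2, 0.9, 1.1, 100.0]  # one adversarial/outlier user
robust = trimmed_mean(data, trim_frac=0.2)  # outlier trimmed away
naive = sum(data) / len(data)               # dominated by the outlier
print(robust, naive)
```

The bounded sensitivity of such robust aggregators is what lets user-level DP mechanisms add less noise than naive averaging would require.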
arXiv Detail & Related papers (2025-02-13T02:05:45Z)
- Convergent Differential Privacy Analysis for General Federated Learning: the $f$-DP Perspective [57.35402286842029]
Federated learning (FL) is an efficient collaborative training paradigm with a focus on local privacy.
Differential privacy (DP) is a classical approach to capturing and ensuring the reliability of privacy protections.
arXiv Detail & Related papers (2024-08-28T08:22:21Z)
- Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. We propose a method called Stratified Prediction-Powered Inference (StratPPI). We show that the basic PPI estimates can be considerably improved by employing simple data stratification strategies.
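The basic (unstratified) PPI estimator that StratPPI builds on combines cheap model predictions on unlabeled data with a debiasing correction from the small labeled set. A minimal sketch of that base estimator, with illustrative names, follows:

```python
def ppi_mean(labeled_y, labeled_pred, unlabeled_pred):
    """Prediction-powered estimate of a population mean: average the model's
    predictions on the large unlabeled set, then correct by the model's
    average error measured on the small human-labeled set (the rectifier)."""
    rectifier = sum(y - p for y, p in zip(labeled_y, labeled_pred)) / len(labeled_y)
    return sum(unlabeled_pred) / len(unlabeled_pred) + rectifier

# Toy example: the model systematically over-predicts by 0.5.
labeled_y = [1.0, 2.0, 3.0]
labeled_pred = [1.5, 2.5, 3.5]
unlabeled_pred = [2.5, 2.5]
est = ppi_mean(labeled_y, labeled_pred, unlabeled_pred)
print(est)
```

Because the rectifier is estimated from human labels, the estimator remains unbiased even when the model is wrong; stratification, as proposed in the paper, further reduces its variance.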
arXiv Detail & Related papers (2024-06-06T17:37:39Z)
- Federated Experiment Design under Distributed Differential Privacy [31.06808163362162]
We focus on the rigorous protection of users' privacy while minimizing the trust toward service providers.
Although a vital component in modern A/B testing, private distributed experimentation has not previously been studied.
We show how these mechanisms can be scaled up to handle the very large number of participants commonly found in practice.
arXiv Detail & Related papers (2023-11-07T22:38:56Z)
- Revealing the True Cost of Locally Differentially Private Protocols: An Auditing Perspective [4.5282933786221395]
We introduce the LDP-Auditor framework for empirically estimating the privacy loss of locally differentially private mechanisms.
We extensively explore the factors influencing the privacy audit, such as the impact of different encoding and perturbation functions.
We present a notable achievement of our LDP-Auditor framework, which is the discovery of a bug in a state-of-the-art LDP Python package.
arXiv Detail & Related papers (2023-09-04T13:29:19Z)
- Uncertainty-Aware Instance Reweighting for Off-Policy Learning [63.31923483172859]
We propose an Uncertainty-aware Inverse Propensity Score estimator (UIPS) for improved off-policy learning.
Experiment results on synthetic and three real-world recommendation datasets demonstrate the advantageous sample efficiency of the proposed UIPS estimator.
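UIPS builds on the standard inverse propensity score (IPS) estimator, which reweights logged rewards by the ratio of target-policy to logging-policy action probabilities. The sketch below shows only that plain IPS base, not the uncertainty-aware weighting the paper adds; the data layout is assumed for illustration:

```python
def ips_estimate(logged):
    """Plain inverse propensity scoring. Each logged interaction is a tuple
    (reward, p_target, p_logging): the observed reward, the target policy's
    probability of the logged action, and the logging policy's probability
    of that same action. The importance weight p_target / p_logging corrects
    for the mismatch between the two policies."""
    return sum(r * pt / pl for r, pt, pl in logged) / len(logged)

# Toy log: the target policy favors the actions that earned reward 1.0.
log = [(1.0, 0.9, 0.5), (0.0, 0.1, 0.5), (1.0, 0.9, 0.5)]
est = ips_estimate(log)
print(est)
```

Plain IPS is unbiased but suffers high variance when importance weights are large; down-weighting uncertain propensity estimates, as UIPS does, trades a little bias for much lower variance.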
arXiv Detail & Related papers (2023-03-11T11:42:26Z)
- Breaking the Communication-Privacy-Accuracy Tradeoff with $f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy (DP).
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
arXiv Detail & Related papers (2023-02-19T16:58:53Z)
- Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive Privacy Analysis and Beyond [57.10914865054868]
We consider vertical logistic regression (VLR) trained with mini-batch gradient descent.
We provide a comprehensive and rigorous privacy analysis of VLR in a class of open-source Federated Learning frameworks.
arXiv Detail & Related papers (2022-07-19T05:47:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.