Frequency Estimation of Correlated Multi-attribute Data under Local Differential Privacy
- URL: http://arxiv.org/abs/2507.17516v1
- Date: Wed, 23 Jul 2025 13:52:45 GMT
- Title: Frequency Estimation of Correlated Multi-attribute Data under Local Differential Privacy
- Authors: Shafizur Rahman Seeam, Ye Zheng, Yidan Hu,
- Abstract summary: Local Differential Privacy (LDP) is a powerful tool to protect user data privacy.<n>Existing LDP mechanisms either split the privacy budget across all attributes or treat each attribute independently.<n>We propose Correlated Randomized Response (Corr-RR), a novel LDP mechanism that leverages correlations among attributes to substantially improve utility.
- Score: 5.258253745672122
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale data collection, from national censuses to IoT-enabled smart homes, routinely gathers dozens of attributes per individual. These multi-attribute datasets are vital for analytics but pose significant privacy risks. Local Differential Privacy (LDP) is a powerful tool to protect user data privacy by allowing users to locally perturb their records before releasing to an untrusted data aggregator. However, existing LDP mechanisms either split the privacy budget across all attributes or treat each attribute independently, ignoring natural inter-attribute correlations. This leads to excessive noise or fragmented budgets, resulting in significant utility loss, particularly in high-dimensional settings. To overcome these limitations, we propose Correlated Randomized Response (Corr-RR), a novel LDP mechanism that leverages correlations among attributes to substantially improve utility while maintaining rigorous LDP guarantees. Corr-RR allocates the full privacy budget to perturb a single, randomly selected attribute and reconstructs the remaining attributes using estimated interattribute dependencies, without incurring additional privacy cost. To enable this, Corr-RR operates in two phases: (1) a subset of users apply standard LDP mechanisms to estimate correlations, and (2) each remaining user perturbs one attribute and infers the others using the learned correlations. We theoretically prove that Corr-RR satisfies $\epsilon$-LDP, and extensive experiments on synthetic and real-world datasets demonstrate that Corr-RR consistently outperforms state-of-the-art LDP mechanisms, particularly in scenarios with many attributes and strong inter-attribute correlations.
Related papers
- Locally Differentially Private Frequency Estimation via Joint Randomized Response [14.398663407098836]
Local Differential Privacy (LDP) has been widely recognized as a powerful tool for providing a strong theoretical guarantee of data privacy to data contributors against an untrusted data collector.<n>Common to existing LDP mechanisms is an inherent trade-off between the level of privacy protection and data utility.<n>We propose a novel Joint Randomized Response (JRR) mechanism based on correlated data perturbations to achieve locally differentially private frequency estimation.
arXiv Detail & Related papers (2025-05-15T14:41:03Z) - Bipartite Randomized Response Mechanism for Local Differential Privacy [12.356030528988002]
We introduce an adaptive Local Privacy (LDP) mechanism called Bipartite Randomized Response (BRR)<n>We prove that for any utility function and any privacy level, solving the problem is equivalent to confirming how many high-utility data to be treated equally as the true data on release probability.<n>Our BRR significantly outperforms the state-of-the-art LDP mechanisms of both continuous and distributed types.
arXiv Detail & Related papers (2025-04-29T16:39:50Z) - Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications.<n>Current methods, such as those based on differentially private gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility.<n>We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
arXiv Detail & Related papers (2025-02-13T02:05:45Z) - $(ε, δ)$-Differentially Private Partial Least Squares Regression [1.8666451604540077]
We propose an $(epsilon, delta)$-differentially private PLS (edPLS) algorithm to ensure the privacy of the data underlying the model.<n> Experimental results demonstrate that edPLS effectively renders privacy attacks, aimed at recovering unique sources of variability in the training data.
arXiv Detail & Related papers (2024-12-12T10:49:55Z) - Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
arXiv Detail & Related papers (2024-03-07T21:22:07Z) - (Local) Differential Privacy has NO Disparate Impact on Fairness [4.157415305926584]
Local Differential Privacy (LDP) has gained widespread adoption in real-world applications.
This paper empirically studies the impact of collecting multiple sensitive attributes under LDP on fairness.
arXiv Detail & Related papers (2023-04-25T14:18:12Z) - On the Privacy-Robustness-Utility Trilemma in Distributed Learning [7.778461949427662]
We present the first tight analysis of the error incurred by any algorithm ensuring robustness against a fraction of adversarial machines.
Our analysis exhibits a fundamental trade-off between privacy, robustness, and utility.
arXiv Detail & Related papers (2023-02-09T17:24:18Z) - DP2-Pub: Differentially Private High-Dimensional Data Publication with
Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
arXiv Detail & Related papers (2022-08-24T17:52:43Z) - Calibrated Feature Decomposition for Generalizable Person
Re-Identification [82.64133819313186]
Calibrated Feature Decomposition (CFD) module focuses on improving the generalization capacity for person re-identification.
A calibrated-and-standardized Batch normalization (CSBN) is designed to learn calibrated person representation.
arXiv Detail & Related papers (2021-11-27T17:12:43Z) - Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
Local exchange of estimates allows inference of data based on private data.
perturbations chosen independently at every agent, resulting in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
arXiv Detail & Related papers (2020-10-23T10:35:35Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.