A proposal to increase data utility on Global Differential Privacy data based on data use predictions
- URL: http://arxiv.org/abs/2401.06601v1
- Date: Fri, 12 Jan 2024 14:34:30 GMT
- Title: A proposal to increase data utility on Global Differential Privacy data based on data use predictions
- Authors: Henry C. Nunes, Marlon P. da Silva, Charles V. Neu, Avelino F. Zorzo
- Abstract summary: Our approach is based on predictions on how an analyst will use statistics released under DP protection.
This novel approach can potentially improve the utility of data without compromising privacy constraints.
- Score: 0.2999888908665658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents ongoing research focused on improving the utility of data protected by Global Differential Privacy (DP) in the scenario of summary statistics. Our approach is based on predictions on how an analyst will use statistics released under DP protection, so that a developer can optimise data utility on further usage of the data in the privacy budget allocation. This novel approach can potentially improve the utility of data without compromising privacy constraints. We also propose a metric that can be used by the developer to optimise the budget allocation process.
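The abstract's idea of steering the privacy budget by predicted usage can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the helper names (`allocate_budget`, `weighted_expected_error`, `laplace_release`), the use of the Laplace mechanism with sequential composition, and the metric (usage-weighted expected absolute error). Under those assumptions, the weighted error is minimised by giving each statistic a share of the budget proportional to the square root of its predicted usage weight times its sensitivity.

```python
import math
import random


def allocate_budget(total_epsilon, usage_weights, sensitivities):
    """Split total_epsilon across statistics (hypothetical helper).

    Minimises sum_i w_i * s_i / eps_i subject to sum_i eps_i = total_epsilon,
    which gives eps_i proportional to sqrt(w_i * s_i).
    """
    if any(w <= 0 for w in usage_weights) or any(s <= 0 for s in sensitivities):
        raise ValueError("weights and sensitivities must be positive")
    roots = [math.sqrt(w * s) for w, s in zip(usage_weights, sensitivities)]
    total = sum(roots)
    return [total_epsilon * r / total for r in roots]


def weighted_expected_error(epsilons, usage_weights, sensitivities):
    """Assumed metric: usage-weighted expected absolute Laplace error.

    Laplace noise with scale b has expected absolute value b = s / eps.
    """
    return sum(w * s / e for w, s, e in zip(usage_weights, sensitivities, epsilons))


def laplace_release(true_value, sensitivity, epsilon, rng):
    """Release one statistic with Laplace noise of scale sensitivity / epsilon."""
    rate = epsilon / sensitivity
    # The difference of two i.i.d. exponentials with this rate is Laplace(0, 1 / rate).
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_value + noise


if __name__ == "__main__":
    # Hypothetical predicted usage: the analyst mostly relies on the first statistic.
    weights = [0.7, 0.2, 0.1]
    sens = [1.0, 1.0, 1.0]
    total_eps = 1.0

    tuned = allocate_budget(total_eps, weights, sens)
    uniform = [total_eps / len(weights)] * len(weights)

    print("tuned allocation:", tuned)
    print("tuned error:  ", weighted_expected_error(tuned, weights, sens))
    print("uniform error:", weighted_expected_error(uniform, weights, sens))

    rng = random.Random(0)
    true_stats = [120.0, 45.0, 8.0]  # made-up summary statistics
    print([laplace_release(v, s, e, rng) for v, s, e in zip(true_stats, sens, tuned)])
```

The total budget is unchanged in both allocations, so the overall privacy guarantee under sequential composition is the same; only the distribution of noise across statistics differs. Note that under this particular metric, a split directly proportional to the weights gives the same weighted error as a uniform split, which is why the sketch uses the square-root rule instead.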
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - Privacy-preserving recommender system using the data collaboration analysis for distributed datasets [2.9061423802698565]
We establish a framework for privacy-preserving recommender systems using the data collaboration analysis of distributed datasets.
Numerical experiments with two public rating datasets demonstrate that our privacy-preserving method for rating prediction can improve the prediction accuracy for distributed datasets.
arXiv Detail & Related papers (2024-05-24T07:43:00Z) - Synergizing Privacy and Utility in Data Analytics Through Advanced Information Theorization [2.28438857884398]
We introduce three sophisticated algorithms: a Noise-Infusion Technique tailored for high-dimensional image data, a Variational Autoencoder (VAE) for robust feature extraction and an Expectation Maximization (EM) approach optimized for structured data privacy.
Our methods significantly reduce mutual information between sensitive attributes and transformed data, thereby enhancing privacy.
The research contributes to the field by providing a flexible and effective strategy for deploying privacy-preserving algorithms across various data types.
arXiv Detail & Related papers (2024-04-24T22:58:42Z) - Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
arXiv Detail & Related papers (2024-03-07T21:22:07Z) - Chained-DP: Can We Recycle Privacy Budget? [18.19895364709435]
We propose a novel Chained-DP framework enabling users to carry out data aggregation sequentially to recycle the privacy budget.
We show the mathematical nature of the sequential game, solve its Nash Equilibrium, and design an incentive mechanism with provable economic properties.
Our numerical simulation validates the effectiveness of Chained-DP, showing that it can significantly save privacy budget and lower estimation error compared to the traditional LDP mechanism.
arXiv Detail & Related papers (2023-09-12T08:07:59Z) - Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
arXiv Detail & Related papers (2023-04-17T10:59:57Z) - DP2-Pub: Differentially Private High-Dimensional Data Publication with Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
Splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
arXiv Detail & Related papers (2022-08-24T17:52:43Z) - Decentralized Stochastic Optimization with Inherent Privacy Protection [103.62463469366557]
Decentralized optimization is the basic building block of modern collaborative machine learning, distributed estimation and control, and large-scale sensing.
Because the data involved are often sensitive, privacy protection has become an increasingly pressing need in the implementation of decentralized optimization algorithms.
arXiv Detail & Related papers (2022-05-08T14:38:23Z) - Spending Privacy Budget Fairly and Wisely [7.975975942400017]
Differentially private (DP) synthetic data generation is a practical method for improving access to data.
One issue inherent to DP is that the "privacy budget" is generally "spent" evenly across features in the data set.
We develop ensemble methods that distribute the privacy budget "wisely" to maximize predictive accuracy of models trained on DP data.
arXiv Detail & Related papers (2022-04-27T13:13:56Z) - Secure Bayesian Federated Analytics for Privacy-Preserving Trend Detection [3.04585143845864]
Federated analytics can lead to better decision making for service provision, product development, and user experience.
We propose a Bayesian approach to trend detection in which the probability of a keyword being trendy, given a dataset, is computed via Bayes' Theorem.
We propose a protocol, named SAFE, for Bayesian federated analytics that offers sufficient privacy for production grade use cases.
arXiv Detail & Related papers (2021-07-28T20:52:28Z) - Enhancing User's Income Estimation with Super-App Alternative Data [59.60094442546867]
It compares the performance of these alternative data sources with the performance of industry-accepted bureau income estimators.
Ultimately, this paper shows the incentive for financial institutions to seek to incorporate alternative data into constructing their risk profiles.
arXiv Detail & Related papers (2021-04-12T21:34:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.