Evaluating the Impact of Local Differential Privacy on Utility Loss via
Influence Functions
- URL: http://arxiv.org/abs/2309.08678v1
- Date: Fri, 15 Sep 2023 18:08:24 GMT
- Title: Evaluating the Impact of Local Differential Privacy on Utility Loss via Influence Functions
- Authors: Alycia N. Carey, Minh-Hao Van, and Xintao Wu
- Abstract summary: We demonstrate the ability of influence functions to offer insight into how a specific privacy parameter value will affect a model's test loss.
Our proposed method allows a data curator to select the privacy parameter best aligned with their allowed privacy-utility trade-off.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How to properly set the privacy parameter in differential privacy (DP) has
been an open question in DP research since DP was first proposed in 2006. In
this work, we demonstrate the ability of influence functions to offer insight
into how a specific privacy parameter value will affect a model's test loss in
the randomized response-based local DP setting. Our proposed method allows a
data curator to select the privacy parameter best aligned with their allowed
privacy-utility trade-off without requiring heavy computation such as extensive
model retraining and data privatization. We consider multiple common
randomization scenarios, such as performing randomized response over the
features, and/or over the labels, as well as the more complex case of applying
a class-dependent label noise correction method to offset the noise incurred by
randomization. Further, we provide a detailed discussion of the computational
complexity of our proposed approach, including an empirical analysis. Through
empirical evaluations we show that for both binary and multi-class settings,
influence functions are able to approximate the true change in test loss that
occurs when randomized response is applied over features and/or labels with
small mean absolute error, especially in cases where noise correction methods
are applied.
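
To make the setting concrete, the following is a minimal sketch of k-ary (generalized) randomized response applied to class labels, the kind of local DP mechanism the abstract refers to. It is an illustration under stated assumptions, not the authors' implementation: the function names `keep_probability`, `randomized_response`, and `privatize_labels` are hypothetical, and the paper's exact mechanism and parameterization may differ. The privacy parameter is `epsilon`; smaller values flip more labels.

```python
import math
import random


def keep_probability(epsilon: float, k: int) -> float:
    """Probability that k-ary randomized response reports the true category.

    Assumes the standard generalized randomized response form
    p = e^eps / (e^eps + k - 1); every other category is reported
    with probability 1 / (e^eps + k - 1).
    """
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def randomized_response(value: int, k: int, epsilon: float, rng: random.Random) -> int:
    """Report `value` with probability p, otherwise a uniformly chosen other category."""
    if rng.random() < keep_probability(epsilon, k):
        return value
    # Draw from the k - 1 categories that are not `value`.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1


def privatize_labels(labels, k: int, epsilon: float, seed: int = 0):
    """Apply randomized response independently to each label (hypothetical helper)."""
    rng = random.Random(seed)
    return [randomized_response(y, k, epsilon, rng) for y in labels]


if __name__ == "__main__":
    labels = [0, 1, 2, 1, 0, 2, 2, 1]
    for eps in (0.1, 1.0, 5.0):
        noisy = privatize_labels(labels, k=3, epsilon=eps)
        flipped = sum(a != b for a, b in zip(labels, noisy))
        print(f"epsilon={eps}: keep prob={keep_probability(eps, 3):.3f}, flipped={flipped}")
```

Measuring the utility cost of each candidate `epsilon` directly would mean privatizing the data and retraining a model for every value. That retraining is the expense the abstract says the influence-function estimate is meant to avoid.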
Related papers
- Federated Nonparametric Hypothesis Testing with Differential Privacy Constraints: Optimal Rates and Adaptive Tests [5.3595271893779906]
Federated learning has attracted significant recent attention due to its applicability across a wide range of settings where data is collected and analyzed across disparate locations.
We study federated nonparametric goodness-of-fit testing in the white-noise-with-drift model under distributed differential privacy (DP) constraints.
arXiv Detail & Related papers (2024-06-10T19:25:19Z)
- Noise Variance Optimization in Differential Privacy: A Game-Theoretic Approach Through Per-Instance Differential Privacy [7.264378254137811]
Differential privacy (DP) can measure privacy loss by observing the changes in the distribution caused by the inclusion of individuals in the target dataset.
DP has been prominent in safeguarding datasets in machine learning in industry giants like Apple and Google.
We propose per-instance DP (pDP) as a constraint, measuring privacy loss for each data instance and optimizing noise tailored to individual instances.
arXiv Detail & Related papers (2024-04-24T06:51:16Z)
- Mitigating LLM Hallucinations via Conformal Abstention [70.83870602967625]
We develop a principled procedure for determining when a large language model should abstain from responding in a general domain.
We leverage conformal prediction techniques to develop an abstention procedure that benefits from rigorous theoretical guarantees on the hallucination rate (error rate).
Experimentally, our resulting conformal abstention method reliably bounds the hallucination rate on various closed-book, open-domain generative question answering datasets.
arXiv Detail & Related papers (2024-04-04T11:32:03Z)
- Causal Inference with Differentially Private (Clustered) Outcomes [16.166525280886578]
Estimating causal effects from randomized experiments is only feasible if participants agree to reveal their responses.
We suggest a new differential privacy mechanism, Cluster-DP, which leverages any given cluster structure.
We show that, depending on an intuitive measure of cluster quality, we can improve the variance loss while maintaining our privacy guarantees.
arXiv Detail & Related papers (2023-08-02T05:51:57Z)
- Enabling Trade-offs in Privacy and Utility in Genomic Data Beacons and Summary Statistics [26.99521354120141]
We introduce optimization-based approaches to explicitly trade off the utility of summary data or Beacon responses and privacy.
In the first, an attacker applies a likelihood-ratio test to make membership-inference claims.
In the second, an attacker uses a threshold that accounts for the effect of the data release on the separation in scores between individuals.
arXiv Detail & Related papers (2023-01-11T19:16:13Z)
- Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
arXiv Detail & Related papers (2022-03-09T01:55:59Z)
- Partial Identification with Noisy Covariates: A Robust Optimization Approach [94.10051154390237]
Causal inference from observational datasets often relies on measuring and adjusting for covariates.
We show that this robust optimization approach can extend a wide range of causal adjustment methods to perform partial identification.
Across synthetic and real datasets, we find that this approach provides ATE bounds with a higher coverage probability than existing methods.
arXiv Detail & Related papers (2022-02-22T04:24:26Z)
- Partial sensitivity analysis in differential privacy [58.730520380312676]
We investigate the impact of each input feature on the individual's privacy loss.
We experimentally evaluate our approach on queries over private databases.
We also explore our findings in the context of neural network training on synthetic data.
arXiv Detail & Related papers (2021-09-22T08:29:16Z)
- Bias and Variance of Post-processing in Differential Privacy [53.29035917495491]
Post-processing immunity is a fundamental property of differential privacy.
It is often argued that post-processing may introduce bias and increase variance.
This paper takes a first step towards understanding the properties of post-processing.
arXiv Detail & Related papers (2020-10-09T02:12:54Z)
- RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial networks (GANs) have attracted increasing attention recently owing to their impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied to sensitive or private training examples, such as medical or financial records, they may still divulge individuals' sensitive and private information.
We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.