The Challenge of Differentially Private Screening Rules
- URL: http://arxiv.org/abs/2303.10303v1
- Date: Sat, 18 Mar 2023 01:45:34 GMT
- Title: The Challenge of Differentially Private Screening Rules
- Authors: Amol Khanna, Fred Lu, Edward Raff
- Abstract summary: We develop the first differentially private screening rule for linear and logistic regression.
In doing so, we discover difficulties in the task of making a useful private screening rule due to the amount of noise added to ensure privacy.
- Score: 32.18582226044492
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Linear $L_1$-regularized models have remained one of the simplest and most
effective tools in data analysis, especially in information retrieval problems
where n-grams over text with TF-IDF or Okapi feature values are a strong and
easy baseline. Over the past decade, screening rules have risen in popularity
as a way to reduce the runtime for producing the sparse regression weights of
$L_1$ models. However, despite the increasing need for privacy-preserving models
in information retrieval, to the best of our knowledge, no differentially
private screening rule exists. In this paper, we develop the first
differentially private screening rule for linear and logistic regression. In
doing so, we discover difficulties in the task of making a useful private
screening rule due to the amount of noise added to ensure privacy. We provide
theoretical arguments and experimental evidence that this difficulty arises
from the screening step itself and not the private optimizer. Based on our
results, we highlight that developing an effective private $L_1$ screening
method is an open problem in the differential privacy literature.
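To make the obstacle concrete, below is a minimal sketch of what a differentially private screening step could look like, assuming a Laplace-mechanism perturbation of the per-feature correlation statistic used by classic (non-private) screening tests. The function name, clipping scheme, and sensitivity constant are illustrative assumptions, not the paper's exact rule.

```python
import numpy as np

def dp_screen_features(X, r, lam, epsilon, clip=1.0, rng=None):
    """Hypothetical differentially private screening step for the Lasso.

    A classic screening test discards feature j when |x_j^T r| < lam,
    where r is the current residual. To privatize it, rows of X are
    clipped and Laplace noise is added to every correlation. The
    sensitivity bound below assumes |r_i| <= 1 after clipping, so one
    changed example moves each correlation by at most 2 * clip.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Bound each example's contribution: clip rows of X and entries of r.
    row_norms = np.maximum(np.linalg.norm(X, axis=1) / clip, 1.0)
    Xc = X / row_norms[:, None]
    rc = np.clip(r, -1.0, 1.0)
    corr = np.abs(Xc.T @ rc)              # |x_j^T r|, one value per feature
    scale = 2.0 * clip / epsilon          # Laplace scale (assumed bound)
    noisy_corr = corr + rng.laplace(scale=scale, size=corr.shape)
    return noisy_corr < lam               # True -> screen the feature out
```

Even in this toy form the paper's difficulty is visible: the noise is paid on every coordinate at every screening step, so on high-dimensional TF-IDF features the perturbation can swamp the correlations, causing the rule to screen either almost nothing or true support coordinates.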
Related papers
- Differentially Private Iterative Screening Rules for Linear Regression [45.50668718813776]
In this paper, we develop the first private screening rule for linear regression.
We find that this screening rule is too strong: the noise required for privacy causes it to screen out too many coefficients.
However, a weakened implementation of private screening reduces overscreening and improves performance.
arXiv Detail & Related papers (2025-02-25T19:06:19Z)
- Near-Optimal Private Learning in Linear Contextual Bandits [61.39697409886124]
We analyze the problem of private learning in generalized linear contextual bandits.
Our results imply that joint privacy is almost "for free" in all the settings we consider.
arXiv Detail & Related papers (2025-02-18T18:35:24Z)
- Optimized Tradeoffs for Private Prediction with Majority Ensembling [59.99331405291337]
We introduce the Data-dependent Randomized Response Majority (DaRRM) algorithm.
DaRRM is parameterized by a data-dependent noise function $\gamma$, and enables efficient utility optimization over the class of all private algorithms.
We show that DaRRM provably enjoys a privacy gain of a factor of 2 over common baselines at fixed utility (a baseline randomized-response sketch follows this entry).
arXiv Detail & Related papers (2024-11-27T00:48:48Z)
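As a rough illustration of the family DaRRM belongs to, here is plain randomized response over an ensemble's binary majority vote; the constant flip probability stands in for the data-dependent noise function $\gamma$ that DaRRM actually optimizes, so this is a baseline sketch, not the DaRRM algorithm.

```python
import numpy as np

def rr_majority(votes, epsilon, rng=None):
    """Baseline randomized-response release of a binary majority vote.

    `votes` holds the {0, 1} predictions of K teachers. Flipping the
    majority with probability 1 / (e^epsilon + 1) makes the release
    step epsilon-DP with respect to changing any single vote; DaRRM
    improves on such baselines by tuning a data-dependent flip rule.
    """
    rng = np.random.default_rng() if rng is None else rng
    majority = int(np.sum(votes) > len(votes) / 2)
    flip_prob = 1.0 / (np.exp(epsilon) + 1.0)  # constant gamma (assumption)
    return majority ^ int(rng.random() < flip_prob)
```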
- FLIPHAT: Joint Differential Privacy for High Dimensional Sparse Linear Bandits [8.908421753758475]
High dimensional sparse linear bandits serve as an efficient model for sequential decision-making problems.
Motivated by data privacy concerns, we study the joint differentially private high dimensional sparse linear bandits.
We show that FLIPHAT achieves optimal regret in terms of privacy parameters.
arXiv Detail & Related papers (2024-05-22T22:19:12Z)
- On the Complexity of Differentially Private Best-Arm Identification with Fixed Confidence [16.295693624977563]
We study the problem of Best Arm Identification with fixed confidence under $\epsilon$-global Differential Privacy.
Our lower bound suggests the existence of two privacy regimes depending on the privacy budget.
We propose AdaP-TT, an $\epsilon$-global DP variant of the Top Two algorithm.
arXiv Detail & Related papers (2023-09-05T13:07:25Z)
- Differentially Private Statistical Inference through $\beta$-Divergence One Posterior Sampling [2.8544822698499255]
We propose a posterior sampling scheme from a generalised posterior targeting the minimisation of the $\beta$-divergence between the model and the data generating process.
This provides private estimation that is generally applicable without requiring changes to the underlying model.
We show that $\beta$D-Bayes produces more precise estimates for the same privacy guarantees.
arXiv Detail & Related papers (2023-07-11T12:00:15Z)
- Generating Private Synthetic Data with Genetic Algorithms [29.756119782419955]
We study the problem of efficiently generating differentially private synthetic data that approximates the statistical properties of an underlying sensitive dataset.
We propose Private-GSD, a private genetic algorithm based on zeroth-order optimization that does not require modifying the original objective.
We show that Private-GSD outperforms state-of-the-art methods on non-differentiable queries while matching their accuracy on differentiable ones (a toy genetic-search sketch follows this entry).
arXiv Detail & Related papers (2023-06-05T21:19:37Z)
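In the spirit of Private-GSD, the toy loop below evolves a binary synthetic dataset toward pre-computed differentially private marginals; because the evolutionary search only consults the noisy statistics, it needs no gradients and touches no sensitive data. The fitness function, mutation scheme, and all names are illustrative assumptions.

```python
import numpy as np

def evolve_synthetic(noisy_marginals, n_rows, n_cols, generations=200,
                     pop_size=20, rng=None):
    """Toy zeroth-order genetic search for binary synthetic data.

    Fitness is the L1 distance between a candidate dataset's column
    means and differentially private marginals computed beforehand,
    so the objective never needs to be differentiable.
    """
    rng = np.random.default_rng() if rng is None else rng
    pop = [rng.integers(0, 2, size=(n_rows, n_cols)) for _ in range(pop_size)]
    fitness = lambda d: np.abs(d.mean(axis=0) - noisy_marginals).sum()
    for _ in range(generations):
        pop.sort(key=fitness)
        pop = pop[: pop_size // 2]        # keep the fitter half
        children = []
        for parent in pop:
            child = parent.copy()
            i, j = rng.integers(n_rows), rng.integers(n_cols)
            child[i, j] ^= 1              # mutate a single cell
            children.append(child)
        pop += children
    return min(pop, key=fitness)
```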
- On Differential Privacy and Adaptive Data Analysis with Bounded Space [76.10334958368618]
We study the space complexity of the two related fields of differential privacy and adaptive data analysis.
We show that there exists a problem P that requires exponentially more space to be solved efficiently with differential privacy.
The line of work on adaptive data analysis focuses on understanding the number of samples needed for answering a sequence of adaptive queries.
arXiv Detail & Related papers (2023-02-11T14:45:31Z)
- Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze them under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
arXiv Detail & Related papers (2022-10-24T23:50:12Z)
- Private Domain Adaptation from a Public Source [48.83724068578305]
We design differentially private discrepancy-based algorithms for adaptation from a source domain with public labeled data to a target domain with unlabeled private data.
Our solutions are based on private variants of the Frank-Wolfe and Mirror Descent algorithms (a generic private Frank-Wolfe step is sketched after this entry).
arXiv Detail & Related papers (2022-08-12T06:52:55Z)
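For context, one differentially private Frank-Wolfe step over the $L_1$ ball can be sketched as follows: the linear-minimization oracle reduces to an argmax over the ball's $2d$ vertices, which is privatized with report-noisy-max. This is a standard construction under assumed sensitivity bounds, not the paper's domain-adaptation algorithm.

```python
import numpy as np

def private_fw_step(w, grad, radius, step, epsilon, sensitivity, rng=None):
    """One differentially private Frank-Wolfe step on the L1 ball.

    The L1 ball's vertices are +/- radius * e_j, so the linear oracle
    is an argmax over 2d scores; Laplace noise on each score (report-
    noisy-max) makes the vertex choice epsilon-DP, assuming a change
    of one record moves any score by at most `sensitivity`.
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.concatenate([-grad, grad])   # <-grad, +e_j>, then <-grad, -e_j>
    scores = scores + rng.laplace(scale=2.0 * sensitivity / epsilon,
                                  size=scores.shape)
    k = int(np.argmax(scores))
    d = len(grad)
    vertex = np.zeros(d)
    vertex[k % d] = radius if k < d else -radius
    return (1.0 - step) * w + step * vertex  # convex combination update
```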
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies that groups underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
- Do Not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning [74.73901662374921]
Differentially private training degrades utility drastically when the model comprises a large number of trainable parameters.
We propose an algorithm, Gradient Embedding Perturbation (GEP), for training differentially private deep models with decent accuracy (a minimal sketch follows this entry).
arXiv Detail & Related papers (2021-02-25T04:29:58Z)
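A minimal sketch of the gradient-embedding idea, assuming an anchor basis is already available: project each per-example gradient onto a low-dimensional subspace, clip and perturb the small embedding and the residual separately, then reconstruct. The basis, clipping constants, and noise multipliers below are placeholder assumptions, not GEP's actual calibration.

```python
import numpy as np

def gep_aggregate(grads, basis, clip_e=1.0, clip_r=1.0,
                  sigma_e=0.5, sigma_r=2.0, rng=None):
    """Sketch of Gradient Embedding Perturbation (GEP) aggregation.

    `grads` is (n, d) per-example gradients; `basis` is (k, d) with
    orthonormal rows spanning an anchor subspace, k << d. Embeddings
    carry most of the signal and get small noise; residuals get heavy
    noise, reducing total perturbation versus noising all d dimensions.
    """
    rng = np.random.default_rng() if rng is None else rng
    embed = grads @ basis.T                  # (n, k) low-dim embeddings
    resid = grads - embed @ basis            # (n, d) residuals
    # Clip each part per example, then sum and add Gaussian noise.
    e_norm = np.linalg.norm(embed, axis=1, keepdims=True) + 1e-12
    r_norm = np.linalg.norm(resid, axis=1, keepdims=True) + 1e-12
    embed *= np.minimum(1.0, clip_e / e_norm)
    resid *= np.minimum(1.0, clip_r / r_norm)
    noisy_e = embed.sum(0) + rng.normal(0.0, sigma_e * clip_e, embed.shape[1])
    noisy_r = resid.sum(0) + rng.normal(0.0, sigma_r * clip_r, resid.shape[1])
    return noisy_e @ basis + noisy_r         # reconstruct the d-dim update
```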
- Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling [49.43288037509783]
We show that random shuffling amplifies differential privacy guarantees of locally randomized data.
Our result is based on a new approach that is simpler than prior work and extends to approximate differential privacy with nearly the same guarantees (a small simulation of the shuffle model follows this entry).
arXiv Detail & Related papers (2020-12-23T17:07:26Z)
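The protocol being analyzed can be simulated in a few lines: each user applies local randomized response, and the analyzer only ever sees a uniformly shuffled batch of reports. The amplification factor itself is a property of the privacy analysis, so nothing below computes it; the names are chosen for this sketch.

```python
import numpy as np

def shuffle_local_reports(bits, epsilon_local, rng=None):
    """Local randomized response followed by a uniform shuffle.

    Each user flips their bit with probability 1 / (e^eps + 1), which
    is epsilon_local-DP on its own; shuffling the reports amplifies
    the central guarantee to roughly epsilon_local / sqrt(n), up to
    logarithmic factors, per the amplification-by-shuffling analyses.
    """
    rng = np.random.default_rng() if rng is None else rng
    bits = np.asarray(bits, dtype=np.int64)
    flip = rng.random(bits.shape) < 1.0 / (np.exp(epsilon_local) + 1.0)
    reports = bits ^ flip                 # per-user randomized response
    return rng.permutation(reports)       # analyzer sees an unordered batch
```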
- Individual Privacy Accounting via a Renyi Filter [33.65665839496798]
We give a method for tighter privacy loss accounting based on a personalized privacy loss estimate for each individual.
Our filter is simpler and tighter than the known filter for $(\epsilon,\delta)$-differential privacy by Rogers et al.
arXiv Detail & Related papers (2020-08-25T17:49:48Z)