Better Private Linear Regression Through Better Private Feature
Selection
- URL: http://arxiv.org/abs/2306.00920v1
- Date: Thu, 1 Jun 2023 17:21:10 GMT
- Title: Better Private Linear Regression Through Better Private Feature
Selection
- Authors: Travis Dick, Jennifer Gillenwater, Matthew Joseph
- Abstract summary: We introduce a differentially private feature selection method based on Kendall rank correlation.
We prove a utility guarantee for the setting where features are normally distributed.
We find that adding this private feature selection step before regression significantly broadens the applicability of ``plug-and-play'' private linear regression algorithms.
- Score: 18.884088349732973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing work on differentially private linear regression typically assumes
that end users can precisely set data bounds or algorithmic hyperparameters.
End users often struggle to meet these requirements without directly examining
the data (and violating privacy). Recent work has attempted to develop
solutions that shift these burdens from users to algorithms, but they struggle
to provide utility as the feature dimension grows. This work extends these
algorithms to higher-dimensional problems by introducing a differentially
private feature selection method based on Kendall rank correlation. We prove a
utility guarantee for the setting where features are normally distributed and
conduct experiments across 25 datasets. We find that adding this private
feature selection step before regression significantly broadens the
applicability of ``plug-and-play'' private linear regression algorithms at
little additional cost to privacy, computation, or decision-making by the end
user.
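As a rough illustration of the kind of mechanism the abstract describes (not the paper's actual algorithm), a Kendall-correlation-based private feature selector can be sketched as follows. The sensitivity constant and the way the privacy budget is split across features are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def kendall_tau(x, y):
    # Normalized pairwise concordance over all n*(n-1) ordered pairs;
    # equals 1 for a perfectly monotone relationship.
    n = len(x)
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    return np.sum(sx * sy) / (n * (n - 1))

def private_top_k_features(X, y, k, epsilon, rng):
    # Score each feature by |Kendall tau| with the response, add Laplace
    # noise, and keep the k largest noisy scores.  Swapping one record
    # moves each tau by O(1/n), so the noise can shrink as n grows; the
    # constant below and the per-feature budget split are assumptions.
    n, d = X.shape
    taus = np.array([kendall_tau(X[:, j], y) for j in range(d)])
    sensitivity = 4.0 / (n - 1)  # illustrative per-score bound
    noisy = np.abs(taus) + rng.laplace(scale=d * sensitivity / epsilon, size=d)
    return np.argsort(-noisy)[:k]
```

The appeal of rank correlation here is that its bounded, O(1/n) per-record influence makes it easy to privatize without asking the user for data bounds, which matches the paper's goal of reducing end-user burden.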
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - Feature Selection from Differentially Private Correlations [35.187113265093615]
High-dimensional regression can leak information about individual datapoints in a dataset.
We employ a correlations-based order statistic to choose important features from a dataset and privatize them.
We find that our method significantly outperforms the established baseline for private feature selection on many datasets.
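One standard building block for selecting a feature via a privatized order statistic of correlations, as this summary describes, is the exponential mechanism. The sketch below uses Pearson scores and a hypothetical sensitivity bound purely for illustration; it is not the paper's exact method.

```python
import numpy as np

def exponential_mechanism_select(scores, epsilon, sensitivity, rng):
    # Sample one index with probability proportional to
    # exp(epsilon * score / (2 * sensitivity)); higher-scoring
    # candidates are exponentially more likely to be chosen.
    logits = epsilon * np.asarray(scores, dtype=float) / (2 * sensitivity)
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(scores), p=probs)

def correlation_scores(X, y):
    # |Pearson correlation| of each column with the response; assumes
    # the data were pre-clipped so each score has bounded sensitivity.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.abs(Xc.T @ yc / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)))
```

Repeating the selection k times (with the budget split across rounds) yields a private top-k feature set.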
arXiv Detail & Related papers (2024-08-20T13:54:07Z) - Differentially Private Domain Adaptation with Theoretical Guarantees [46.37771025567305]
In many applications, the labeled data at one's disposal is subject to privacy constraints and is relatively limited.
This is the modern problem of supervised domain adaptation from a public source to a private target domain.
We make use of a general learner to benefit from favorable theoretical learning guarantees.
arXiv Detail & Related papers (2023-06-15T04:03:06Z) - Generating Private Synthetic Data with Genetic Algorithms [29.756119782419955]
We study the problem of efficiently generating differentially private synthetic data that approximates the statistical properties of an underlying sensitive dataset.
We propose Private-GSD, a private genetic algorithm based on zeroth-order optimization, which does not require modifying the original objective.
We show that Private-GSD outperforms the state-of-the-art methods on non-differentiable queries while matching accuracy in approximating differentiable ones.
arXiv Detail & Related papers (2023-06-05T21:19:37Z) - Improved Differentially Private Regression via Gradient Boosting [38.09948758699131]
We give a new algorithm for private linear regression based on gradient boosting.
In addition to a comprehensive set of experiments, we give theoretical insights to explain this behavior.
arXiv Detail & Related papers (2023-03-06T19:20:52Z) - On Differential Privacy for Federated Learning in Wireless Systems with
Multiple Base Stations [90.53293906751747]
We consider a federated learning model in a wireless system with multiple base stations and inter-cell interference.
We show the convergence behavior of the learning process by deriving an upper bound on its optimality gap.
Our proposed scheduler improves the average accuracy of the predictions compared with a random scheduler.
arXiv Detail & Related papers (2022-08-25T03:37:11Z) - Private Domain Adaptation from a Public Source [48.83724068578305]
We design differentially private discrepancy-based algorithms for adaptation from a source domain with public labeled data to a target domain with unlabeled private data.
Our solutions are based on private variants of Frank-Wolfe and Mirror-Descent algorithms.
arXiv Detail & Related papers (2022-08-12T06:52:55Z) - Optimal Algorithms for Mean Estimation under Local Differential Privacy [55.32262879188817]
We show that PrivUnit achieves the optimal variance among a large family of locally private randomizers.
We also develop a new variant of PrivUnit based on the Gaussian distribution which is more amenable to mathematical analysis and enjoys the same optimality guarantees.
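For context on what a locally private mean estimator does (this is a simple Laplace baseline, not PrivUnit or its Gaussian variant), each user can clip and noise their own value before reporting, so the server never sees raw data:

```python
import numpy as np

def local_laplace_mean(values, epsilon, clip, rng):
    # Local DP: each user clips their value to [-clip, clip] and adds
    # Laplace noise before sending.  The per-user L1 sensitivity of
    # the clipped value is 2 * clip, hence the noise scale below.
    clipped = np.clip(values, -clip, clip)
    reports = clipped + rng.laplace(scale=2 * clip / epsilon, size=len(values))
    return reports.mean()
```

Randomizers like PrivUnit improve on this baseline's variance, which is what the optimality result in the summary is about.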
arXiv Detail & Related papers (2022-05-05T06:43:46Z) - Differentially Private Query Release Through Adaptive Projection [19.449593001368193]
We propose, implement, and evaluate a new algorithm for releasing answers to very large numbers of statistical queries like $k$-way marginals.
Our algorithm makes adaptive use of a continuous relaxation of the Projection Mechanism, which answers queries on the private dataset using simple perturbation.
We find that our method outperforms existing algorithms in many cases, especially when the privacy budget is small or the query class is large.
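To make the "simple perturbation" baseline in this summary concrete: answering a single $k$-way marginal with Laplace noise looks roughly like the sketch below. The sensitivity constant assumes add/remove-one-record adjacency; the adaptive projection step of the paper's algorithm is not shown.

```python
import numpy as np

def noisy_marginal(data, cols, epsilon, rng):
    # A k-way marginal is the contingency table of counts over the
    # chosen columns; adding or removing one record changes exactly
    # one cell by 1, so Laplace(1/epsilon) noise per cell suffices.
    cells, counts = np.unique(data[:, cols], axis=0, return_counts=True)
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=len(counts))
    return cells, noisy
```

The difficulty the paper addresses is scaling this idea to very large query classes, where naively splitting the budget across all marginals would drown the counts in noise.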
arXiv Detail & Related papers (2021-03-11T12:43:18Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.