Better Private Linear Regression Through Better Private Feature
Selection
- URL: http://arxiv.org/abs/2306.00920v1
- Date: Thu, 1 Jun 2023 17:21:10 GMT
- Title: Better Private Linear Regression Through Better Private Feature
Selection
- Authors: Travis Dick, Jennifer Gillenwater, Matthew Joseph
- Abstract summary: We introduce a differentially private feature selection method based on Kendall rank correlation.
We prove a utility guarantee for the setting where features are normally distributed.
We find that adding this private feature selection step before regression significantly broadens the applicability of ``plug-and-play'' private linear regression algorithms.
- Score: 18.884088349732973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing work on differentially private linear regression typically assumes
that end users can precisely set data bounds or algorithmic hyperparameters.
End users often struggle to meet these requirements without directly examining
the data (and violating privacy). Recent work has attempted to develop
solutions that shift these burdens from users to algorithms, but they struggle
to provide utility as the feature dimension grows. This work extends these
algorithms to higher-dimensional problems by introducing a differentially
private feature selection method based on Kendall rank correlation. We prove a
utility guarantee for the setting where features are normally distributed and
conduct experiments across 25 datasets. We find that adding this private
feature selection step before regression significantly broadens the
applicability of ``plug-and-play'' private linear regression algorithms at
little additional cost to privacy, computation, or decision-making by the end
user.
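As a rough illustration of the kind of mechanism the abstract describes (not the paper's actual algorithm), a Kendall-correlation-based private feature selector can be sketched as follows. The sensitivity constant and the way the privacy budget is split across features are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def kendall_tau(x, y):
    # Normalized pairwise concordance over all n*(n-1) ordered pairs;
    # equals 1 for a perfectly monotone relationship.
    n = len(x)
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    return np.sum(sx * sy) / (n * (n - 1))

def private_top_k_features(X, y, k, epsilon, rng):
    # Score each feature by |Kendall tau| with the response, add Laplace
    # noise, and keep the k largest noisy scores.  Swapping one record
    # moves each tau by O(1/n), so the noise can shrink as n grows; the
    # constant below and the per-feature budget split are assumptions.
    n, d = X.shape
    taus = np.array([kendall_tau(X[:, j], y) for j in range(d)])
    sensitivity = 4.0 / (n - 1)  # illustrative per-score bound
    noisy = np.abs(taus) + rng.laplace(scale=d * sensitivity / epsilon, size=d)
    return np.argsort(-noisy)[:k]
```

The appeal of rank correlation here is that its bounded, O(1/n) per-record influence makes it easy to privatize without asking the user for data bounds, which matches the paper's goal of reducing end-user burden.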
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - Feature Selection from Differentially Private Correlations [35.187113265093615]
High-dimensional regression can leak information about individual datapoints in a dataset.
We employ a correlations-based order statistic to choose important features from a dataset and privatize them.
We find that our method significantly outperforms the established baseline for private feature selection on many datasets.
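One standard building block for selecting a feature via a privatized order statistic of correlations, as this summary describes, is the exponential mechanism. The sketch below uses Pearson scores and a hypothetical sensitivity bound purely for illustration; it is not the paper's exact method.

```python
import numpy as np

def exponential_mechanism_select(scores, epsilon, sensitivity, rng):
    # Sample one index with probability proportional to
    # exp(epsilon * score / (2 * sensitivity)); higher-scoring
    # candidates are exponentially more likely to be chosen.
    logits = epsilon * np.asarray(scores, dtype=float) / (2 * sensitivity)
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(scores), p=probs)

def correlation_scores(X, y):
    # |Pearson correlation| of each column with the response; assumes
    # the data were pre-clipped so each score has bounded sensitivity.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.abs(Xc.T @ yc / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)))
```

Repeating the selection k times (with the budget split across rounds) yields a private top-k feature set.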
arXiv Detail & Related papers (2024-08-20T13:54:07Z) - Differentially Private Domain Adaptation with Theoretical Guarantees [46.37771025567305]
In many applications, the labeled data at one's disposal is subject to privacy constraints and is relatively limited.
This is the modern problem of supervised domain adaptation from a public source to a private target domain.
We make use of a general learner to benefit from favorable theoretical learning guarantees.
arXiv Detail & Related papers (2023-06-15T04:03:06Z) - Generating Private Synthetic Data with Genetic Algorithms [29.756119782419955]
We study the problem of efficiently generating differentially private synthetic data that approximates the statistical properties of an underlying sensitive dataset.
We propose Private-GSD, a private genetic algorithm based on zeroth-order optimization, which does not require modifying the original objective.
We show that Private-GSD outperforms the state-of-the-art methods on non-differentiable queries while matching accuracy in approximating differentiable ones.
arXiv Detail & Related papers (2023-06-05T21:19:37Z) - Improved Differentially Private Regression via Gradient Boosting [38.09948758699131]
We give a new algorithm for private linear regression based on gradient boosting.
In addition to a comprehensive set of experiments, we give theoretical insights to explain this behavior.
arXiv Detail & Related papers (2023-03-06T19:20:52Z) - On Differential Privacy for Federated Learning in Wireless Systems with
Multiple Base Stations [90.53293906751747]
We consider a federated learning model in a wireless system with multiple base stations and inter-cell interference.
We show the convergence behavior of the learning process by deriving an upper bound on its optimality gap.
Our proposed scheduler improves the average accuracy of the predictions compared with a random scheduler.
arXiv Detail & Related papers (2022-08-25T03:37:11Z) - Private Domain Adaptation from a Public Source [48.83724068578305]
We design differentially private discrepancy-based algorithms for adaptation from a source domain with public labeled data to a target domain with unlabeled private data.
Our solutions are based on private variants of Frank-Wolfe and Mirror-Descent algorithms.
arXiv Detail & Related papers (2022-08-12T06:52:55Z) - Optimal Algorithms for Mean Estimation under Local Differential Privacy [55.32262879188817]
We show that PrivUnit achieves the optimal variance among a large family of locally private randomizers.
We also develop a new variant of PrivUnit based on the Gaussian distribution which is more amenable to mathematical analysis and enjoys the same optimality guarantees.
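For context on what a locally private mean estimator does (this is a simple Laplace baseline, not PrivUnit or its Gaussian variant), each user can clip and noise their own value before reporting, so the server never sees raw data:

```python
import numpy as np

def local_laplace_mean(values, epsilon, clip, rng):
    # Local DP: each user clips their value to [-clip, clip] and adds
    # Laplace noise before sending.  The per-user L1 sensitivity of
    # the clipped value is 2 * clip, hence the noise scale below.
    clipped = np.clip(values, -clip, clip)
    reports = clipped + rng.laplace(scale=2 * clip / epsilon, size=len(values))
    return reports.mean()
```

Randomizers like PrivUnit improve on this baseline's variance, which is what the optimality result in the summary is about.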
arXiv Detail & Related papers (2022-05-05T06:43:46Z) - Differentially Private Query Release Through Adaptive Projection [19.449593001368193]
We propose, implement, and evaluate a new algorithm for releasing answers to very large numbers of statistical queries like $k$-way marginals.
Our algorithm makes adaptive use of a continuous relaxation of the Projection Mechanism, which answers queries on the private dataset using simple perturbation.
We find that our method outperforms existing algorithms in many cases, especially when the privacy budget is small or the query class is large.
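To make the "simple perturbation" baseline in this summary concrete: answering a single $k$-way marginal with Laplace noise looks roughly like the sketch below. The sensitivity constant assumes add/remove-one-record adjacency; the adaptive projection step of the paper's algorithm is not shown.

```python
import numpy as np

def noisy_marginal(data, cols, epsilon, rng):
    # A k-way marginal is the contingency table of counts over the
    # chosen columns; adding or removing one record changes exactly
    # one cell by 1, so Laplace(1/epsilon) noise per cell suffices.
    cells, counts = np.unique(data[:, cols], axis=0, return_counts=True)
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=len(counts))
    return cells, noisy
```

The difficulty the paper addresses is scaling this idea to very large query classes, where naively splitting the budget across all marginals would drown the counts in noise.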
arXiv Detail & Related papers (2021-03-11T12:43:18Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.