Regression with Label Differential Privacy
- URL: http://arxiv.org/abs/2212.06074v3
- Date: Wed, 4 Oct 2023 18:45:53 GMT
- Title: Regression with Label Differential Privacy
- Authors: Badih Ghazi, Pritish Kamath, Ravi Kumar, Ethan Leeman, Pasin
Manurangsi, Avinash V Varadarajan, Chiyuan Zhang
- Abstract summary: We derive a label DP randomization mechanism that is optimal under a given regression loss function.
We prove that the optimal mechanism takes the form of a "randomized response on bins".
- Score: 64.21020761920322
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the task of training regression models with the guarantee of label
differential privacy (DP). Based on a global prior distribution on label
values, which could be obtained privately, we derive a label DP randomization
mechanism that is optimal under a given regression loss function. We prove that
the optimal mechanism takes the form of a "randomized response on bins", and
propose an efficient algorithm for finding the optimal bin values. We carry out
a thorough experimental evaluation on several datasets demonstrating the
efficacy of our algorithm.
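As a concrete illustration, here is a minimal sketch of a generic randomized-response-on-bins mechanism: the label is snapped to the nearest of k bin values, kept with the standard randomized-response probability e^ε/(e^ε + k − 1), and otherwise replaced by a uniformly random other bin. The bin values below are placeholders; the paper's contribution is choosing the bins (and their number) optimally for the loss and the prior, which this sketch does not attempt.

```python
import numpy as np

def rr_on_bins(label, bins, epsilon, rng=None):
    """Generic label-DP randomizer: k-ary randomized response over bin values.

    Snaps `label` to its nearest bin, then keeps that bin with probability
    e^eps / (e^eps + k - 1) and otherwise emits a uniform other bin.
    """
    rng = np.random.default_rng() if rng is None else rng
    bins = np.asarray(bins, dtype=float)
    k = len(bins)
    idx = int(np.argmin(np.abs(bins - label)))        # nearest-bin encoding
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return bins[idx]
    return bins[rng.choice(np.delete(np.arange(k), idx))]

# Illustrative bin values only (NOT the optimal bins derived in the paper):
noisy_label = rr_on_bins(label=0.37, bins=[0.1, 0.3, 0.6, 0.9], epsilon=1.0)
```

Any two input labels induce output distributions whose probabilities differ by at most a factor of e^ε, which is exactly the ε-label-DP guarantee.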
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
- Optimal Unbiased Randomizers for Regression with Label Differential Privacy [61.63619647307816]
We propose a new family of label randomizers for training regression models under the constraint of label differential privacy (DP).
We demonstrate that these randomizers achieve state-of-the-art privacy-utility trade-offs on several datasets.
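For context, a standard way to make such a randomizer unbiased (a generic debiasing of randomized response on bins, not the optimal construction from this paper) is to rescale the released bin value so its expectation equals the binned label:

```python
import numpy as np

def unbiased_rr_on_bins(label, bins, epsilon, rng=None):
    """Debiased k-ary randomized response: E[output] equals the binned label.

    A generic construction for illustration; the paper instead derives
    optimal unbiased randomizers, which this sketch does not reproduce.
    """
    rng = np.random.default_rng() if rng is None else rng
    bins = np.asarray(bins, dtype=float)
    k = len(bins)
    idx = int(np.argmin(np.abs(bins - label)))
    p = np.exp(epsilon) / (np.exp(epsilon) + k - 1)   # keep probability
    q = (1.0 - p) / (k - 1)                           # per-other-bin probability
    if rng.random() < p:
        z = bins[idx]
    else:
        z = bins[rng.choice(np.delete(np.arange(k), idx))]
    # E[z] = p*v + q*(bins.sum() - v) for true binned value v; invert for v.
    return (z - q * bins.sum()) / (p - q)
```

Unbiasedness comes at the cost of higher variance; choosing the bins to control that variance is where the privacy-utility trade-off lives.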
arXiv Detail & Related papers (2023-12-09T19:58:34Z)
- Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL).
We first prove that the gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
- Pseudo-Labeling for Kernel Ridge Regression under Covariate Shift [1.3597551064547502]
We learn a regression function with small mean squared error over a target distribution, based on unlabeled data from that distribution and labeled data that may have a different feature distribution.
We propose to split the labeled data into two subsets, and conduct kernel ridge regression on them separately to obtain a collection of candidate models and an imputation model.
Our estimator achieves the minimax optimal error rate up to a polylogarithmic factor, and we find that using pseudo-labels for model selection does not significantly hinder performance.
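A minimal sketch of this split-and-impute recipe, using scikit-learn's KernelRidge; the 50/50 split, RBF kernel, and candidate grid are illustrative choices rather than the paper's.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split

def pseudo_label_select(X_lab, y_lab, X_target, alphas=(0.01, 0.1, 1.0)):
    """Fit candidate models on one half of the labeled data and an imputation
    model on the other half; pick the candidate closest to the imputation
    model's pseudo-labels on (unlabeled) target-distribution inputs."""
    X_a, X_b, y_a, y_b = train_test_split(
        X_lab, y_lab, test_size=0.5, random_state=0)
    candidates = [KernelRidge(kernel="rbf", alpha=a).fit(X_a, y_a)
                  for a in alphas]
    imputer = KernelRidge(kernel="rbf", alpha=1.0).fit(X_b, y_b)
    pseudo = imputer.predict(X_target)               # pseudo-labels
    errs = [np.mean((m.predict(X_target) - pseudo) ** 2) for m in candidates]
    return candidates[int(np.argmin(errs))]
```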
arXiv Detail & Related papers (2023-02-20T18:46:12Z)
- Near Optimal Private and Robust Linear Regression [47.2888113094367]
We propose a variant of the popular differentially private stochastic gradient descent (DP-SGD) algorithm with two innovations.
Under label corruption, this is the first efficient linear regression algorithm to guarantee both $(\varepsilon,\delta)$-DP and robustness.
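For reference, a minimal sketch of vanilla DP-SGD on squared loss: per-example gradient clipping followed by Gaussian noise. The paper's two innovations and its tight $(\varepsilon,\delta)$ accounting are not reproduced; the noise calibration here is schematic.

```python
import numpy as np

def dp_sgd_linreg(X, y, clip=1.0, noise_mult=1.0, lr=0.1,
                  epochs=10, batch=64, rng=None):
    """Plain DP-SGD for least-squares linear regression (schematic)."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), max(1, n // batch)):
            # Per-example gradients of 0.5 * (x @ w - y)^2.
            g = (X[idx] @ w - y[idx])[:, None] * X[idx]
            norms = np.linalg.norm(g, axis=1, keepdims=True)
            g = g / np.maximum(1.0, norms / clip)    # clip each row to norm <= clip
            noise = rng.normal(0.0, noise_mult * clip, size=d)
            w -= lr * (g.sum(axis=0) + noise) / len(idx)
    return w
```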
arXiv Detail & Related papers (2023-01-30T20:33:26Z)
- Feature Selection via the Intervened Interpolative Decomposition and its Application in Diversifying Quantitative Strategies [4.913248451323163]
We propose a probabilistic model for computing an interpolative decomposition (ID) in which each column of the observed matrix has its own priority or importance.
We evaluate the proposed models on real-world datasets, including ten Chinese A-share stocks.
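For context, an interpolative decomposition approximates a matrix by a subset of its own columns times a coefficient matrix, A ≈ A[:, cols] @ T. The sketch below computes a plain deterministic ID via column-pivoted QR; the paper's probabilistic model with per-column priorities is not reproduced.

```python
import numpy as np
from scipy.linalg import qr

def interpolative_decomposition(A, k):
    """Rank-k ID via column-pivoted QR: returns cols, T with A ~= A[:, cols] @ T."""
    Q, R, piv = qr(A, pivoting=True, mode="economic")
    cols = piv[:k]                                   # indices of selected columns
    # R[:k, :k] @ T_piv = R[:k, :] gives coefficients in pivoted order;
    # scatter them back to the original column order.
    T_piv = np.linalg.solve(R[:k, :k], R[:k, :])
    T = np.empty_like(T_piv)
    T[:, piv] = T_piv
    return cols, T

A = np.random.default_rng(0).normal(size=(50, 20))
cols, T = interpolative_decomposition(A, k=10)
approx_err = np.linalg.norm(A - A[:, cols] @ T)      # small if A is near rank 10
```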
arXiv Detail & Related papers (2022-09-29T03:36:56Z)
- Sparse Bayesian Learning via Stepwise Regression [1.2691047660244335]
We propose a coordinate ascent algorithm for sparse Bayesian learning (SBL), termed Relevance Matching Pursuit (RMP).
As its noise variance parameter goes to zero, RMP exhibits a surprising connection to Stepwise Regression.
We derive novel guarantees for Stepwise Regression algorithms, which also shed light on RMP.
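To make the Stepwise Regression side of that connection concrete, here is a minimal forward-stepwise (greedy) selector; RMP itself and its coordinate-ascent SBL updates are not reproduced here.

```python
import numpy as np

def forward_stepwise(X, y, k):
    """Greedy forward selection: add the feature most correlated with the
    current residual, then refit least squares on the active set."""
    active, residual, coef = [], y.copy(), None
    for _ in range(k):
        scores = np.abs(X.T @ residual)
        scores[active] = -np.inf                     # never re-pick a feature
        active.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
        residual = y - X[:, active] @ coef
    return active, coef
```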
arXiv Detail & Related papers (2021-06-11T00:20:27Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
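A common template for such objectives (a sketch of the general shape, not necessarily this paper's exact formulation) couples a reconstruction term with a row-sparsity penalty on a projection matrix $W$:

```latex
\min_{W} \; \| X - X W W^{\top} \|_F^2 \;+\; \lambda \, \| W \|_{2,p},
\qquad
\| W \|_{2,p} = \Big( \sum_{i=1}^{d} \| w_i \|_2^{\,p} \Big)^{1/p},
\quad 0 < p \le 1,
```

where $w_i$ is the $i$-th row of $W$; rows driven to (near) zero mark features that can be dropped.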
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.