Related papers: Regression with Label Differential Privacy

Regression with Label Differential Privacy

URL: http://arxiv.org/abs/2212.06074v3
Date: Wed, 4 Oct 2023 18:45:53 GMT
Title: Regression with Label Differential Privacy
Authors: Badih Ghazi, Pritish Kamath, Ravi Kumar, Ethan Leeman, Pasin Manurangsi, Avinash V Varadarajan, Chiyuan Zhang
Abstract summary: We derive a label DP randomization mechanism that is optimal under a given regression loss function. We prove that the optimal mechanism takes the form of a "randomized response on bins"
Score: 64.21020761920322
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the task of training regression models with the guarantee of label differential privacy (DP). Based on a global prior distribution on label values, which could be obtained privately, we derive a label DP randomization mechanism that is optimal under a given regression loss function. We prove that the optimal mechanism takes the form of a "randomized response on bins", and propose an efficient algorithm for finding the optimal bin values. We carry out a thorough experimental evaluation on several datasets demonstrating the efficacy of our algorithm.

Related papers

Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing [58.52119063742121]
Retraining a model using its own predictions together with the original, potentially noisy labels is a well-known strategy for improving the model performance.<n>This paper addresses the question of how to optimally combine the model's predictions and the provided labels.<n>Our main contribution is the derivation of the Bayes optimal aggregator function to combine the current model's predictions and the given labels.
arXiv Detail & Related papers (2025-05-21T07:16:44Z)
From Randomized Response to Randomized Index: Answering Subset Counting Queries with Local Differential Privacy [27.59934932590226]
Local Differential Privacy (LDP) is the predominant privacy model for safeguarding individual data privacy. We propose an alternative approach -- instead of perturbing values, we apply randomization to indexes of values. Inspired by the deniability of randomized indexes, we present CRIAD for answering subset counting queries on set-value data.
arXiv Detail & Related papers (2025-04-24T13:08:11Z)
Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines. We show that our method preserves privacy and derive a generalization error bound for the method.
arXiv Detail & Related papers (2024-12-06T05:31:08Z)
Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose PseudoProbability Unlearning (PPU), a novel method that enables models to forget data to adhere to privacy-preserving manner. Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
Optimal Unbiased Randomizers for Regression with Label Differential Privacy [61.63619647307816]
We propose a new family of label randomizers for training regression models under the constraint of label differential privacy (DP) We demonstrate that these randomizers achieve state-of-the-art privacy-utility trade-offs on several datasets.
arXiv Detail & Related papers (2023-12-09T19:58:34Z)
Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL) We first prove that a gradient of synthetic samples with respect to a SSL objective in naive bilevel optimization is textitbiased due to randomness originating from data augmentations or masking. We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
Pseudo-Labeling for Kernel Ridge Regression under Covariate Shift [1.3597551064547502]
We learn a regression function with small mean squared error over a target distribution, based on unlabeled data from there and labeled data that may have a different feature distribution. We propose to split the labeled data into two subsets, and conduct kernel ridge regression on them separately to obtain a collection of candidate models and an imputation model. Our estimator achieves the minimax optimal error rate up to a polylogarithmic factor, and we find that using pseudo-labels for model selection does not significantly hinder performance.
arXiv Detail & Related papers (2023-02-20T18:46:12Z)
Near Optimal Private and Robust Linear Regression [47.2888113094367]
We propose a variant of the popular differentially private gradient descent (DP-SGD) algorithm with two innovations. Under label-corruption, this is the first efficient linear regression algorithm to guarantee both $(varepsilon,delta)$-DP and robustness.
arXiv Detail & Related papers (2023-01-30T20:33:26Z)
Feature Selection via the Intervened Interpolative Decomposition and its Application in Diversifying Quantitative Strategies [4.913248451323163]
We propose a probabilistic model for computing an interpolative decomposition (ID) in which each column of the observed matrix has its own priority or importance. We evaluate the proposed models on real-world datasets, including ten Chinese A-share stocks.
arXiv Detail & Related papers (2022-09-29T03:36:56Z)
Sparse Bayesian Learning via Stepwise Regression [1.2691047660244335]
We propose a coordinate ascent algorithm for SBL termed Relevance Matching Pursuit (RMP) As its noise variance parameter goes to zero, RMP exhibits a surprising connection to Stepwise Regression. We derive novel guarantees for Stepwise Regression algorithms, which also shed light on RMP.
arXiv Detail & Related papers (2021-06-11T00:20:27Z)
Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_2,p$-norm regularization. We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.