A Differentially Private Weighted Empirical Risk Minimization Procedure
and its Application to Outcome Weighted Learning
- URL: http://arxiv.org/abs/2307.13127v1
- Date: Mon, 24 Jul 2023 21:03:25 GMT
- Title: A Differentially Private Weighted Empirical Risk Minimization Procedure
and its Application to Outcome Weighted Learning
- Authors: Spencer Giddens, Yiwang Zhou, Kevin R. Krull, Tara M. Brinkman, Peter
X.K. Song, Fang Liu
- Abstract summary: We propose the first differentially private wERM algorithm, backed by a rigorous theoretical proof of its DP guarantees.
We evaluate the performance of the DP-wERM application to outcome weighted learning (OWL) in a simulation study and in a real clinical trial of melatonin for sleep health.
- Score: 5.025486694392673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is commonplace to use data containing personal information to build
predictive models in the framework of empirical risk minimization (ERM). While
these models can be highly accurate in prediction, results obtained from these
models with the use of sensitive data may be susceptible to privacy attacks.
Differential privacy (DP) is an appealing framework for addressing such data
privacy issues by providing mathematically provable bounds on the privacy loss
incurred when releasing information from sensitive data. Previous work has
primarily concentrated on applying DP to unweighted ERM. We consider an
important generalization to weighted ERM (wERM). In wERM, each individual's
contribution to the objective function can be assigned varying weights. In this
context, we propose the first differentially private wERM algorithm, backed by
a rigorous theoretical proof of its DP guarantees under mild regularity
conditions. Extending the existing DP-ERM procedures to wERM paves a path to
deriving privacy-preserving learning methods for individualized treatment
rules, including the popular outcome weighted learning (OWL). We evaluate the
performance of the DP-wERM application to OWL in a simulation study and in a
real clinical trial of melatonin for sleep health. All empirical results
demonstrate the viability of training OWL models via wERM with DP guarantees
while maintaining sufficiently useful model performance. Therefore, we
recommend practitioners consider implementing the proposed privacy-preserving
OWL procedure in real-world scenarios involving sensitive data.
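For concreteness, below is a minimal, hypothetical sketch (in Python) of the kind of procedure the abstract describes: a weighted, L2-regularized logistic regression fit by gradient descent, followed by output perturbation of the fitted coefficients. The function name, the solver, and especially the sensitivity constant used to calibrate the noise are illustrative assumptions, not the authors' algorithm or the bound proven in the paper.

```python
import numpy as np

def dp_weighted_erm_logreg(X, y, w, epsilon, lam=1.0, n_iter=2000, lr=0.1,
                           sensitivity=None, rng=None):
    """Hypothetical sketch of DP weighted ERM via output perturbation.

    Minimizes the weighted, L2-regularized logistic loss
        (1/n) * sum_i w_i * log(1 + exp(-y_i * x_i^T theta)) + (lam/2) * ||theta||^2
    by gradient descent, then perturbs the minimizer. The sensitivity below
    is a placeholder assumption, not the bound derived in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iter):
        margins = y * (X @ theta)                      # labels y in {-1, +1}
        grad = -(w * y / (1.0 + np.exp(margins))) @ X / n + lam * theta
        theta -= lr * grad
    if sensitivity is None:
        # Placeholder: the classic 2/(n*lam) ERM sensitivity scaled by the max weight.
        sensitivity = 2.0 * np.max(w) / (n * lam)
    # Output perturbation: noise with density proportional to
    # exp(-epsilon * ||b|| / sensitivity), i.e. a Gamma-distributed norm
    # in a uniformly random direction.
    norm = rng.gamma(shape=d, scale=sensitivity / epsilon)
    direction = rng.standard_normal(d)
    direction /= np.linalg.norm(direction)
    return theta + norm * direction
```

In the standard OWL formulation, y_i would be the randomized treatment assignment coded as +/-1 and w_i the observed outcome divided by the treatment-assignment probability, so that the weighted classifier estimates an individualized treatment rule; those weight choices come from OWL itself, not from this sketch.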
Related papers
- Differential Privacy in Machine Learning: From Symbolic AI to LLMs [49.1574468325115]
Differential privacy provides a formal framework to mitigate privacy risks. It ensures that the inclusion or exclusion of any single data point does not significantly alter the output of an algorithm.
arXiv Detail & Related papers (2025-06-13T11:30:35Z) - SafeSynthDP: Leveraging Large Language Models for Privacy-Preserving Synthetic Data Generation Using Differential Privacy [0.0]
We investigate the capability of Large Language Models (LLMs) to generate synthetic datasets with Differential Privacy (DP) mechanisms.
Our approach incorporates DP-based noise injection methods, including Laplace and Gaussian distributions, into the data generation process.
We then evaluate the utility of these DP-enhanced synthetic datasets by comparing the performance of ML models trained on them against models trained on the original data.
arXiv Detail & Related papers (2024-12-30T01:10:10Z) - Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines.
We show that our method preserves privacy and derive a generalization error bound for the method.
arXiv Detail & Related papers (2024-12-06T05:31:08Z) - Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models [2.3281513013731145]
Fine-tuning large language models (LLMs) for specific tasks introduces privacy risks, as models may inadvertently memorise and leak sensitive training data.
Differential Privacy (DP) offers a solution to mitigate these risks, but introduces significant computational and performance trade-offs.
We show that PEFT methods achieve comparable performance to standard fine-tuning while requiring fewer parameters and significantly reducing privacy leakage.
arXiv Detail & Related papers (2024-11-24T13:17:36Z) - Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - LLM-based Privacy Data Augmentation Guided by Knowledge Distillation
with a Distribution Tutor for Medical Text Classification [67.92145284679623]
We propose a DP-based tutor that models the noised private distribution and controls the generation of samples at a low privacy cost.
We theoretically analyze our model's privacy protection and empirically verify our model.
arXiv Detail & Related papers (2024-02-26T11:52:55Z) - Differentially Private Distributed Inference [2.4401219403555814]
Healthcare centers collaborating on clinical trials must balance knowledge sharing with safeguarding sensitive patient data.
We address this challenge by using differential privacy (DP) to control information leakage.
Agents update belief statistics via log-linear rules, and DP noise provides plausible deniability and rigorous performance guarantees.
arXiv Detail & Related papers (2024-02-13T01:38:01Z) - PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z) - Differentially Private Estimation of Heterogeneous Causal Effects [9.355532300027727]
We introduce a general meta-algorithm for estimating conditional average treatment effects (CATE) with differential privacy guarantees.
Our meta-algorithm can work with simple, single-stage CATE estimators such as S-learner and more complex multi-stage estimators such as DR and R-learner.
arXiv Detail & Related papers (2022-02-22T17:21:18Z) - DP-UTIL: Comprehensive Utility Analysis of Differential Privacy in
Machine Learning [3.822543555265593]
Differential Privacy (DP) has emerged as a rigorous formalism to reason about privacy leakage.
In machine learning (ML), DP has been employed to limit the disclosure of training examples.
For deep neural networks, gradient perturbation results in lowest privacy leakage.
arXiv Detail & Related papers (2021-12-24T08:40:28Z) - Differentially private federated deep learning for multi-site medical
image segmentation [56.30543374146002]
Collaborative machine learning techniques such as federated learning (FL) enable the training of models on effectively larger datasets without data transfer.
Recent initiatives have demonstrated that segmentation models trained with FL can achieve performance similar to locally trained models.
However, FL is not a fully privacy-preserving technique and privacy-centred attacks can disclose confidential patient data.
arXiv Detail & Related papers (2021-07-06T12:57:32Z) - PEARL: Data Synthesis via Private Embeddings and Adversarial
Reconstruction Learning [1.8692254863855962]
We propose a new framework for data synthesis using deep generative models in a differentially private manner.
Within our framework, sensitive data are sanitized with rigorous privacy guarantees in a one-shot fashion.
Our proposal has theoretical guarantees of performance, and empirical evaluations on multiple datasets show that our approach outperforms other methods at reasonable levels of privacy.
arXiv Detail & Related papers (2021-06-08T18:00:01Z) - Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z) - Privacy-preserving medical image analysis [53.4844489668116]
We present PriMIA, a software framework designed for privacy-preserving machine learning (PPML) in medical imaging.
We show significantly better classification performance of a securely aggregated federated learning model compared to human experts on unseen datasets.
We empirically evaluate the framework's security against a gradient-based model inversion attack.
arXiv Detail & Related papers (2020-12-10T13:56:00Z) - Anonymizing Data for Privacy-Preserving Federated Learning [3.3673553810697827]
We propose the first syntactic approach for offering privacy in the context of federated learning.
Our approach aims to maximize utility or model performance, while supporting a defensible level of privacy.
We perform a comprehensive empirical evaluation on two important problems in the healthcare domain, using real-world electronic health data of 1 million patients.
arXiv Detail & Related papers (2020-02-21T02:30:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.