Related papers: $(ε, δ)$-Differentially Private Partial Least Squares Regression

URL: http://arxiv.org/abs/2412.09164v1
Date: Thu, 12 Dec 2024 10:49:55 GMT
Title: $(ε, δ)$-Differentially Private Partial Least Squares Regression
Authors: Ramin Nikzad-Langerodi, Mohit Kumar, Du Nguyen Duy, Mahtab Alghasi,
Abstract summary: We propose an $(\epsilon, \delta)$-differentially private PLS (edPLS) algorithm to ensure the privacy of the data underlying the model. Experimental results demonstrate that edPLS effectively renders privacy attacks, aimed at recovering unique sources of variability in the training data, ineffective.
Score: 1.8666451604540077
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: As data-privacy requirements are becoming increasingly stringent and statistical models based on sensitive data are being deployed and used more routinely, protecting data-privacy becomes pivotal. Partial Least Squares (PLS) regression is the premier tool for building such models in analytical chemistry, yet it does not inherently provide privacy guarantees, leaving sensitive (training) data vulnerable to privacy attacks. To address this gap, we propose an $(\epsilon, \delta)$-differentially private PLS (edPLS) algorithm, which integrates well-studied and theoretically motivated Gaussian noise-adding mechanisms into the PLS algorithm to ensure the privacy of the data underlying the model. Our approach involves adding carefully calibrated Gaussian noise to the outputs of four key functions in the PLS algorithm: the weights, scores, $X$-loadings, and $Y$-loadings. The noise variance is determined based on the global sensitivity of each function, ensuring that the privacy loss is controlled according to the $(\epsilon, \delta)$-differential privacy framework. Specifically, we derive the sensitivity bounds for each function and use these bounds to calibrate the noise added to the model components. Experimental results demonstrate that edPLS effectively renders privacy attacks, aimed at recovering unique sources of variability in the training data, ineffective. Application of edPLS to the NIR corn benchmark dataset shows that the root mean squared error of prediction (RMSEP) remains competitive even at strong privacy levels (i.e., $\epsilon=1$), given proper pre-processing of the corresponding spectra. These findings highlight the practical utility of edPLS in creating privacy-preserving multivariate calibrations and for the analysis of their privacy-utility trade-offs.
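Illustration: the abstract describes adding Gaussian noise whose variance is calibrated to the global sensitivity of each released quantity. The sketch below shows that general idea using the classical Gaussian-mechanism calibration, $\sigma = \Delta \sqrt{2 \ln(1.25/\delta)} / \epsilon$, which is stated for $0 < \epsilon < 1$. It is not the edPLS algorithm: the function names, the example vector, and the sensitivity value are hypothetical placeholders, and the paper's own sensitivity bounds for weights, scores, and loadings are not reproduced here.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism.

    Assumption: the textbook bound sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon,
    which is stated for 0 < epsilon < 1 and 0 < delta < 1.
    """
    if not (0.0 < epsilon < 1.0):
        raise ValueError("classical bound is stated for 0 < epsilon < 1")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must lie in (0, 1)")
    if sensitivity <= 0.0:
        raise ValueError("sensitivity must be positive")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize(vector, sensitivity, epsilon, delta, rng=None):
    """Return a copy of `vector` with i.i.d. Gaussian noise added to each entry.

    `sensitivity` is a placeholder for an L2 global-sensitivity bound on the
    function that produced `vector` (hypothetical here, not taken from the paper).
    """
    rng = rng or random.Random()
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return [v + rng.gauss(0.0, sigma) for v in vector]


# Hypothetical example: perturb a small weight vector.
weights = [0.12, -0.40, 0.33]
noisy_weights = privatize(weights, sensitivity=0.05, epsilon=0.5, delta=1e-5,
                          rng=random.Random(0))
```

In this calibration, halving $\epsilon$ doubles the noise scale, which is the privacy-utility trade-off the abstract refers to. Releasing several quantities (for example four model components) would additionally require splitting or composing the privacy budget, which this sketch does not attempt.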

Related papers

Evaluation of Differential Privacy Mechanisms on Federated Learning [0.0]
Federated learning is distributed across several clients without disclosing raw data. Differential Privacy (DP) is a technique to protect sensitive data by adding noise to model updates. This work implements DP methods using Laplace and Gaussian mechanisms with an adaptive privacy budget.
arXiv Detail & Related papers (2025-10-09T11:32:36Z)
Differentially Private Two-Stage Gradient Descent for Instrumental Variable Regression [22.733602577854825]
We study instrumental variable regression (IVaR) under differential privacy constraints. We propose a noisy two-stage gradient descent algorithm that ensures differential privacy by injecting carefully calibrated noise into the gradient updates.
arXiv Detail & Related papers (2025-09-26T18:02:58Z)
Privacy-Aware Decoding: Mitigating Privacy Leakage of Large Language Models in Retrieval-Augmented Generation [26.573578326262307]
Privacy-Aware Decoding (PAD) is a lightweight, inference-time defense that adaptively injects calibrated Gaussian noise into token logits during generation. PAD integrates confidence-based screening to selectively protect high-risk tokens, efficient sensitivity estimation to minimize unnecessary noise, and context-aware noise calibration to balance privacy with generation quality. Our work takes an important step toward mitigating privacy risks in RAG via decoding strategies, paving the way for universal and scalable privacy solutions in sensitive domains.
arXiv Detail & Related papers (2025-08-05T05:22:13Z)
Differentially Private Distribution Release of Gaussian Mixture Models via KL-Divergence Minimization [5.615206798152645]
We introduce a DP mechanism that adds carefully calibrated random perturbations to the GMM parameters. Our approach achieves strong privacy guarantees while maintaining high utility.
arXiv Detail & Related papers (2025-06-04T00:40:24Z)
Dual Utilization of Perturbation for Stream Data Publication under Local Differential Privacy [10.07017446059039]
Local differential privacy (LDP) has emerged as a promising standard. Applying LDP to stream data presents significant challenges, as stream data often involves a large or even infinite number of values. We introduce the Iterative Perturbation IPP method, which utilizes current perturbed results to calibrate the subsequent perturbation process. We prove that these three algorithms satisfy $w$-event differential privacy while significantly improving utility.
arXiv Detail & Related papers (2025-04-21T09:51:18Z)
Federated Learning with Differential Privacy: An Utility-Enhanced Approach [12.614480013684759]
Federated learning has emerged as an attractive approach to protect data privacy by eliminating the need for sharing clients' data. Recent studies have shown that federated learning alone does not guarantee privacy, as private data may still be inferred from the uploaded parameters to the central server. We present a modification to these vanilla differentially private algorithms based on a Haar wavelet transformation step and a novel noise injection scheme that significantly lowers the bound of the noise variance.
arXiv Detail & Related papers (2025-03-27T04:48:29Z)
Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private stochastic convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications. Current methods, such as those based on differentially private stochastic gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility. We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
arXiv Detail & Related papers (2025-02-13T02:05:45Z)
Calibrating Practical Privacy Risks for Differentially Private Machine Learning [5.363664265121231]
We study the approaches that can lower the attacking success rate to allow for more flexible privacy budget settings in model training. We find that by selectively suppressing privacy-sensitive features, we can achieve lower ASR values without compromising application-specific data utility.
arXiv Detail & Related papers (2024-10-30T03:52:01Z)
Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset. We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
arXiv Detail & Related papers (2024-03-07T21:22:07Z)
Stronger Privacy Amplification by Shuffling for Rényi and Approximate Differential Privacy [43.33288245778629]
A key result in this model is that randomly shuffling locally randomized data amplifies differential privacy guarantees. Such amplification implies substantially stronger privacy guarantees for systems in which data is contributed anonymously. In this work, we improve the state of the art privacy amplification by shuffling results both theoretically and numerically.
arXiv Detail & Related papers (2022-08-09T08:13:48Z)
Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD. We find that most examples enjoy stronger privacy guarantees than the worst-case bound. This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
A Differentially Private Framework for Deep Learning with Convexified Loss Functions [4.059849656394191]
Differential privacy (DP) has been applied in deep learning for preserving privacy of the underlying training sets. Existing DP practice falls into three categories - objective perturbation, gradient perturbation and output perturbation. We propose a novel output perturbation framework by injecting DP noise into a randomly sampled neuron.
arXiv Detail & Related papers (2022-04-03T11:10:05Z)
Do Not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning [74.73901662374921]
A differentially private model degrades the utility drastically when the model comprises a large number of trainable parameters. We propose an algorithm, Gradient Embedding Perturbation (GEP), towards training differentially private deep models with decent accuracy.
arXiv Detail & Related papers (2021-02-25T04:29:58Z)
RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection. However, when GANs are applied on sensitive or private training examples, such as medical or financial records, it is still probable to divulge individuals' sensitive and private information. We propose a Rényi-differentially private-GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noises on the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z)
Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users. An adversary may still be able to infer the private training data by attacking the released model. Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.