Related papers: Enhancing Differentially Private Linear Regression via Public Second-Moment

Enhancing Differentially Private Linear Regression via Public Second-Moment

URL: http://arxiv.org/abs/2508.18037v1
Date: Mon, 25 Aug 2025 13:55:46 GMT
Title: Enhancing Differentially Private Linear Regression via Public Second-Moment
Authors: Zilong Cao, Hai Zhang,
Abstract summary: We propose a novel method that involves transforming private data using the public second-moment matrix to compute a transformed SSP-OLSE.<n>We derive theoretical error bounds about our method and the standard SSP-OLSE to the non-DP OLSE, which reveal the improved robustness and accuracy achieved by our approach.
Score: 2.729099903480711
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Leveraging information from public data has become increasingly crucial in enhancing the utility of differentially private (DP) methods. Traditional DP approaches often require adding noise based solely on private data, which can significantly degrade utility. In this paper, we address this limitation in the context of the ordinary least squares estimator (OLSE) of linear regression based on sufficient statistics perturbation (SSP) under the unbounded data assumption. We propose a novel method that involves transforming private data using the public second-moment matrix to compute a transformed SSP-OLSE, whose second-moment matrix yields a better condition number and improves the OLSE accuracy and robustness. We derive theoretical error bounds about our method and the standard SSP-OLSE to the non-DP OLSE, which reveal the improved robustness and accuracy achieved by our approach. Experiments on synthetic and real-world datasets demonstrate the utility and effectiveness of our method.

Related papers

Differentially Private Truncation of Unbounded Data via Public Second Moments [4.662174186673445]
We propose Public-moment-guided Truncation (PMT), which transforms private data using the public second-moment matrix.<n>PMT substantially improves the accuracy and stability of differential privacy models.
arXiv Detail & Related papers (2026-02-25T12:21:30Z)
Classification Under Local Differential Privacy with Model Reversal and Model Averaging [5.178896452202825]
Local differential privacy (LDP) has become a central topic in data privacy research.<n>We propose novel techniques specifically designed for LDP to improve classification performance without compromising privacy.<n> Empirical results on both simulated and real-world datasets show substantial improvements in classification accuracy.
arXiv Detail & Related papers (2026-02-05T15:52:34Z)
Dual Utilization of Perturbation for Stream Data Publication under Local Differential Privacy [10.07017446059039]
Local differential privacy (LDP) has emerged as a promising standard.<n>Applying LDP to stream data presents significant challenges, as stream data often involves a large or even infinite number of values.<n>We introduce the Iterative Perturbation IPP method, which utilizes current perturbed results to calibrate the subsequent perturbation process.<n>We prove that these three algorithms satisfy $w$-event differential privacy while significantly improving utility.
arXiv Detail & Related papers (2025-04-21T09:51:18Z)
Efficient Safety Alignment of Large Language Models via Preference Re-ranking and Representation-based Reward Modeling [84.00480999255628]
Reinforcement Learning algorithms for safety alignment of Large Language Models (LLMs) encounter the challenge of distribution shift.<n>Current approaches typically address this issue through online sampling from the target policy.<n>We propose a new framework that leverages the model's intrinsic safety judgment capability to extract reward signals.
arXiv Detail & Related papers (2025-03-13T06:40:34Z)
Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications.<n>Current methods, such as those based on differentially private gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility.<n>We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
arXiv Detail & Related papers (2025-02-13T02:05:45Z)
Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose PseudoProbability Unlearning (PPU), a novel method that enables models to forget data to adhere to privacy-preserving manner. Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
Adaptive debiased SGD in high-dimensional GLMs with streaming data [4.704144189806667]
This paper introduces a novel approach to online inference in high-dimensional generalized linear models.<n>Our method operates in a single-pass mode, making it different from existing methods that require full dataset access or large-dimensional summary statistics storage.<n>The core of our methodological innovation lies in an adaptive descent algorithm tailored for dynamic objective functions, coupled with a novel online debiasing procedure.
arXiv Detail & Related papers (2024-05-28T15:36:48Z)
Offline Policy Optimization in RL with Variance Regularizaton [142.87345258222942]
We propose variance regularization for offline RL algorithms, using stationary distribution corrections. We show that by using Fenchel duality, we can avoid double sampling issues for computing the gradient of the variance regularizer. The proposed algorithm for offline variance regularization (OVAR) can be used to augment any existing offline policy optimization algorithms.
arXiv Detail & Related papers (2022-12-29T18:25:01Z)
Latent-Variable Advantage-Weighted Policy Optimization for Offline RL [70.01851346635637]
offline reinforcement learning methods hold the promise of learning policies from pre-collected datasets without the need to query the environment for new transitions. In practice, offline datasets are often heterogeneous, i.e., collected in a variety of scenarios. We propose to leverage latent-variable policies that can represent a broader class of policy distributions. Our method improves the average performance of the next best-performing offline reinforcement learning methods by 49% on heterogeneous datasets.
arXiv Detail & Related papers (2022-03-16T21:17:03Z)
Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users. An adversary may still be able to infer the private training data by attacking the released model. Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.