Related papers: When Gradient Clipping Becomes a Control Mechanism for Differential Privacy in Deep Learning

When Gradient Clipping Becomes a Control Mechanism for Differential Privacy in Deep Learning

URL: http://arxiv.org/abs/2602.10584v1
Date: Wed, 11 Feb 2026 07:18:25 GMT
Title: When Gradient Clipping Becomes a Control Mechanism for Differential Privacy in Deep Learning
Authors: Mohammad Partohaghighi, Roummel Marcia, Bruce J. West, YangQuan Chen,
Abstract summary: Privacy-preserving training on sensitive data relies on gradient clipping and Gaussian noise.<n>Existing adaptive clipping methods often depend on per-example gradient norm statistics.<n>We propose a control-driven clipping strategy that adapts the threshold using a lightweight, weight-only spectral diagnostic computed from parameters.
Score: 3.8218584696400484
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Privacy-preserving training on sensitive data commonly relies on differentially private stochastic optimization with gradient clipping and Gaussian noise. The clipping threshold is a critical control knob: if set too small, systematic over-clipping induces optimization bias; if too large, injected noise dominates updates and degrades accuracy. Existing adaptive clipping methods often depend on per-example gradient norm statistics, adding computational overhead and introducing sensitivity to datasets and architectures. We propose a control-driven clipping strategy that adapts the threshold using a lightweight, weight-only spectral diagnostic computed from model parameters. At periodic probe steps, the method analyzes a designated weight matrix via spectral decomposition and estimates a heavy-tailed spectral indicator associated with training stability. This indicator is smoothed over time and fed into a bounded feedback controller that updates the clipping threshold multiplicatively in the log domain. Because the controller uses only parameters produced during privacy-preserving training, the resulting threshold updates are post-processing and do not increase privacy loss beyond that of the underlying DP optimizer under standard composition accounting.

Related papers

An Adaptive Differentially Private Federated Learning Framework with Bi-level Optimization [10.218291445871435]
Federated learning enables collaborative model training across distributed clients while preserving data privacy.<n>In practical deployments, device heterogeneity, non-independent, and identically distributed (Non-IID) data often lead to highly unstable and biased gradient updates.<n>We propose an adaptive differentially private federated learning framework that explicitly targets model efficiency under heterogeneous and privacy-constrained settings.
arXiv Detail & Related papers (2026-02-06T16:27:33Z)
Adaptive Token-Weighted Differential Privacy for LLMs: Not All Tokens Require Equal Protection [12.047350336564193]
We operationalize this insight through Adaptive Token-Weighted Differential Privacy (ATDP)<n>ATDP adaptively assigns different gradient weights to sensitive and non-sensitive tokens.<n>It can be seamlessly integrated into any existing DP-based fine-tuning pipeline.
arXiv Detail & Related papers (2025-09-27T10:51:07Z)
Natural Spectral Fusion: p-Exponent Cyclic Scheduling and Early Decision-Boundary Alignment in First-Order Optimization [11.323131201168572]
We propose Natural Spectral Fusion (NSF): reframing training as controllable spectral coverage and information fusion.<n>NSF has two core principles: treating the balances as a spectral controller that dynamically low- and high-frequency information.<n>We show that cyclic scheduling consistently reduces test error and demonstrates distinct convergence behavior.
arXiv Detail & Related papers (2025-09-05T00:00:00Z)
Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications.<n>Current methods, such as those based on differentially private gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility.<n>We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
arXiv Detail & Related papers (2025-02-13T02:05:45Z)
On the Convergence of DP-SGD with Adaptive Clipping [56.24689348875711]
Gradient Descent with gradient clipping is a powerful technique for enabling differentially private optimization.<n>This paper provides the first comprehensive convergence analysis of SGD with quantile clipping (QC-SGD)<n>We show how QC-SGD suffers from a bias problem similar to constant-threshold clipped SGD but can be mitigated through a carefully designed quantile and step size schedule.
arXiv Detail & Related papers (2024-12-27T20:29:47Z)
Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation. We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC. We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on R'enyi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z)
Online Sensitivity Optimization in Differentially Private Learning [8.12606646175019]
We present a novel approach to dynamically optimize the clipping threshold. We treat this threshold as an additional learnable parameter, establishing a clean relationship between the threshold and the cost function. Our method is thoroughly assessed against alternative fixed and adaptive strategies across diverse datasets, tasks, model dimensions, and privacy levels.
arXiv Detail & Related papers (2023-10-02T00:30:49Z)
MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation. This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
Differentially Private Learning with Per-Sample Adaptive Clipping [8.401653565794353]
We propose a Differentially Private Per-Sample Adaptive Clipping (DP-PSAC) algorithm based on a non-monotonic adaptive weight function. We show that DP-PSAC outperforms or matches the state-of-the-art methods on multiple main-stream vision and language tasks.
arXiv Detail & Related papers (2022-12-01T07:26:49Z)
Adaptive Low-Pass Filtering using Sliding Window Gaussian Processes [71.23286211775084]
We propose an adaptive low-pass filter based on Gaussian process regression. We show that the estimation error of the proposed method is uniformly bounded.
arXiv Detail & Related papers (2021-11-05T17:06:59Z)
Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering [53.523517926927894]
We explore the use of exact per-sample Hessian-vector products and gradients to construct self-tuning quadratics. We prove that our model-based procedure converges in noisy gradient setting. This is an interesting step for constructing self-tuning quadratics.
arXiv Detail & Related papers (2020-11-09T22:07:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.