DP-FEDSOFIM: Differentially Private Federated Stochastic Optimization using Regularized Fisher Information Matrix
- URL: http://arxiv.org/abs/2601.09166v1
- Date: Wed, 14 Jan 2026 05:11:28 GMT
- Title: DP-FEDSOFIM: Differentially Private Federated Stochastic Optimization using Regularized Fisher Information Matrix
- Authors: Sidhant R. Nair, Tanmay Sen, Mrinmay Sen
- Abstract summary: Differentially private federated learning (DP-FL) suffers from slow convergence under tight privacy budgets due to the overwhelming noise introduced to preserve privacy. We propose DP-FedSOFIM, a server-side second-order optimization framework that leverages the Fisher Information Matrix (FIM) as a natural gradient preconditioner while requiring only O(d) memory per client. Our analysis proves that the server-side preconditioning preserves (epsilon, delta)-differential privacy through the post-processing theorem.
- Score: 0.0611737116137921
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentially private federated learning (DP-FL) suffers from slow convergence under tight privacy budgets due to the overwhelming noise introduced to preserve privacy. While adaptive optimizers can accelerate convergence, existing second-order methods such as DP-FedNew require O(d^2) memory at each client to maintain local feature covariance matrices, making them impractical for high-dimensional models. We propose DP-FedSOFIM, a server-side second-order optimization framework that leverages the Fisher Information Matrix (FIM) as a natural gradient preconditioner while requiring only O(d) memory per client. By employing the Sherman-Morrison formula for efficient matrix inversion, DP-FedSOFIM achieves O(d) computational complexity per round while maintaining the convergence benefits of second-order methods. Our analysis proves that the server-side preconditioning preserves (epsilon, delta)-differential privacy through the post-processing theorem. Empirical evaluation on CIFAR-10 demonstrates that DP-FedSOFIM achieves superior test accuracy compared to first-order baselines across multiple privacy regimes.
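The O(d) preconditioning step the abstract attributes to the Sherman-Morrison formula can be sketched concretely. The snippet below is a minimal illustration, assuming the regularized FIM is approximated as a rank-one update lam*I + u u^T for some aggregated gradient vector u; the function name and the rank-one form are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

def sm_precondition(grad, u, lam=1e-3):
    """Apply (lam*I + u u^T)^{-1} to grad in O(d) via Sherman-Morrison.

    The identity (lam*I + u u^T)^{-1} = (1/lam) * (I - u u^T / (lam + u^T u))
    means the preconditioned gradient needs only two dot products, so no
    d x d matrix is ever formed or inverted.
    """
    coeff = np.dot(u, grad) / (lam + np.dot(u, u))
    return (grad - coeff * u) / lam
```

Because only vectors of length d are stored, this matches the claimed O(d) per-client memory, in contrast to maintaining a full d x d covariance matrix.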
Related papers
- Adaptive Methods Are Preferable in High Privacy Settings: An SDE Perspective [42.70658101277954]
Differential Privacy (DP) is becoming central to large-scale training as privacy regulations tighten. We revisit how noise interacts with adaptivity in optimization through the lens of stochastic differential equations. We show that DP-SGD converges to a privacy-utility trade-off of $\mathcal{O}(1/\varepsilon^2)$ at a speed independent of $\varepsilon$, while DP-SignSGD converges at a speed linear in $\varepsilon$.
arXiv Detail & Related papers (2026-03-03T18:17:57Z) - Adaptive Matrix Online Learning through Smoothing with Guarantees for Nonsmooth Nonconvex Optimization [54.723834588133165]
We study online linear optimization with matrix variables, a setting where the geometry makes designing data-dependent and efficient adaptive algorithms challenging. We instantiate this framework with two efficient methods that avoid projections. We show both methods admit closed-form updates and match one-sided Shampoo's regret up to a constant factor, while significantly reducing computational cost.
arXiv Detail & Related papers (2026-02-09T03:09:47Z) - DP-MicroAdam: Private and Frugal Algorithm for Training and Fine-tuning [7.445350484328613]
Adaptive optimizers are the de facto standard in non-private training, as they often enable faster convergence and improved performance. In contrast, differentially private training is still predominantly performed with DP-SGD.
arXiv Detail & Related papers (2025-11-25T17:17:48Z) - FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA [68.44043212834204]
Low-Rank Adaptation (LoRA) is widely used for efficient fine-tuning of language models in federated learning (FL).
arXiv Detail & Related papers (2025-05-19T07:32:56Z) - Double Momentum and Error Feedback for Clipping with Fast Rates and Differential Privacy [11.356444450240799]
Existing algorithms do not achieve strong differential privacy (DP) and optimization guarantees at once. We propose a new method called Clip21-SGD2M based on a novel combination of clipping, heavy-ball momentum, and error feedback.
arXiv Detail & Related papers (2025-02-17T11:16:21Z) - Noise is All You Need: Private Second-Order Convergence of Noisy SGD [15.31952197599396]
We show that the noise necessary for privacy already implies second-order convergence under standard smoothness assumptions.
We get second-order convergence essentially for free: DP-SGD, the workhorse of modern private optimization, can be used under minimal assumptions to find a second-order stationary point.
arXiv Detail & Related papers (2024-10-09T13:43:17Z) - DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction [57.83978915843095]
This paper introduces DiSK, a novel framework designed to significantly enhance the performance of differentially private optimizers. To ensure practicality for large-scale training, we simplify the Kalman filtering process, minimizing its memory and computational demands.
arXiv Detail & Related papers (2024-10-04T19:30:39Z) - AdaFisher: Adaptive Second Order Optimization via Fisher Information [22.851200800265914]
First-order optimization methods are currently the mainstream in training deep neural networks (DNNs). Optimizers like Adam incorporate limited curvature information by preconditioning the gradient during training. Despite the widespread use of first-order methods, second-order optimization algorithms exhibit superior convergence properties compared to first-order counterparts such as Adam and SGD. We present AdaFisher, an adaptive second-order optimizer that leverages a diagonal block-Kronecker approximation of the Fisher information matrix for adaptive gradient preconditioning.
arXiv Detail & Related papers (2024-05-26T01:25:02Z) - Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy [47.997934291881414]
Existing mean estimation schemes are usually optimized for $L_\infty$ geometry and rely on random rotation or Kashin's representation to adapt to $L_2$ geometry.
We introduce a novel privacy accounting method for the sparsified Gaussian mechanism that incorporates the randomness inherent in sparsification into the DP accounting.
Unlike previous approaches, our accounting algorithm directly operates in $L_2$ geometry, yielding MSEs that converge quickly to those of the Gaussian mechanism.
arXiv Detail & Related papers (2024-05-02T03:48:47Z) - Private Fine-tuning of Large Language Models with Zeroth-order Optimization [51.19403058739522]
Differentially private stochastic gradient descent (DP-SGD) allows models to be trained in a privacy-preserving manner. We introduce DP-ZO, a private fine-tuning framework for large language models that privatizes zeroth-order optimization methods.
arXiv Detail & Related papers (2024-01-09T03:53:59Z) - Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - On the Practicality of Differential Privacy in Federated Learning by Tuning Iteration Times [51.61278695776151]
Federated Learning (FL) is well known for protecting privacy when machine learning models are trained collaboratively among distributed clients.
Recent studies have pointed out that naive FL is susceptible to gradient leakage attacks.
Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks.
arXiv Detail & Related papers (2021-01-11T19:43:12Z)
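The post-processing argument invoked by DP-FedSOFIM's privacy analysis can be made concrete: once client updates have been clipped and noised (the only privacy-consuming step), any deterministic server-side transformation of the noisy aggregate, such as preconditioning, leaves the (epsilon, delta) guarantee unchanged. The sketch below is illustrative only; the clipping norm, noise scale, and rank-one preconditioner are placeholder assumptions, not the paper's calibration:

```python
import numpy as np

def private_aggregate(client_grads, clip=1.0, sigma=1.0, rng=None):
    """Clip each client update to L2 norm `clip`, average, add Gaussian noise.

    This is the only step that consumes privacy budget; everything the
    server does afterwards is post-processing.
    """
    rng = rng or np.random.default_rng()
    clipped = [g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
               for g in client_grads]
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(scale=sigma * clip / len(client_grads), size=avg.shape)
    return avg + noise

def server_precondition(noisy_grad, u, lam=1e-3):
    """Post-processing: a deterministic function of the noisy aggregate,
    so it cannot weaken the differential-privacy guarantee."""
    coeff = np.dot(u, noisy_grad) / (lam + np.dot(u, u))
    return (noisy_grad - coeff * u) / lam
```

The separation of the two functions mirrors the argument in the main abstract: privacy is accounted for entirely in `private_aggregate`, so the second-order update in `server_precondition` comes at no additional privacy cost.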
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.