FedZeN: Towards superlinear zeroth-order federated learning via
incremental Hessian estimation
- URL: http://arxiv.org/abs/2309.17174v1
- Date: Fri, 29 Sep 2023 12:13:41 GMT
- Title: FedZeN: Towards superlinear zeroth-order federated learning via
incremental Hessian estimation
- Authors: Alessio Maritan, Subhrakanti Dey, Luca Schenato
- Abstract summary: We design the first federated zeroth-order algorithm to estimate the curvature of the global objective.
We take an incremental Hessian estimator whose error norm converges linearly, and we adapt it to the federated zeroth-order setting.
We provide a theoretical analysis of our algorithm, named FedZeN, proving local quadratic convergence with high probability and global linear convergence up to zeroth-order precision.
- Score: 1.45179582111722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning is a distributed learning framework that allows a set of
clients to collaboratively train a model under the orchestration of a central
server, without sharing raw data samples. Although in many practical scenarios
the derivatives of the objective function are not available, only a few works
have considered the federated zeroth-order setting, in which functions can only
be accessed through a budgeted number of point evaluations. In this work we
focus on convex optimization and design the first federated zeroth-order
algorithm to estimate the curvature of the global objective, with the purpose
of achieving superlinear convergence. We take an incremental Hessian estimator
whose error norm converges linearly, and we adapt it to the federated
zeroth-order setting, sampling the random search directions from the Stiefel
manifold for improved performance. In particular, both the gradient and Hessian
estimators are built at the central server in a communication-efficient and
privacy-preserving way by leveraging synchronized pseudo-random number
generators. We provide a theoretical analysis of our algorithm, named FedZeN,
proving local quadratic convergence with high probability and global linear
convergence up to zeroth-order precision. Numerical simulations confirm the
superlinear convergence rate and show that our algorithm outperforms the
federated zeroth-order methods available in the literature.
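The abstract's communication trick — building gradient estimates at the server from synchronized pseudo-random number generators and Stiefel-manifold directions — can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, and the estimator shown is a standard multi-direction central-difference scheme with orthonormal directions obtained by QR factorization (one common way to sample from the Stiefel manifold). Because client and server share a seed, each client transmits only `k` scalars per round rather than full `d`-dimensional vectors.

```python
import numpy as np

def stiefel_directions(seed, d, k):
    """Sample k orthonormal directions in R^d (a point on the Stiefel
    manifold St(d, k)) by QR-factorizing a seeded Gaussian matrix."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return Q  # d x k, columns orthonormal

def client_finite_differences(f, x, seed, k, mu=1e-5):
    """Client side: evaluate the local objective along the shared
    directions and return only k scalars (no directions transmitted)."""
    Q = stiefel_directions(seed, x.size, k)
    return np.array([(f(x + mu * Q[:, j]) - f(x - mu * Q[:, j])) / (2 * mu)
                     for j in range(k)])

def server_gradient(deltas_per_client, seed, d, k):
    """Server side: regenerate the same directions from the shared seed
    and assemble the averaged gradient estimate from clients' scalars."""
    Q = stiefel_directions(seed, d, k)
    avg = np.mean(deltas_per_client, axis=0)  # average over clients
    return (d / k) * (Q @ avg)                # d/k rescaling for unbiasedness
```

With `k = d` the directions form a full orthonormal basis, so for a quadratic objective (where central differences are exact) the estimate recovers the gradient exactly.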
Related papers
- Federated Learning on Riemannian Manifolds: A Gradient-Free Projection-Based Approach [5.33725915743382]
Federated learning (FL) has emerged as a powerful paradigm for collaborative model training across distributed clients. Existing FL algorithms predominantly focus on unconstrained optimization problems with exact gradient information.
arXiv Detail & Related papers (2025-07-30T17:24:27Z) - Faster Diffusion Models via Higher-Order Approximation [28.824924809206255]
We propose a principled, training-free sampling algorithm that requires only on the order of $d^{1+2/K}\varepsilon^{-1/K}$ score function evaluations. Our theory is robust vis-a-vis inexact score estimation, degrading gracefully as the score estimation error increases. More broadly, our work develops a theoretical framework towards understanding the efficacy of high-order methods for accelerated sampling.
arXiv Detail & Related papers (2025-06-30T16:49:03Z) - From Interpretation to Correction: A Decentralized Optimization Framework for Exact Convergence in Federated Learning [9.870718388000645]
This work introduces a novel decentralized framework that corrects the biases introduced by arbitrary client participation and data heterogeneity.
We are able to provide a concise analysis to quantify the impact of arbitrary participation and data heterogeneity on FedAvg's convergence point.
This insight motivates the development of Federated Optimization with Exact Convergence via Push-pull Strategy (FOCUS).
arXiv Detail & Related papers (2025-03-25T23:54:23Z) - Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Single Point-Based Distributed Zeroth-Order Optimization with a Non-Convex Stochastic Objective Function [14.986031916712108]
We introduce a zeroth-order distributed optimization method that combines the gradient tracking technique with a one-point estimate of the gradient.
We prove that this new technique converges even in a noisy setting.
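The one-point scheme mentioned above can be sketched as follows; this is an illustrative, textbook-style one-point zeroth-order estimator, not the paper's exact construction, and the function name is hypothetical. Unlike the two-point central difference, it needs a single function evaluation per query, at the cost of much higher variance:

```python
import numpy as np

def one_point_zo_gradient(f, x, rng, mu=0.1):
    """One-point zeroth-order gradient estimate: a single function
    evaluation per query. It is unbiased for the mu-smoothed surrogate
    of f rather than for f itself, hence the higher variance compared
    with two-point schemes."""
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)            # uniform direction on the unit sphere
    return (d / mu) * f(x + mu * u) * u
```

Averaging many such estimates for a linear objective recovers its gradient, which is why the estimator is usable inside iterative schemes despite its per-sample noise.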
arXiv Detail & Related papers (2024-10-08T11:45:45Z) - An Asymptotically Optimal Coordinate Descent Algorithm for Learning Bayesian Networks from Gaussian Models [6.54203362045253]
We study the problem of learning networks from continuous observational data, generated according to a linear Gaussian structural equation model.
We propose a new coordinate descent algorithm that converges to the optimal objective value of the $\ell$-penalized maximum likelihood.
arXiv Detail & Related papers (2024-08-21T20:18:03Z) - Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm is only accessible to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization [8.670873561640903]
We investigate the finite-time analysis of finding $(\delta,\epsilon)$-stationary points for nonsmooth nonconvex objectives in decentralized settings.
This is the first finite-time guarantee for decentralized nonsmooth optimization.
arXiv Detail & Related papers (2024-06-03T16:09:34Z) - Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays [0.0]
Federated learning (FL) was recently proposed to securely train models with data held over multiple locations ("clients").
Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-iid local data distributions ("client drift").
We propose and analyze Asynchronous Exact Averaging (AREA), a new (sub)gradient algorithm that utilizes communication to speed up convergence and enhance scalability, and employs client memory to correct the client drift caused by variations in client update frequencies.
arXiv Detail & Related papers (2024-05-16T14:22:49Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled architectures and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on momentum-based variance reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent [7.6860514640178]
We propose a novel zeroth-order optimization algorithm for distributed reinforcement learning.
It allows each agent to estimate its local gradient by cost evaluation independently, without use of any consensus protocol.
arXiv Detail & Related papers (2021-07-26T18:11:07Z) - Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
FedGLOMO is the first (first-order) FL algorithm of its kind.
Our algorithm is provably optimal even with compressed communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z) - Tackling the Objective Inconsistency Problem in Heterogeneous Federated
Optimization [93.78811018928583]
This paper provides a framework to analyze the convergence of federated heterogeneous optimization algorithms.
We propose FedNova, a normalized averaging method that eliminates objective inconsistency while preserving fast error convergence.
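The normalized-averaging idea can be sketched in a few lines. This is a simplified illustration, not FedNova's full recipe: in the paper the per-client normalizer depends on the local solver (e.g. momentum), whereas here we assume plain local SGD, where it reduces to the local step count; the function name is hypothetical.

```python
import numpy as np

def normalized_average(local_updates, local_steps, weights=None):
    """Normalized averaging in the spirit of FedNova: each client's
    cumulative update is divided by its number of local steps before
    weighting, so clients that ran more local iterations do not skew
    the aggregate update direction."""
    n = len(local_updates)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, float)
    normalized = [np.asarray(d, float) / tau
                  for d, tau in zip(local_updates, local_steps)]
    tau_eff = float(np.dot(w, local_steps))  # effective number of local steps
    return tau_eff * sum(wi * di for wi, di in zip(w, normalized))
```

For example, a client that took 8 local steps contributes the same per-step direction as one that took 2, instead of a 4x larger raw update.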
arXiv Detail & Related papers (2020-07-15T05:01:23Z) - Random extrapolation for primal-dual coordinate descent [61.55967255151027]
We introduce a randomly extrapolated primal-dual coordinate descent method that adapts to sparsity of the data matrix and the favorable structures of the objective function.
We show almost sure convergence of the sequence and optimal sublinear convergence rates for the primal-dual gap and objective values, in the general convex-concave case.
arXiv Detail & Related papers (2020-07-13T17:39:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.