GP-FL: Model-Based Hessian Estimation for Second-Order Over-the-Air Federated Learning
- URL: http://arxiv.org/abs/2412.03867v1
- Date: Thu, 05 Dec 2024 04:27:41 GMT
- Title: GP-FL: Model-Based Hessian Estimation for Second-Order Over-the-Air Federated Learning
- Authors: Shayan Mohajer Hamidi, Ali Bereyhi, Saba Asaad, H. Vincent Poor
- Abstract summary: Second-order methods are widely adopted to improve the convergence rate of learning algorithms.
This paper introduces a novel second-order FL framework tailored for wireless channels.
- Score: 52.295563400314094
- Abstract: Second-order methods are widely adopted to improve the convergence rate of learning algorithms. In federated learning (FL), these methods require the clients to share their local Hessian matrices with the parameter server (PS), which comes at a prohibitive communication cost. A classical solution to this issue is to approximate the global Hessian matrix from the first-order information. Unlike in idealized networks, this solution does not perform effectively in over-the-air FL settings, where the PS receives noisy versions of the local gradients. This paper introduces a novel second-order FL framework tailored for wireless channels. The pivotal innovation lies in the PS's capability to directly estimate the global Hessian matrix from the received noisy local gradients via a non-parametric method: the PS models the unknown Hessian matrix as a Gaussian process, and then uses the temporal relation between the gradients and Hessian along with the channel model to find a stochastic estimator for the global Hessian matrix. We refer to this method as Gaussian process-based Hessian modeling for wireless FL (GP-FL) and show that it exhibits a linear-quadratic convergence rate. Numerical experiments on various datasets demonstrate that GP-FL outperforms all classical first- and second-order baseline FL approaches.
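To make the core idea concrete, here is a minimal numpy sketch of GP-based curvature smoothing. It is not the paper's estimator: the RBF kernel, the noise variance, the diagonal-Hessian restriction, and the secant construction are all illustrative assumptions, and GP-FL itself estimates the full global Hessian using the channel model as well.
```python
import numpy as np

def rbf_kernel(t1, t2, ell=3.0):
    # Squared-exponential kernel over iteration indices.
    d = t1[:, None] - t2[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_smoothed_curvature(thetas, noisy_grads, sigma2=0.1):
    # thetas, noisy_grads: (T, d) arrays of iterates and of the noisy
    # aggregated gradients received over the air.
    s = np.diff(thetas, axis=0)                    # parameter steps
    y = np.diff(noisy_grads, axis=0)               # gradient differences
    h_obs = y / (s + 1e-8)                         # raw per-coordinate secant curvature
    t = np.arange(len(h_obs), dtype=float)
    K = rbf_kernel(t, t)
    alpha = np.linalg.solve(K + sigma2 * np.eye(len(t)), h_obs)
    k_star = rbf_kernel(t[-1:], t)                 # GP posterior mean at the latest step
    return (k_star @ alpha).ravel()                # (d,) smoothed diagonal Hessian estimate
```
The GP regression plays the role of the temporal gradient-Hessian relation in GP-FL: noisy secant observations from several rounds are pooled into a denoised curvature estimate.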
Related papers
- A Historical Trajectory Assisted Optimization Method for Zeroth-Order Federated Learning [24.111048817721592]
Federated learning heavily relies on distributed gradient descent techniques.
In the situation where gradient information is not available, gradients need to be estimated from zeroth-order information.
We propose a non-isotropic sampling method to improve the gradient estimation procedure.
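For context, a generic two-point zeroth-order gradient estimator with non-isotropic Gaussian directions looks roughly like the sketch below; the covariance `cov` and the sample count are placeholders, and the paper's trajectory-informed choice of sampling distribution is not reproduced here.
```python
import numpy as np

def zo_gradient(f, theta, cov, mu=1e-3, n_samples=20, rng=None):
    # Two-point zeroth-order estimate with directions u ~ N(0, cov).
    # With non-isotropic cov the estimate approximates cov @ grad,
    # i.e. the sampling covariance acts as a preconditioner.
    rng = np.random.default_rng() if rng is None else rng
    L = np.linalg.cholesky(cov)
    g = np.zeros_like(theta)
    for _ in range(n_samples):
        u = L @ rng.standard_normal(theta.shape)
        g += (f(theta + mu * u) - f(theta - mu * u)) / (2 * mu) * u
    return g / n_samples
```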
arXiv Detail & Related papers (2024-09-24T10:36:40Z)
- Fed-Sophia: A Communication-Efficient Second-Order Federated Learning Algorithm [28.505671833986067]
Federated learning is a machine learning approach where multiple devices collaboratively learn with the help of a parameter server by sharing only their local updates.
While gradient-based optimization techniques are widely adopted in this domain, the curvature information that second-order methods exhibit is crucial to guide and speed up the convergence.
This paper introduces a scalable second-order method, allowing the adoption of curvature information in federated large models.
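The flavor of such updates can be seen in a Sophia-style step, sketched below under assumed constants; this is not necessarily Fed-Sophia's exact rule. It keeps EMAs of the gradient and of a cheap diagonal Hessian estimate, then takes an element-wise clipped, preconditioned step.
```python
import numpy as np

def sophia_like_step(theta, m, h, grad, hess_diag_est,
                     lr=1e-3, beta1=0.9, beta2=0.99, gamma=0.01, eps=1e-12):
    m = beta1 * m + (1 - beta1) * grad            # gradient EMA
    h = beta2 * h + (1 - beta2) * hess_diag_est   # diagonal-curvature EMA
    step = np.clip(m / np.maximum(gamma * h, eps), -1.0, 1.0)  # clipping bounds each coordinate
    return theta - lr * step, m, h
```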
arXiv Detail & Related papers (2024-06-10T09:57:30Z)
- Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space [27.259110269667826]
Federated Learning (FL) involves training a model over a dataset distributed among clients.
Small and noisy datasets are common, highlighting the need for well-calibrated models.
We propose $\beta$-Predictive Bayes, a Bayesian FL algorithm that interpolates between a mixture and product of the predictive posteriors.
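One plausible reading of such an interpolation, sketched below, is a geometric bridge between the arithmetic mixture and the normalized product of client predictives; this is an assumption for illustration, not necessarily the paper's exact aggregation rule.
```python
import numpy as np

def blend_predictives(client_probs, beta):
    # client_probs: (n_clients, n_classes) predictive distributions for one input.
    mixture = client_probs.mean(axis=0)                      # mixture pooling
    logp = np.log(client_probs + 1e-12).sum(axis=0)          # product pooling, in log space
    product = np.exp(logp - logp.max())
    product /= product.sum()
    blended = mixture ** (1.0 - beta) * product ** beta      # geometric interpolation
    return blended / blended.sum()
```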
arXiv Detail & Related papers (2023-12-15T14:17:16Z)
- FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
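As a rough reference point, a restarted, adaptively scaled dual-averaging local round might look like the sketch below; the AdaGrad-style scaling and learning rate are assumptions, and this is not the paper's FedDA-MVR.
```python
import numpy as np

def dual_averaging_round(theta0, grads, lr=0.1, eps=1e-8):
    # Accumulate gradients in the dual variable z, map back with an
    # adaptive scaling, and restart z from theta0 each round.
    # In practice each g in grads is evaluated at the current iterate.
    z = np.zeros_like(theta0)
    v = np.zeros_like(theta0)
    theta = theta0.copy()
    for g in grads:
        z += g
        v += g * g
        theta = theta0 - lr * z / (np.sqrt(v) + eps)
    return theta
```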
arXiv Detail & Related papers (2023-02-13T05:10:30Z)
- Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on momentum-based variance reduced technique in cross-silo FL.
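The momentum-based variance-reduced direction underlying such methods is typically STORM-like: correct the previous direction with a gradient difference evaluated on the same minibatch. A minimal sketch, with the momentum parameter `a` as an illustrative constant:
```python
def storm_direction(grad_new, grad_old_same_batch, d_prev, a=0.1):
    # grad_new           = grad f(theta_t;     xi_t)
    # grad_old_same_batch = grad f(theta_{t-1}; xi_t)  (same minibatch xi_t)
    # The correction term cancels much of the minibatch noise.
    return grad_new + (1.0 - a) * (d_prev - grad_old_same_batch)
```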
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
- Predicting Flat-Fading Channels via Meta-Learned Closed-Form Linear Filters and Equilibrium Propagation [38.42468500092177]
Predicting fading channels is a classical problem with a vast array of applications.
In practice, the Doppler spectrum is unknown, and the predictor has only access to a limited time series of estimated channels.
This paper proposes to leverage meta-learning in order to mitigate the requirements in terms of training data for channel fading prediction.
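The closed-form base learner in such schemes is essentially ridge regression for a linear (AR) one-step predictor. A minimal sketch on a scalar flat-fading series follows; the filter order and regularizer are placeholders, and the meta-learning loop that adapts them is omitted.
```python
import numpy as np

def fit_linear_predictor(h, order=5, lam=1e-2):
    # h: 1-D array of (possibly complex) estimated channel samples.
    # Rows of X are sliding windows; targets are the next sample.
    X = np.stack([h[t:t + order] for t in range(len(h) - order)])
    y = h[order:]
    w = np.linalg.solve(X.conj().T @ X + lam * np.eye(order), X.conj().T @ y)
    return w  # predict the next sample as h[-order:] @ w
```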
arXiv Detail & Related papers (2021-10-01T14:00:23Z)
- Neural Calibration for Scalable Beamforming in FDD Massive MIMO with Implicit Channel Estimation [10.775558382613077]
Channel estimation and beamforming play critical roles in frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems.
We propose a deep learning-based approach that directly optimizes the beamformers at the base station according to the received uplink pilots.
A neural calibration method is proposed to improve the scalability of the end-to-end design.
arXiv Detail & Related papers (2021-08-03T14:26:14Z)
- STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning [58.6792963686231]
Federated Learning (FL) refers to the paradigm where multiple worker nodes (WNs) build a joint model by using local data.
It is not clear how to choose the WNs' update directions, the minibatch sizes, and the local update frequency.
We show that there is a trade-off curve between the local update frequencies and the local minibatch sizes, on which the above complexities can be maintained.
arXiv Detail & Related papers (2021-06-19T06:13:45Z)
- Hybrid Federated Learning: Algorithms and Implementation [61.0640216394349]
Federated learning (FL) is a recently proposed distributed machine learning paradigm dealing with distributed and private data sets.
We propose a new model-matching-based problem formulation for hybrid FL.
We then propose an efficient algorithm that can collaboratively train the global and local models to deal with full and partial featured data.
arXiv Detail & Related papers (2020-12-22T23:56:03Z)
- Plug-And-Play Learned Gaussian-mixture Approximate Message Passing [71.74028918819046]
We propose a plug-and-play compressed sensing (CS) recovery algorithm suitable for any i.i.d. source prior.
Our algorithm builds upon Borgerding's learned AMP (LAMP), yet significantly improves it by adopting a universal denoising function within the algorithm.
Numerical evaluation shows that the L-GM-AMP algorithm achieves state-of-the-art performance without any knowledge of the source prior.
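For orientation, the generic AMP recursion with a plug-in scalar denoiser `eta(r, tau) -> (x_hat, mean of eta')` is sketched below; L-GM-AMP's learned Gaussian-mixture denoiser would slot in as `denoiser`. The iteration count and noise-level estimate are illustrative choices.
```python
import numpy as np

def amp_recover(y, A, denoiser, n_iter=30):
    # Generic AMP loop for y = A @ x + noise with a plug-in denoiser.
    M, N = A.shape
    x = np.zeros(N)
    z = y.copy()
    div = 0.0                                      # average derivative of the denoiser
    for _ in range(n_iter):
        z = y - A @ x + (N / M) * div * z          # residual with Onsager correction
        tau = np.linalg.norm(z) / np.sqrt(M)       # effective noise level estimate
        x, div = denoiser(x + A.T @ z, tau)
    return x
```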
arXiv Detail & Related papers (2020-11-18T16:40:45Z)