Bayesian Federated Learning with Hamiltonian Monte Carlo: Algorithm and Theory
- URL: http://arxiv.org/abs/2407.06935v1
- Date: Tue, 9 Jul 2024 15:10:59 GMT
- Title: Bayesian Federated Learning with Hamiltonian Monte Carlo: Algorithm and Theory
- Authors: Jiajun Liang, Qian Zhang, Wei Deng, Qifan Song, Guang Lin
- Abstract summary: This work introduces a novel and efficient Bayesian federated learning algorithm, namely the Federated Averaging Hamiltonian Monte Carlo (FA-HMC).
We establish rigorous convergence guarantees of FA-HMC on non-iid distributed data sets.
We show that FA-HMC outperforms the existing Federated Averaging-Langevin Monte Carlo (FA-LD) algorithm.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work introduces a novel and efficient Bayesian federated learning algorithm, namely, the Federated Averaging stochastic Hamiltonian Monte Carlo (FA-HMC), for parameter estimation and uncertainty quantification. We establish rigorous convergence guarantees of FA-HMC on non-iid distributed data sets, under the strong convexity and Hessian smoothness assumptions. Our analysis investigates the effects of parameter space dimension, noise on gradients and momentum, and the frequency of communication (between the central node and local nodes) on the convergence and communication costs of FA-HMC. Beyond that, we establish the tightness of our analysis by showing that the convergence rate cannot be improved even for the continuous-time FA-HMC process. Moreover, extensive empirical studies demonstrate that FA-HMC outperforms the existing Federated Averaging-Langevin Monte Carlo (FA-LD) algorithm.
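Since the abstract only names the algorithm, a minimal illustrative sketch may help: local HMC (leapfrog) steps on each client with periodic server-side parameter averaging, on a toy Gaussian target. The client potentials, step size, local-step count, and the omission of a Metropolis correction are all assumptions for illustration, not the paper's exact FA-HMC.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy non-iid setup: client k has potential U_k(x) = 0.5 (x-mu_k)^T A_k (x-mu_k);
# the global target is proportional to exp(-sum_k U_k(x)).
d, K = 5, 4
mus = [rng.normal(size=d) for _ in range(K)]
As = [np.diag(rng.uniform(1.0, 2.0, size=d)) for _ in range(K)]

def grad_U(k, x):
    return As[k] @ (x - mus[k])

def local_hmc_step(k, x, eps=0.05, L=10):
    """One HMC update on client k via leapfrog integration (no Metropolis
    correction in this sketch)."""
    p = rng.normal(size=d)                    # resample momentum
    x_new, p_new = x.copy(), p.copy()
    p_new -= 0.5 * eps * grad_U(k, x_new)     # half step for momentum
    for _ in range(L - 1):
        x_new += eps * p_new                  # full step for position
        p_new -= eps * grad_U(k, x_new)       # full step for momentum
    x_new += eps * p_new
    p_new -= 0.5 * eps * grad_U(k, x_new)     # final half step
    return x_new

x_global = np.zeros(d)
for _ in range(100):                          # communication rounds
    locals_ = []
    for k in range(K):
        x = x_global.copy()
        for _ in range(5):                    # local HMC steps between syncs
            x = local_hmc_step(k, x)
        locals_.append(x)
    x_global = np.mean(locals_, axis=0)       # server averages parameters
```

The trade-off the paper analyzes shows up directly in the two loop counts: fewer local steps per round means more communication but less client drift on non-iid data.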
Related papers
- Detecting Causality in the Frequency Domain with Cross-Mapping Coherence [0.4218593777811082]
This study introduces the Cross-Mapping Coherence (CMC) method, designed to reveal causal connections between time series in the frequency domain.
We tested the method using simulations of logistic maps, Lorenz systems, Kuramoto oscillators, and the Wilson-Cowan model of the visual cortex.
In conclusion, the capability to determine directed causal influences across different frequency bands allows CMC to provide valuable insights into the dynamics of complex, nonlinear systems.
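The paper's exact CMC construction is not reproduced above, so the following is only a hedged guess at the general shape of the idea: nearest-neighbour cross-mapping (as in convergent cross mapping) followed by spectral coherence via scipy.signal.coherence. The coupled logistic maps, embedding dimension, neighbour count, and coupling strength are all illustrative assumptions.

```python
import numpy as np
from scipy.signal import coherence

# Coupled logistic maps (Sugihara-style): x drives y.
n = 2000
x = np.empty(n); y = np.empty(n)
x[0], y[0] = 0.4, 0.2
for t in range(n - 1):
    x[t + 1] = x[t] * (3.8 - 3.8 * x[t])
    y[t + 1] = y[t] * (3.5 - 3.5 * y[t] - 0.1 * x[t])

def cross_map(source, target, dim=3, tau=1, k=4):
    """Estimate `target` from the delay embedding of `source` via
    nearest-neighbour cross-mapping (simplex-style weights)."""
    m = len(source) - (dim - 1) * tau
    emb = np.column_stack([source[i * tau: i * tau + m] for i in range(dim)])
    tgt = target[(dim - 1) * tau:]
    est = np.empty(m)
    for i in range(m):
        d = np.sqrt(np.sum((emb - emb[i]) ** 2, axis=1))
        d[i] = np.inf                          # exclude the point itself
        nn = np.argpartition(d, k)[:k]
        w = np.exp(-d[nn] / (d[nn].min() + 1e-12))
        est[i] = np.sum(w * tgt[nn]) / np.sum(w)
    return est, tgt

est, tgt = cross_map(y, x)                     # y's manifold -> estimate of x
f, cxy = coherence(tgt, est, nperseg=256)      # frequency-resolved agreement
print(f[np.argmax(cxy)], cxy.max())            # band with strongest coupling
```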
arXiv Detail & Related papers (2024-07-30T09:43:35Z)
- On Convergence of the Alternating Directions SGHMC Algorithm [2.609441136025819]
We study convergence rates of Hamiltonian Monte Carlo (HMC) algorithms with leapfrog integration under mild conditions on the stochastic gradient oracle for the target distribution (SGHMC).
Our method extends standard HMC by allowing the use of general auxiliary distributions, which is achieved by a novel procedure of Alternating Directions.
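The alternating-directions construction itself is not sketched here; below is only the standard SGHMC update it builds on (in the style of Chen et al., 2014), with a friction term and a stochastic gradient oracle. The toy target and the noise model are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Target: standard Gaussian in d dimensions, U(x) = 0.5 * ||x||^2.
d = 10

def noisy_grad(x, noise_scale=0.1):
    """Stochastic gradient oracle: true gradient plus minibatch-style noise."""
    return x + noise_scale * rng.normal(size=d)

def sghmc(n_iter=5000, eps=0.01, friction=1.0):
    """SGHMC: leapfrog-like position/momentum updates with a friction term
    that compensates the stochastic gradient noise."""
    x, v = np.zeros(d), np.zeros(d)
    samples = []
    for _ in range(n_iter):
        x = x + eps * v
        v = (v - eps * noisy_grad(x)
               - eps * friction * v
               + np.sqrt(2 * friction * eps) * rng.normal(size=d))
        samples.append(x.copy())
    return np.array(samples)

s = sghmc()
print(s[1000:].var(axis=0).mean())   # near 1 for the unit Gaussian target
```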
arXiv Detail & Related papers (2024-05-21T18:22:44Z)
- Rethinking Clustered Federated Learning in NOMA Enhanced Wireless Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to address the challenges posed by non-IID conditions are proposed with the analysis of the properties.
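As a toy illustration of the clustering idea (not the paper's NOMA-specific scheme), one can group clients by the cosine similarity of their model updates and average within each group; the data generation, similarity threshold, and greedy grouping below are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical client updates drawn from two latent (non-IID) distributions.
d, K = 8, 6
base = rng.normal(size=d)
true_dirs = [base, -base]
updates = np.array([true_dirs[k % 2] + 0.1 * rng.normal(size=d)
                    for k in range(K)])

# Cluster clients by cosine similarity of their updates, then run one
# FedAvg-style average per cluster.
norms = updates / np.linalg.norm(updates, axis=1, keepdims=True)
sim = norms @ norms.T

clusters, assigned = [], set()
for i in range(K):
    if i in assigned:
        continue
    group = [j for j in range(K) if j not in assigned and sim[i, j] > 0.5]
    assigned.update(group)
    clusters.append(group)

cluster_models = [updates[g].mean(axis=0) for g in clusters]
print([len(g) for g in clusters])   # expected: two clusters of 3
```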
arXiv Detail & Related papers (2024-03-05T17:49:09Z)
- Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
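A hedged single-agent analogue (the paper concerns games; this sketch only shows the flavour of a curvature-agnostic step size): Riemannian gradient descent on the unit sphere with a fixed step size chosen without reference to the manifold's curvature.

```python
import numpy as np

rng = np.random.default_rng(4)

# Minimize f(x) = -<a, x> on the unit sphere; the optimum is x* = a/||a||.
d = 5
a = rng.normal(size=d)
x = rng.normal(size=d); x /= np.linalg.norm(x)

eta = 0.1                                 # fixed, curvature-agnostic step size
for _ in range(200):
    egrad = -a                            # Euclidean gradient of f
    rgrad = egrad - (egrad @ x) * x       # project onto the tangent space at x
    x = x - eta * rgrad
    x /= np.linalg.norm(x)                # retract back onto the sphere

print(x @ (a / np.linalg.norm(a)))        # ~1.0: the last iterate converged
```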
arXiv Detail & Related papers (2023-06-29T01:20:44Z)
- D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory [79.50644650795012]
We propose a deep learning approach to solve Kohn-Sham Density Functional Theory (KS-DFT).
We prove that such an approach has the same expressivity as the SCF method, yet reduces the computational complexity.
In addition, we show that our approach enables us to explore more complex neural-based wave functions.
arXiv Detail & Related papers (2023-03-01T10:38:10Z)
- Utilising the CLT Structure in Stochastic Gradient based Sampling: Improved Analysis and Faster Algorithms [14.174806471635403]
We consider approximations of sampling algorithms, such as Stochastic Gradient Langevin Dynamics (SGLD) and the Random Batch Method (RBM) for Interacting Particle Dynamics (IPD).
We observe that the noise introduced by the approximation is nearly Gaussian due to the Central Limit Theorem (CLT) while the driving Brownian motion is exactly Gaussian.
We harness this structure to absorb the approximation error inside the diffusion process, and obtain improved convergence guarantees for these algorithms.
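For orientation, here is a plain SGLD sketch showing where the two noise sources enter: the minibatch gradient noise (approximately Gaussian by the CLT) and the exactly Gaussian injected noise. The improved analysis described above, which absorbs the former into the diffusion, is not implemented here; the model and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

# Posterior over a Gaussian mean, sampled with SGLD.
N, d, batch = 1000, 3, 32
data = rng.normal(loc=1.5, size=(N, d))

def stoch_grad_log_post(theta):
    """Minibatch gradient of the log posterior (standard normal prior,
    unit-variance Gaussian likelihood); by the CLT its noise is nearly
    Gaussian."""
    idx = rng.integers(0, N, size=batch)
    grad_lik = (N / batch) * np.sum(data[idx] - theta, axis=0)
    return -theta + grad_lik

eps = 1e-4
theta = np.zeros(d)
samples = []
for _ in range(20000):
    theta = (theta + 0.5 * eps * stoch_grad_log_post(theta)
             + np.sqrt(eps) * rng.normal(size=d))   # exactly Gaussian noise
    samples.append(theta.copy())

print(np.mean(samples[5000:], axis=0))   # near the posterior mean, ~1.5
```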
arXiv Detail & Related papers (2022-06-08T10:17:40Z)
- Hamiltonian Monte Carlo Particle Swarm Optimizer [0.0]
The Hamiltonian Monte Carlo Particle Swarm Optimizer (HMCPSO) is an optimization algorithm that reaps the benefits of both exponentially averaged sampling and HMC-based position and velocity updates.
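The summary above is terse, so the following is only a speculative sketch of how PSO bookkeeping might be combined with gradient-informed, HMC-flavoured moves; it should not be read as the HMCPSO algorithm itself, and every parameter is an assumption.

```python
import numpy as np

rng = np.random.default_rng(6)

def f(x):        # objective to minimize, treated as a potential energy
    return np.sum(x ** 2)

def grad_f(x):
    return 2 * x

n_particles, d = 20, 4
pos = rng.normal(size=(n_particles, d))
vel = np.zeros((n_particles, d))
pbest = pos.copy()
gbest = pos[np.argmin([f(p) for p in pos])].copy()

eps, w, c1, c2 = 0.05, 0.7, 1.5, 1.5
for _ in range(200):
    for i in range(n_particles):
        # PSO velocity update toward personal and global bests...
        r1, r2 = rng.random(d), rng.random(d)
        vel[i] = (w * vel[i] + c1 * r1 * (pbest[i] - pos[i])
                             + c2 * r2 * (gbest - pos[i]))
        # ...followed by a short gradient-informed, HMC-flavoured drift.
        mom = rng.normal(size=d) - 0.5 * eps * grad_f(pos[i])
        pos[i] += vel[i] + eps * mom
        if f(pos[i]) < f(pbest[i]):
            pbest[i] = pos[i].copy()
    gbest = pbest[np.argmin([f(p) for p in pbest])].copy()

print(f(gbest))   # should approach 0
```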
arXiv Detail & Related papers (2022-05-08T04:47:34Z)
- Convex Analysis of the Mean Field Langevin Dynamics [49.66486092259375]
A convergence rate analysis of the mean field Langevin dynamics is presented.
The proximal Gibbs distribution $p_q$ associated with the dynamics allows us to develop a convergence theory parallel to classical results in convex optimization.
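For reference, here is a hedged reconstruction of the key object behind that statement, with notation assumed from the mean-field-Langevin literature rather than quoted from the paper: for an entropy-regularized objective $\mathcal{L}(q) = F(q) + \lambda\,\mathbb{E}_q[\log q]$, the proximal Gibbs distribution is

```latex
% Proximal Gibbs distribution associated with the mean field Langevin
% dynamics (hedged reconstruction; \delta F/\delta q is the first variation):
\[
  p_q(\theta) \;\propto\; \exp\!\Big( -\frac{1}{\lambda}\,
      \frac{\delta F(q)}{\delta q}(\theta) \Big),
\]
% with q optimal for \mathcal{L} exactly when q = p_q, mirroring a
% first-order optimality condition from convex optimization.
```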
arXiv Detail & Related papers (2022-01-25T17:13:56Z)
- Hamiltonian Monte Carlo with Asymmetrical Momentum Distributions [3.562271099341746]
We present a novel convergence analysis for the Hamiltonian Monte Carlo (HMC) algorithm.
We show that plain HMC with asymmetrical momentum distributions breaks a key self-adjointness requirement.
We propose a modified version that we call the Alternating Direction HMC (AD-HMC).
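AD-HMC itself is not shown below; the sketch is plain HMC with Gaussian momentum, annotated to mark where the symmetry of the momentum density enters the accept-reject step, which is the property the paper says asymmetric momenta break.

```python
import numpy as np

rng = np.random.default_rng(7)

# Plain HMC on U(x) = 0.5 x^2 with Gaussian momentum. The acceptance rule
# relies on the momentum density being symmetric, rho(p) = rho(-p); with an
# asymmetric rho, this scheme loses the reversibility (self-adjointness)
# that the standard argument needs.
def U(x):
    return 0.5 * x ** 2

def dU(x):
    return x

def hmc_step(x, eps=0.2, L=10):
    p = rng.normal()                      # symmetric (Gaussian) momentum
    x_new, p_new = x, p
    p_new -= 0.5 * eps * dU(x_new)        # leapfrog integration
    for _ in range(L - 1):
        x_new += eps * p_new
        p_new -= eps * dU(x_new)
    x_new += eps * p_new
    p_new -= 0.5 * eps * dU(x_new)
    h_old = U(x) + 0.5 * p ** 2           # Gaussian kinetic energy 0.5 p^2
    h_new = U(x_new) + 0.5 * p_new ** 2
    if rng.random() < np.exp(h_old - h_new):
        return x_new
    return x

xs = [0.0]
for _ in range(5000):
    xs.append(hmc_step(xs[-1]))
print(np.var(xs[1000:]))   # ~1 for the standard normal target
```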
arXiv Detail & Related papers (2021-10-21T18:36:19Z)
- What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensemble predictive distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
- A Unified Linear Speedup Analysis of Federated Averaging and Nesterov FedAvg [49.76940694847521]
Federated learning (FL) learns a model jointly from a set of participating devices without sharing each other's privately held data.
In this paper, we focus on Federated Averaging (FedAvg), one of the most popular and effective FL algorithms in use today.
We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies.
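A minimal FedAvg sketch on a toy least-squares problem (the client data, learning rate, and local-step count are illustrative assumptions): each round, clients run local SGD from the shared model and the server averages the results.

```python
import numpy as np

rng = np.random.default_rng(8)

# Least-squares data split across K clients.
K, d, n_local = 4, 5, 50
w_true = rng.normal(size=d)
X = [rng.normal(size=(n_local, d)) for _ in range(K)]
y = [Xi @ w_true + 0.01 * rng.normal(size=n_local) for Xi in X]

def local_sgd(w, k, steps=10, lr=0.01):
    """A few local SGD steps on client k's private data."""
    w = w.copy()
    for _ in range(steps):
        i = rng.integers(n_local)
        w -= lr * (X[k][i] @ w - y[k][i]) * X[k][i]
    return w

w = np.zeros(d)
for _ in range(200):                             # communication rounds
    w = np.mean([local_sgd(w, k) for k in range(K)], axis=0)

print(np.linalg.norm(w - w_true))                # should be small
```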
arXiv Detail & Related papers (2020-07-11T05:59:08Z)