A Decentralized Approach to Bayesian Learning
- URL: http://arxiv.org/abs/2007.06799v4
- Date: Sat, 9 Jan 2021 17:01:44 GMT
- Title: A Decentralized Approach to Bayesian Learning
- Authors: Anjaly Parayil, He Bai, Jemin George, and Prudhvi Gurram
- Abstract summary: Motivated by decentralized approaches to machine learning, we propose a collaborative Bayesian learning algorithm taking the form of decentralized Langevin dynamics.
Our analysis shows that the initial KL divergence between the Markov chain and the target posterior distribution decreases exponentially.
The performance of individual agents with locally available data is on par with the centralized setting, with a considerable improvement in the convergence rate.
- Score: 26.74338464389837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by decentralized approaches to machine learning, we propose a
collaborative Bayesian learning algorithm taking the form of decentralized
Langevin dynamics in a non-convex setting. Our analysis shows that the initial
KL divergence between the Markov chain and the target posterior distribution
decreases exponentially, while the error contributions to the overall
KL divergence from the additive noise decrease in polynomial time. We
further show that the polynomial term experiences a speed-up with the number of
agents, and we provide sufficient conditions on the time-varying step sizes to
guarantee convergence to the desired distribution. The performance of the
proposed algorithm is evaluated on a wide variety of machine learning tasks.
The empirical results show that the performance of individual agents with
locally available data is on par with the centralized setting, with a
considerable improvement in the convergence rate.
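For concreteness, here is a minimal sketch of a decentralized Langevin update of the kind the abstract describes: each agent mixes its neighbors' iterates through a doubly stochastic matrix, takes a gradient step on its local negative log-posterior, and injects Gaussian noise. The mixing matrix `W`, the local potentials `grad_U`, the time-varying step-size schedule, and the toy problem are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def decentralized_langevin(grad_U, W, theta0, step, n_steps, rng=None):
    """Sketch of decentralized Langevin dynamics.

    grad_U : list of callables, grad_U[i](theta) = gradient of agent i's
             negative log-posterior (local potential).
    W      : (n_agents, n_agents) doubly stochastic mixing matrix.
    theta0 : (n_agents, dim) initial iterates, one row per agent.
    step   : callable k -> step size (time-varying, as in the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = theta0.copy()
    for k in range(n_steps):
        eta = step(k)
        mixed = W @ theta                          # consensus/mixing step
        grads = np.stack([g(t) for g, t in zip(grad_U, theta)])
        noise = rng.standard_normal(theta.shape)   # additive Gaussian noise
        theta = mixed - eta * grads + np.sqrt(2.0 * eta) * noise
    return theta

# Toy usage: two agents, each holding half of a Gaussian likelihood.
if __name__ == "__main__":
    grad_U = [lambda t: t - 1.0, lambda t: t + 1.0]   # local potentials
    W = np.array([[0.5, 0.5], [0.5, 0.5]])
    theta0 = np.zeros((2, 1))
    samples = decentralized_langevin(grad_U, W, theta0,
                                     step=lambda k: 0.1 / (k + 1) ** 0.55,
                                     n_steps=2000)
    print(samples)   # both agents end up sampling around the aggregate posterior mode at 0
```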
Related papers
- Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an accelerated Decentralized Federated Learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase.
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
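A rough sketch of how a Nesterov-style extrapolation could enter a decentralized aggregation phase is given below; the mixing matrix, local training routine, and momentum coefficient are illustrative assumptions, and the Moreau-envelope regularization of the local objectives (DFedCata's other component) is omitted.

```python
import numpy as np

def decentralized_round_with_extrapolation(models, W, local_train, beta=0.9,
                                           prev_agg=None):
    """One decentralized FL round with a Nesterov-style extrapolation step.

    models      : (n_clients, dim) current client parameters.
    W           : doubly stochastic mixing (gossip) matrix.
    local_train : callable(theta) -> updated theta after local epochs.
    beta        : extrapolation (momentum) coefficient -- illustrative value.
    prev_agg    : aggregated parameters from the previous round, if any.
    """
    # Local training; a Moreau-envelope proximal term would be added here in DFedCata.
    trained = np.stack([local_train(m) for m in models])
    agg = W @ trained                         # neighbor aggregation
    if prev_agg is None:
        return agg, agg
    # Nesterov-style extrapolation: look ahead along the aggregation direction.
    extrapolated = agg + beta * (agg - prev_agg)
    return extrapolated, agg
```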
arXiv Detail & Related papers (2024-10-09T06:17:16Z)
- ScoreFusion: fusing score-based generative models via Kullback-Leibler barycenters [8.08976346461518]
We introduce ScoreFusion, a theoretically grounded method for fusing multiple pre-trained diffusion models.
Our starting point considers the family of KL barycenters of the auxiliary populations, which is proven to be an optimal parametric class in the KL sense.
By recasting the learning problem as score matching in denoising diffusion, we obtain a tractable way of computing the optimal KL barycenter weights.
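The sketch below illustrates the general idea of fusing two pre-trained score functions through a convex (barycentric) combination whose weight is selected by a denoising score-matching criterion on target data; the parameterization, the omission of the diffusion time index, and the grid search are simplifications, not ScoreFusion's actual procedure.

```python
import numpy as np

def fused_score(x, scores, weights):
    """Convex combination of pre-trained score functions (barycentric fusion)."""
    return sum(w * s(x) for w, s in zip(weights, scores))

def fit_barycenter_weight(scores, data, noise_std=0.1, grid=101, seed=0):
    """Grid-search the weight of a two-model fusion by denoising score
    matching on held-out target data (a stand-in fitting criterion)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(data.shape)
    noisy = data + noise_std * eps
    target = -eps / noise_std                  # score of the Gaussian perturbation
    best_lam, best_loss = 0.0, np.inf
    for lam in np.linspace(0.0, 1.0, grid):
        pred = fused_score(noisy, scores, (lam, 1.0 - lam))
        loss = np.mean(np.sum((pred - target) ** 2, axis=-1))
        if loss < best_loss:
            best_lam, best_loss = lam, loss
    return best_lam
```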
arXiv Detail & Related papers (2024-06-28T03:02:25Z)
- Rethinking Clustered Federated Learning in NOMA Enhanced Wireless Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap, which measures the degree of non-IIDness in the data distribution, is presented.
Solutions to the challenges posed by non-IID conditions are proposed, together with an analysis of their properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
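The sampler itself is not reproduced here; as orientation, the sketch below shows the deterministic consensus-ADMM skeleton for a distributed ridge regression (one of the task families mentioned above), with the penalty and regularization parameters as illustrative assumptions.

```python
import numpy as np

def consensus_admm_regression(A_list, b_list, rho=1.0, lam=0.1, n_iters=100):
    """Consensus ADMM for distributed ridge regression:
        minimize  sum_i 0.5*||A_i x_i - b_i||^2 + 0.5*lam*||z||^2
        subject to x_i = z  for every agent i.
    """
    dim = A_list[0].shape[1]
    n = len(A_list)
    x = np.zeros((n, dim))           # local primal variables
    u = np.zeros((n, dim))           # scaled dual variables
    z = np.zeros(dim)                # consensus variable
    for _ in range(n_iters):
        for i, (A, b) in enumerate(zip(A_list, b_list)):
            # local proximal step (closed form for least squares)
            H = A.T @ A + rho * np.eye(dim)
            x[i] = np.linalg.solve(H, A.T @ b + rho * (z - u[i]))
        z = rho * np.sum(x + u, axis=0) / (lam + n * rho)     # z-update
        u += x - z                                            # dual ascent
    return z
```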
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
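At its simplest, algorithm unrolling turns a fixed number of optimizer iterations into layers with learnable parameters; the toy sketch below shows unrolled gradient descent with per-layer step sizes standing in for parameters that SURF-style training would learn, and it omits the federated and stochastic aspects.

```python
import numpy as np

def unrolled_gd(grad_loss, theta0, layer_steps):
    """Algorithm unrolling: a fixed number of gradient steps, one per 'layer',
    with per-layer step sizes that would normally be learned end-to-end."""
    theta = theta0
    for alpha in layer_steps:
        theta = theta - alpha * grad_loss(theta)
    return theta

# Toy usage: unroll 5 steps on a quadratic; the step sizes stand in for
# parameters an unrolled-learning procedure would fit.
print(unrolled_gd(lambda t: 2.0 * (t - 3.0), np.array([0.0]),
                  layer_steps=[0.4, 0.3, 0.2, 0.1, 0.05]))
```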
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Exact Subspace Diffusion for Decentralized Multitask Learning [17.592204922442832]
Distributed strategies for multitask learning induce relationships between agents in a more nuanced manner, and encourage collaboration without enforcing consensus.
We develop a generalization of the exact diffusion algorithm for subspace constrained multitask learning over networks, and derive an accurate expression for its mean-squared deviation.
We verify numerically the accuracy of the predicted performance expressions, as well as the improved performance of the proposed approach over alternatives based on approximate projections.
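For reference, the recursion below is the standard exact-diffusion update (adapt, correct, combine) for decentralized consensus optimization; the subspace-constrained multitask generalization developed in the paper, which replaces plain averaging with subspace projections, is not reproduced.

```python
import numpy as np

def exact_diffusion(grads, A, w0, mu, n_iters):
    """Exact diffusion (adapt-correct-combine) over a network.

    grads : list of callables, grads[k](w) = gradient of agent k's local cost.
    A     : symmetric doubly stochastic combination matrix.
    w0    : (n_agents, dim) initial iterates.
    mu    : step size.
    """
    A_bar = 0.5 * (np.eye(A.shape[0]) + A)   # typical exact-diffusion combination
    w = w0.copy()
    psi_prev = w0.copy()
    for _ in range(n_iters):
        psi = np.stack([wk - mu * g(wk) for g, wk in zip(grads, w)])  # adapt
        phi = psi + w - psi_prev                                      # correct
        w = A_bar @ phi                                               # combine
        psi_prev = psi
    return w
```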
arXiv Detail & Related papers (2023-04-14T19:42:19Z)
- Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
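As background, the basic (centralized, deterministic) extra-gradient step for a variational inequality with operator F looks as follows; the paper's method runs local stochastic versions of this step on each device with communication between them, which is not shown.

```python
import numpy as np

def extragradient(F, x0, gamma, n_iters, project=lambda x: x):
    """Extra-gradient method for the VI: find x* with <F(x*), x - x*> >= 0.

    F       : callable returning the operator value at x.
    gamma   : step size.
    project : projection onto the feasible set (identity for unconstrained).
    """
    x = x0
    for _ in range(n_iters):
        x_half = project(x - gamma * F(x))       # extrapolation (look-ahead) step
        x = project(x - gamma * F(x_half))       # update using look-ahead operator
    return x

# Toy usage: a bilinear saddle point min_u max_v u*v, written as a monotone VI.
if __name__ == "__main__":
    F = lambda z: np.array([z[1], -z[0]])        # (grad_u, -grad_v) of u*v
    print(extragradient(F, np.array([1.0, 1.0]), gamma=0.1, n_iters=2000))
    # converges toward the solution (0, 0)
```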
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
- Finite-Time Convergence Rates of Decentralized Stochastic Approximation with Applications in Multi-Agent and Multi-Task Learning [16.09467599829253]
We study a data-driven approach for finding the root of an operator under noisy measurements.
A network of agents, each with its own operator and data observations, cooperatively finds the fixed point of the aggregate operator over a decentralized communication graph.
Our main contribution is to provide a finite-time analysis of this decentralized approximation method when the data observed at each agent are sampled from a Markov process.
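A minimal sketch of the decentralized stochastic-approximation template described above: each agent mixes neighbors' iterates and moves along its own noisy operator. The operators, mixing matrix, step-size schedule, and the i.i.d. noise (standing in for the Markovian sampling analyzed in the paper) are illustrative assumptions.

```python
import numpy as np

def decentralized_sa(operators, W, x0, step, n_steps, noise_std=0.1, rng=None):
    """Decentralized stochastic approximation for root finding.

    operators : list of callables, operators[i](x) = agent i's operator G_i(x);
                the agents cooperatively seek a root of the aggregate sum_i G_i.
    W         : doubly stochastic mixing matrix over the communication graph.
    step      : callable k -> step size at iteration k.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for k in range(n_steps):
        mixed = W @ x                                        # consensus/mixing step
        local = np.stack([G(xi) for G, xi in zip(operators, x)])
        # i.i.d. Gaussian noise stands in for the noisy (Markovian) measurements
        noisy = local + noise_std * rng.standard_normal(x.shape)
        x = mixed + step(k) * noisy                          # local SA step
    return x
```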
arXiv Detail & Related papers (2020-10-28T17:01:54Z)
- A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms [67.67377846416106]
We present a distributional approach to theoretical analyses of reinforcement learning algorithms for constant step-sizes.
We show that value-based methods such as TD($\lambda$) and $Q$-Learning have update rules which are contractive in the space of distributions of functions.
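For reference, the constant-step-size update rule analyzed for $Q$-Learning is the familiar tabular one sketched below; the environment interface and hyper-parameters are illustrative assumptions, and the distributional analysis itself is not reproduced.

```python
import numpy as np

def q_learning(env, n_states, n_actions, alpha=0.1, gamma=0.99,
               eps=0.1, n_steps=10_000, rng=None):
    """Tabular Q-learning with a constant step size alpha.

    env is assumed to expose reset() -> state and step(a) -> (state, reward, done).
    """
    rng = np.random.default_rng() if rng is None else rng
    Q = np.zeros((n_states, n_actions))
    s = env.reset()
    for _ in range(n_steps):
        # epsilon-greedy behaviour policy
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next, r, done = env.step(a)
        # constant-step-size update toward the sampled Bellman target
        target = r + (0.0 if done else gamma * np.max(Q[s_next]))
        Q[s, a] += alpha * (target - Q[s, a])
        s = env.reset() if done else s_next
    return Q
```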
arXiv Detail & Related papers (2020-03-27T05:13:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.