On Convergence of Federated Averaging Langevin Dynamics
- URL: http://arxiv.org/abs/2112.05120v4
- Date: Thu, 5 Oct 2023 15:11:54 GMT
- Title: On Convergence of Federated Averaging Langevin Dynamics
- Authors: Wei Deng, Qian Zhang, Yi-An Ma, Zhao Song, Guang Lin
- Abstract summary: We propose a federated averaging Langevin algorithm (FA-LD) for uncertainty quantification and mean predictions with distributed clients.
We develop theoretical guarantees for FA-LD for strongly log-concave distributions with non-i.i.d. data.
We show convergence results based on different averaging schemes where only partial device updates are available.
- Score: 22.013125418713763
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a federated averaging Langevin algorithm (FA-LD) for uncertainty
quantification and mean predictions with distributed clients. In particular, we
generalize beyond normal posterior distributions and consider a general class
of models. We develop theoretical guarantees for FA-LD for strongly log-concave
distributions with non-i.i.d. data and study how the injected noise and the
stochastic-gradient noise, the heterogeneity of data, and the varying learning
rates affect the convergence. Such an analysis sheds light on the optimal
choice of local updates to minimize communication costs. Importantly, the
communication efficiency of our approach does not deteriorate with the
injected noise in the Langevin algorithms. In addition, we examine in our FA-LD
algorithm both independent and correlated noise used over different clients. We
observe pairwise trade-offs among communication, accuracy,
and data privacy. As local devices may become inactive in federated networks,
we also show convergence results based on different averaging schemes where
only partial device updates are available. In such a case, we discover an
additional bias that does not decay to zero.
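The FA-LD recipe itself is simple to state even though the analysis is not. Below is a minimal sketch, assuming a strongly log-concave target split across K clients; all function and parameter names are ours, and the step size and noise scaling are simplified relative to the paper's analysis. Each client runs local Langevin steps (a gradient step on its own potential plus independently injected Gaussian noise), and the server periodically averages the chains over whichever devices are active.

```python
import numpy as np

def fa_ld(grad_fns, x0, eta=1e-2, n_rounds=200, local_steps=5,
          participation=1.0, rng=None):
    """Minimal FA-LD sketch: local Langevin updates on each client,
    periodic server-side averaging over the active devices."""
    rng = rng or np.random.default_rng(0)
    K, d = len(grad_fns), x0.shape[0]
    x = np.tile(x0, (K, 1))                       # one Langevin chain per client
    samples = []
    for _ in range(n_rounds):
        for k in range(K):                        # local updates, no communication
            for _ in range(local_steps):
                x[k] -= eta * grad_fns[k](x[k])   # gradient of client k's potential
                x[k] += np.sqrt(2.0 * eta) * rng.standard_normal(d)  # injected noise
        active = rng.random(K) < participation    # devices may be inactive
        if active.any():
            avg = x[active].mean(axis=0)          # federated averaging step
            x[active] = avg
            samples.append(avg.copy())
    return np.array(samples)
```

Setting participation below 1 mimics the partial-device averaging schemes mentioned above, the setting in which the abstract notes an additional bias that does not decay to zero.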
Related papers
- Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays [0.0]
Federated learning (FL) was recently proposed to securely train models with data held over multiple locations ("clients").
Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-i.i.d. local data distributions ("client drift").
We propose and analyze Asynchronous Exact Averaging (AREA), a new (sub)gradient algorithm that utilizes communication to speed up convergence and enhance scalability, and employs client memory to correct the client drift caused by variations in client update frequencies.
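The summary does not spell out AREA's update rule, so the following is only a hypothetical reading of "exact averaging with client memory": the server keeps each client's most recent contribution in a dedicated slot and always averages over all slots, so clients that report more often do not dominate the aggregate. Names and mechanism are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

class ExactAveragingServer:
    """Hypothetical sketch: per-client memory slots make the aggregate a
    uniform average over clients regardless of update frequency."""
    def __init__(self, num_clients, dim):
        self.memory = np.zeros((num_clients, dim))   # latest model per client

    def receive(self, client_id, model):
        self.memory[client_id] = model               # overwrite stale contribution

    def aggregate(self):
        return self.memory.mean(axis=0)              # exact uniform average
```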
arXiv Detail & Related papers (2024-05-16T14:22:49Z) - Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp) at the network edge.
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For the different types of local updates that edge devices can transmit (i.e., model, gradient, model difference), we reveal that transmitting them in AirFedAvg may introduce an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
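To make the aggregation error concrete: in AirComp, clients transmit analog signals simultaneously and the wireless channel superimposes them, so the server reads off a sum directly but receiver noise lands on the aggregate. A toy sketch assuming ideal channel-inversion power control (all names are ours):

```python
import numpy as np

def aircomp_average(updates, channel_gains, noise_std=0.1, rng=None):
    """Toy AirComp aggregation: pre-scale to invert the channel, let the
    channel sum the analog signals, then normalize. The additive receiver
    noise is the aggregation error that perturbs AirFedAvg."""
    rng = rng or np.random.default_rng(0)
    K, d = updates.shape
    tx = updates / channel_gains[:, None]               # channel-inversion precoding
    rx = (channel_gains[:, None] * tx).sum(axis=0)      # over-the-air superposition
    rx += noise_std * rng.standard_normal(d)            # receiver noise
    return rx / K                                       # noisy estimate of the mean
```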
arXiv Detail & Related papers (2023-10-16T05:49:28Z) - Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs)
We turn to Noise-Contrastive Estimation (NCE), which grounds this self-supervised task as an estimation problem for an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
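For reference, NCE turns density estimation into classifying data against noise samples through the log-density ratio. A minimal sketch of the textbook binary-NCE objective with equal numbers of data and noise samples (log_model and log_noise are assumed callables returning per-sample log-densities; this is not the paper's specific estimator):

```python
import numpy as np

def nce_loss(log_model, log_noise, data, noise_samples):
    """Binary NCE: logistic loss on the log-ratio G(x) = log p_model(x)
    - log p_noise(x), with data labeled 1 and noise labeled 0."""
    g_data = log_model(data) - log_noise(data)
    g_noise = log_model(noise_samples) - log_noise(noise_samples)
    pos = np.logaddexp(0.0, -g_data)     # -log sigmoid(G) on data
    neg = np.logaddexp(0.0, g_noise)     # -log(1 - sigmoid(G)) on noise
    return pos.mean() + neg.mean()
```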
arXiv Detail & Related papers (2023-01-23T19:57:58Z) - DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework that simultaneously addresses two key challenges in this setting: distribution shift across clients and noisy local data.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
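The "local mixup" ingredient is standard mixup applied within a client's own batch; a minimal sketch follows (the full DRFLM objective adds distributionally robust weighting across clients, which this omits):

```python
import numpy as np

def local_mixup(x, y, alpha=0.2, rng=None):
    """Mixup on one client's batch: convex combinations of random
    example pairs and of their (one-hot) labels."""
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)         # mixing coefficient
    perm = rng.permutation(len(x))       # random pairing within the batch
    return lam * x + (1 - lam) * x[perm], lam * y + (1 - lam) * y[perm]
```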
arXiv Detail & Related papers (2022-04-16T08:08:29Z) - The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the common assumption that the optimal noise matches the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z) - Robust Estimation for Nonparametric Families via Generative Adversarial Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these to robust mean estimation, second-moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
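As a concrete toy version of the "smoothed Kolmogorov-Smirnov" view: the classical KS distance is a supremum over hard-threshold discriminators of the gap in mean response between two samples, and smoothing replaces the hard threshold with a sigmoid. The sketch below scans random one-dimensional projections and thresholds; all names are ours, and the paper optimizes a discriminator network rather than enumerating thresholds.

```python
import numpy as np

def smoothed_ks(p_samples, q_samples, n_dirs=64, temp=10.0, rng=None):
    """Largest gap in mean sigmoid response over random projections and
    thresholds: a smoothed, sliced stand-in for the KS distance."""
    rng = rng or np.random.default_rng(0)
    best = 0.0
    for _ in range(n_dirs):
        w = rng.standard_normal(p_samples.shape[1])
        w /= np.linalg.norm(w)                       # random unit direction
        zp, zq = p_samples @ w, q_samples @ w        # 1-D projections
        thresholds = np.quantile(np.concatenate([zp, zq]), np.linspace(0, 1, 20))
        for b in thresholds:
            sp = 1.0 / (1.0 + np.exp(-temp * (zp - b)))  # smooth indicator
            sq = 1.0 / (1.0 + np.exp(-temp * (zq - b)))
            best = max(best, abs(sp.mean() - sq.mean()))
    return best
```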
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - STRONG: Synchronous and asynchronous RObust Network localization, under Non-Gaussian noise [0.0]
Real-world network applications must cope with failing nodes, malicious attacks and data classified as outliers.
Our work addresses these concerns in the scope of sensor network localization algorithms.
A major highlight of our contribution is that we pay no price for provable distributed computation, neither in accuracy nor in communication cost or speed.
arXiv Detail & Related papers (2021-10-01T18:01:28Z) - Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains where the problem data are heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
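The extra-gradient step at the core of the method is easy to state in its centralized, deterministic form; the paper's contribution is the decentralized, stochastic, local-update analysis layered on top. A minimal sketch on a bilinear saddle-point problem:

```python
import numpy as np

def extragradient_step(w, operator, step=0.1):
    """One extra-gradient step for a VI with operator F: extrapolate
    with F(w), then update using F at the look-ahead point."""
    w_half = w - step * operator(w)        # extrapolation (look-ahead)
    return w - step * operator(w_half)     # update with look-ahead operator

# min_x max_y x*y has the monotone operator F(x, y) = (y, -x).
F = lambda v: np.array([v[1], -v[0]])
w = np.array([1.0, 1.0])
for _ in range(2000):
    w = extragradient_step(w, F)
# w approaches the equilibrium (0, 0); plain gradient descent-ascent
# on this problem would spiral away from it.
```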
arXiv Detail & Related papers (2021-06-15T17:45:51Z) - Wireless Federated Learning with Limited Communication and Differential Privacy [21.328507360172203]
This paper investigates the role of dimensionality reduction in achieving communication efficiency and differential privacy (DP) for the local datasets of remote users in over-the-air computation (AirComp)-based federated learning (FL).
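One natural pipeline matching this description: project the local update to a lower dimension, clip its norm, and add Gaussian noise before transmission, so the same noisy low-dimensional vector serves both communication efficiency and DP. A hypothetical sketch; names and the exact mechanism are assumptions, not necessarily the paper's scheme.

```python
import numpy as np

def sketch_and_privatize(update, k=64, clip=1.0, noise_mult=1.0, rng=None):
    """Hypothetical pipeline: random projection (dimensionality
    reduction), norm clipping, then Gaussian-mechanism noise."""
    rng = rng or np.random.default_rng(0)
    d = update.shape[0]
    proj = rng.standard_normal((k, d)) / np.sqrt(k)     # JL-style sketch
    z = proj @ update                                   # k-dimensional message
    z *= min(1.0, clip / (np.linalg.norm(z) + 1e-12))   # bound sensitivity
    return z + noise_mult * clip * rng.standard_normal(k)  # DP noise
```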
arXiv Detail & Related papers (2021-06-01T15:23:12Z) - Federated Learning with Compression: Unified Analysis and Sharp Guarantees [39.092596142018195]
Communication cost is often a critical bottleneck to scale up distributed optimization algorithms to collaboratively learn a model from millions of devices.
Two notable trends to deal with the communication overhead of federated algorithms are gradient compression and local computation with periodic communication.
We analyze their convergence in both homogeneous and heterogeneous data distribution settings.
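To illustrate the compression-plus-local-computation recipe, here is a generic sketch of one aggregation round with top-k sparsified model differences; top-k is a common compressor chosen for illustration, not necessarily the paper's exact operator or update rule.

```python
import numpy as np

def top_k(v, k):
    """Keep only the k largest-magnitude coordinates of v."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def compressed_round(server_model, local_models, k):
    """One round: each client sends a compressed model difference and
    the server averages the compressed updates."""
    deltas = [top_k(m - server_model, k) for m in local_models]
    return server_model + np.mean(deltas, axis=0)
```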
arXiv Detail & Related papers (2020-07-02T14:44:07Z)