Convergence and Accuracy Trade-Offs in Federated Learning and
Meta-Learning
- URL: http://arxiv.org/abs/2103.05032v1
- Date: Mon, 8 Mar 2021 19:40:32 GMT
- Title: Convergence and Accuracy Trade-Offs in Federated Learning and
Meta-Learning
- Authors: Zachary Charles, Jakub Konečný
- Abstract summary: We study a family of algorithms, which we refer to as local update methods.
We prove that for quadratic models, local update methods are equivalent to first-order optimization on a surrogate loss.
We derive novel convergence rates showcasing these trade-offs and highlight their importance in communication-limited settings.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a family of algorithms, which we refer to as local update methods,
generalizing many federated and meta-learning algorithms. We prove that for
quadratic models, local update methods are equivalent to first-order
optimization on a surrogate loss we exactly characterize. Moreover, fundamental
algorithmic choices (such as learning rates) explicitly govern a trade-off
between the condition number of the surrogate loss and its alignment with the
true loss. We derive novel convergence rates showcasing these trade-offs and
highlight their importance in communication-limited settings. Using these
insights, we are able to compare local update methods based on their
convergence/accuracy trade-off, not just their convergence to critical points
of the empirical loss. Our results shed new light on a broad range of
phenomena, including the efficacy of server momentum in federated learning and
the impact of proximal client updates.
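To make the surrogate-loss picture concrete, the following is a minimal numerical sketch (not the authors' code): it assumes the per-client surrogate Hessian I - (I - gamma*A_i)^K for K local gradient steps with client learning rate gamma on a quadratic with Hessian A_i, verifies that one FedAvg-style round then coincides with a single gradient descent step on the averaged surrogate, and prints how a larger client learning rate improves the surrogate's conditioning while pulling its minimizer away from that of the true averaged loss.

    import numpy as np

    rng = np.random.default_rng(0)
    d, n_clients, K = 5, 4, 10          # dimension, number of clients, local steps

    # Client i holds a quadratic loss f_i(x) = 0.5 * (x - c_i)^T A_i (x - c_i).
    A = [np.diag(rng.uniform(0.5, 5.0, size=d)) for _ in range(n_clients)]
    c = [rng.normal(size=d) for _ in range(n_clients)]

    def fedavg_round(x, gamma):
        """One round: every client runs K local gradient steps, the server averages."""
        local_models = []
        for Ai, ci in zip(A, c):
            xi = x.copy()
            for _ in range(K):
                xi = xi - gamma * Ai @ (xi - ci)          # local gradient step
            local_models.append(xi)
        return np.mean(local_models, axis=0)

    def surrogate(gamma):
        """Assumed per-client surrogate Hessian: I - (I - gamma*A_i)^K."""
        T = [np.eye(d) - np.linalg.matrix_power(np.eye(d) - gamma * Ai, K) for Ai in A]
        H = np.mean(T, axis=0)                            # surrogate Hessian
        b = np.mean([Ti @ ci for Ti, ci in zip(T, c)], axis=0)
        return H, b                                       # surrogate gradient at x is H @ x - b

    # Minimizer of the true averaged loss f = mean_i f_i.
    x_true = np.linalg.solve(np.mean(A, axis=0),
                             np.mean([Ai @ ci for Ai, ci in zip(A, c)], axis=0))

    x0 = rng.normal(size=d)
    for gamma in (0.01, 0.05, 0.15):
        H, b = surrogate(gamma)
        # One FedAvg round coincides with one unit-step gradient descent step
        # on the surrogate loss 0.5 * x^T H x - b^T x (up to a constant).
        assert np.allclose(fedavg_round(x0, gamma), x0 - (H @ x0 - b))
        x_surr = np.linalg.solve(H, b)
        print(f"gamma={gamma:.2f}  cond(surrogate)={np.linalg.cond(H):5.2f}  "
              f"||x*_surrogate - x*_true||={np.linalg.norm(x_surr - x_true):.3f}")

All constants here (dimension, client count, number of local steps, and the learning-rate grid) are arbitrary illustrative choices.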
Related papers
- Aiding Global Convergence in Federated Learning via Local Perturbation and Mutual Similarity Information [6.767885381740953]
Federated learning has emerged as a distributed optimization paradigm.
We propose a novel modified framework wherein each client locally performs a perturbed gradient step.
We show that our algorithm speeds up convergence by a margin of up to 30 global rounds compared with FedAvg.
arXiv Detail & Related papers (2024-10-07T23:14:05Z)
- Flashback: Understanding and Mitigating Forgetting in Federated Learning [7.248285042377168]
In Federated Learning (FL), forgetting, or the loss of knowledge across rounds, hampers algorithm convergence.
We introduce a metric that measures forgetting at a granular level, ensuring it is recognized distinctly from new knowledge acquisition.
We propose Flashback, an FL algorithm that uses a dynamic distillation approach to regularize the local models and effectively aggregate their knowledge.
arXiv Detail & Related papers (2024-02-08T10:52:37Z)
- Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For the different types of local updates that edge devices can transmit (i.e., model, gradient, or model difference), we reveal that transmission in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup
for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- FedET: A Communication-Efficient Federated Class-Incremental Learning
Framework Based on Enhanced Transformer [42.19443600254834]
We propose a novel framework, Federated Enhanced Transformer (FedET), which simultaneously achieves high accuracy and low communication cost.
FedET uses Enhancer, a tiny module, to absorb and communicate new knowledge.
We show that FedET's average accuracy on representative benchmark datasets is 14.1% higher than that of the state-of-the-art method.
arXiv Detail & Related papers (2023-06-27T10:00:06Z)
- Global Update Guided Federated Learning [11.731231528534035]
Federated learning protects data privacy and security by exchanging models instead of data.
We propose global-update-guided federated learning (FedGG), which introduces a model-cosine loss into local objective functions.
Numerical simulations show that FedGG significantly improves model convergence accuracy and speed.
arXiv Detail & Related papers (2022-04-08T08:36:26Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated
Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation
Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- On the Outsized Importance of Learning Rates in Local Update Methods [2.094022863940315]
We study a family of algorithms, which we refer to as local update methods, that generalize many federated learning and meta-learning algorithms.
We prove that for quadratic objectives, local update methods perform gradient descent on a surrogate loss function which we exactly characterize.
We show that the choice of client learning rate controls the condition number of that surrogate loss, as well as the distance between the minimizers of the surrogate and true loss functions.
arXiv Detail & Related papers (2020-07-02T04:45:55Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model for the true minimizer of the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors: the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
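As a rough illustration of the setting in the Dynamic Federated Learning entry above (a simulation sketch under assumed dynamics, not the paper's model or code): at every round a random subset of agents takes a few noisy local gradient steps toward a minimizer that follows a random walk, the server averages the participating agents, and the steady-state tracking error is reported for a small and a large learning rate.

    import numpy as np

    rng = np.random.default_rng(1)
    d, n_agents, participating = 5, 20, 5
    rounds, local_steps = 500, 2
    sigma_walk, sigma_grad = 0.05, 0.1      # drift of the true minimizer, gradient noise

    def tracking_error(lr):
        """Partial-participation local updates tracking a drifting minimizer."""
        w_star = np.zeros(d)                # true minimizer, follows a random walk
        x = np.zeros(d)                     # server model
        errors = []
        for _ in range(rounds):
            w_star = w_star + sigma_walk * rng.normal(size=d)   # non-stationary target
            chosen = rng.choice(n_agents, size=participating, replace=False)
            local_models = []
            for _agent in chosen:
                xi = x.copy()
                for _ in range(local_steps):
                    # noisy gradient of 0.5 * ||xi - w_star||^2 (data variability as noise)
                    g = (xi - w_star) + sigma_grad * rng.normal(size=d)
                    xi = xi - lr * g
                local_models.append(xi)
            x = np.mean(local_models, axis=0)                   # average participants
            errors.append(np.linalg.norm(x - w_star))
        return np.mean(errors[rounds // 2:])                    # steady-state average

    for lr in (0.02, 0.2):
        print(f"lr={lr:.2f}  mean tracking error = {tracking_error(lr):.3f}")

With these (arbitrary) constants, the smaller learning rate tracks the drifting minimizer more poorly, consistent with a tracking term that grows as the learning rate shrinks.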
This list is automatically generated from the titles and abstracts of the papers on this site.