Convergence and Accuracy Trade-Offs in Federated Learning and
Meta-Learning
- URL: http://arxiv.org/abs/2103.05032v1
- Date: Mon, 8 Mar 2021 19:40:32 GMT
- Title: Convergence and Accuracy Trade-Offs in Federated Learning and
Meta-Learning
- Authors: Zachary Charles, Jakub Konečný
- Abstract summary: We study a family of algorithms, which we refer to as local update methods.
We prove that for quadratic models, local update methods are equivalent to first-order optimization on a surrogate loss.
We derive novel convergence rates showcasing these trade-offs and highlight their importance in communication-limited settings.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a family of algorithms, which we refer to as local update methods,
generalizing many federated and meta-learning algorithms. We prove that for
quadratic models, local update methods are equivalent to first-order
optimization on a surrogate loss we exactly characterize. Moreover, fundamental
algorithmic choices (such as learning rates) explicitly govern a trade-off
between the condition number of the surrogate loss and its alignment with the
true loss. We derive novel convergence rates showcasing these trade-offs and
highlight their importance in communication-limited settings. Using these
insights, we are able to compare local update methods based on their
convergence/accuracy trade-off, not just their convergence to critical points
of the empirical loss. Our results shed new light on a broad range of
phenomena, including the efficacy of server momentum in federated learning and
the impact of proximal client updates.
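To make the surrogate-loss picture concrete, the following is a minimal numerical sketch (not the authors' code): it assumes the per-client surrogate Hessian I - (I - gamma*A_i)^K for K local gradient steps with client learning rate gamma on a quadratic with Hessian A_i, verifies that one FedAvg-style round then coincides with a single gradient descent step on the averaged surrogate, and prints how a larger client learning rate improves the surrogate's conditioning while pulling its minimizer away from that of the true averaged loss.

    import numpy as np

    rng = np.random.default_rng(0)
    d, n_clients, K = 5, 4, 10          # dimension, number of clients, local steps

    # Client i holds a quadratic loss f_i(x) = 0.5 * (x - c_i)^T A_i (x - c_i).
    A = [np.diag(rng.uniform(0.5, 5.0, size=d)) for _ in range(n_clients)]
    c = [rng.normal(size=d) for _ in range(n_clients)]

    def fedavg_round(x, gamma):
        """One round: every client runs K local gradient steps, the server averages."""
        local_models = []
        for Ai, ci in zip(A, c):
            xi = x.copy()
            for _ in range(K):
                xi = xi - gamma * Ai @ (xi - ci)          # local gradient step
            local_models.append(xi)
        return np.mean(local_models, axis=0)

    def surrogate(gamma):
        """Assumed per-client surrogate Hessian: I - (I - gamma*A_i)^K."""
        T = [np.eye(d) - np.linalg.matrix_power(np.eye(d) - gamma * Ai, K) for Ai in A]
        H = np.mean(T, axis=0)                            # surrogate Hessian
        b = np.mean([Ti @ ci for Ti, ci in zip(T, c)], axis=0)
        return H, b                                       # surrogate gradient at x is H @ x - b

    # Minimizer of the true averaged loss f = mean_i f_i.
    x_true = np.linalg.solve(np.mean(A, axis=0),
                             np.mean([Ai @ ci for Ai, ci in zip(A, c)], axis=0))

    x0 = rng.normal(size=d)
    for gamma in (0.01, 0.05, 0.15):
        H, b = surrogate(gamma)
        # One FedAvg round coincides with one unit-step gradient descent step
        # on the surrogate loss 0.5 * x^T H x - b^T x (up to a constant).
        assert np.allclose(fedavg_round(x0, gamma), x0 - (H @ x0 - b))
        x_surr = np.linalg.solve(H, b)
        print(f"gamma={gamma:.2f}  cond(surrogate)={np.linalg.cond(H):5.2f}  "
              f"||x*_surrogate - x*_true||={np.linalg.norm(x_surr - x_true):.3f}")

All constants here (dimension, client count, number of local steps, and the learning-rate grid) are arbitrary illustrative choices.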
Related papers
- Aiding Global Convergence in Federated Learning via Local Perturbation and Mutual Similarity Information [6.767885381740953]
Federated learning has emerged as a distributed optimization paradigm.
We propose a novel modified framework wherein each client locally performs a perturbed gradient step.
We show that our algorithm speeds up convergence by a margin of up to 30 global rounds compared with FedAvg.
arXiv Detail & Related papers (2024-10-07T23:14:05Z)
- Flashback: Understanding and Mitigating Forgetting in Federated Learning [7.248285042377168]
In Federated Learning (FL), forgetting, or the loss of knowledge across rounds, hampers algorithm convergence.
We introduce a metric that measures forgetting at a granular level, ensuring it is recognized distinctly from new knowledge acquisition.
We propose Flashback, an FL algorithm that uses a dynamic distillation approach to regularize the local models and effectively aggregate their knowledge.
arXiv Detail & Related papers (2024-02-08T10:52:37Z)
- Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For the different types of local updates that edge devices can transmit (i.e., model, gradient, or model difference), we reveal that transmission in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup
for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- FedET: A Communication-Efficient Federated Class-Incremental Learning
Framework Based on Enhanced Transformer [42.19443600254834]
We propose a novel framework, Federated Enhanced Transformer (FedET), which simultaneously achieves high accuracy and low communication cost.
FedET uses Enhancer, a tiny module, to absorb and communicate new knowledge.
We show that FedET's average accuracy on representative benchmark datasets is 14.1% higher than that of the state-of-the-art method.
arXiv Detail & Related papers (2023-06-27T10:00:06Z)
- Global Update Guided Federated Learning [11.731231528534035]
Federated learning protects data privacy and security by exchanging models instead of data.
We propose global-update-guided federated learning (FedGG), which introduces a model-cosine loss into local objective functions.
Numerical simulations show that FedGG significantly improves model convergence accuracy and speed.
arXiv Detail & Related papers (2022-04-08T08:36:26Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated
Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation
Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- On the Outsized Importance of Learning Rates in Local Update Methods [2.094022863940315]
We study a family of algorithms, which we refer to as local update methods, that generalize many federated learning and meta-learning algorithms.
We prove that for quadratic objectives, local update methods perform gradient descent on a surrogate loss function which we exactly characterize.
We show that the choice of client learning rate controls the condition number of that surrogate loss, as well as the distance between the minimizers of the surrogate and true loss functions.
arXiv Detail & Related papers (2020-07-02T04:45:55Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model for the true minimizer of the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors: the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
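As a rough illustration of the setting in the Dynamic Federated Learning entry above (a simulation sketch under assumed dynamics, not the paper's model or code): at every round a random subset of agents takes a few noisy local gradient steps toward a minimizer that follows a random walk, the server averages the participating agents, and the steady-state tracking error is reported for a small and a large learning rate.

    import numpy as np

    rng = np.random.default_rng(1)
    d, n_agents, participating = 5, 20, 5
    rounds, local_steps = 500, 2
    sigma_walk, sigma_grad = 0.05, 0.1      # drift of the true minimizer, gradient noise

    def tracking_error(lr):
        """Partial-participation local updates tracking a drifting minimizer."""
        w_star = np.zeros(d)                # true minimizer, follows a random walk
        x = np.zeros(d)                     # server model
        errors = []
        for _ in range(rounds):
            w_star = w_star + sigma_walk * rng.normal(size=d)   # non-stationary target
            chosen = rng.choice(n_agents, size=participating, replace=False)
            local_models = []
            for _agent in chosen:
                xi = x.copy()
                for _ in range(local_steps):
                    # noisy gradient of 0.5 * ||xi - w_star||^2 (data variability as noise)
                    g = (xi - w_star) + sigma_grad * rng.normal(size=d)
                    xi = xi - lr * g
                local_models.append(xi)
            x = np.mean(local_models, axis=0)                   # average participants
            errors.append(np.linalg.norm(x - w_star))
        return np.mean(errors[rounds // 2:])                    # steady-state average

    for lr in (0.02, 0.2):
        print(f"lr={lr:.2f}  mean tracking error = {tracking_error(lr):.3f}")

With these (arbitrary) constants, the smaller learning rate tracks the drifting minimizer more poorly, consistent with a tracking term that grows as the learning rate shrinks.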
This list is automatically generated from the titles and abstracts of the papers on this site.