Fed-LAMB: Layerwise and Dimensionwise Locally Adaptive Optimization Algorithm
- URL: http://arxiv.org/abs/2110.00532v1
- Date: Fri, 1 Oct 2021 16:54:31 GMT
- Title: Fed-LAMB: Layerwise and Dimensionwise Locally Adaptive Optimization Algorithm
- Authors: Belhal Karimi, Xiaoyun Li, Ping Li
- Abstract summary: In the emerging paradigm of federated learning (FL), a large number of clients, such as mobile devices, are used to train models on their respective data.
Due to the low bandwidth of mobile devices, decentralized optimization methods need to shift the computation burden from those clients to the computation server.
We present Fed-LAMB, a novel federated learning method based on layerwise and dimensionwise updates of deep neural networks.
- Score: 24.42828071396353
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the emerging paradigm of federated learning (FL), a large number of
clients, such as mobile devices, are used to train possibly high-dimensional
models on their respective data. Due to the low bandwidth of mobile devices,
decentralized optimization methods need to shift the computation burden from
those clients to the computation server while preserving privacy and a
reasonable communication cost. In this paper, we focus on the training of deep,
as in multilayered, neural networks under the FL setting. We present Fed-LAMB, a
novel federated learning method based on layerwise and dimensionwise updates of
the local models, alleviating the nonconvexity and the multilayered nature
of the optimization task at hand. We provide a thorough finite-time convergence
analysis for Fed-LAMB characterizing how fast its gradient decreases. We
provide experimental results under iid and non-iid settings that not only
corroborate our theory, but also exhibit the faster convergence of our method
compared to the state-of-the-art.
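To make the layerwise-and-dimensionwise idea concrete, below is a minimal Python/NumPy sketch of how a LAMB-style local step (dimensionwise Adam-like moments scaled by a layerwise trust ratio) could be combined with FedAvg-style averaging. The function names, hyperparameters, aggregation rule, and the choice to reset optimizer state each round are illustrative assumptions, not the exact Fed-LAMB algorithm described in the paper.

```python
import numpy as np

def lamb_local_step(layers, grads, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-6):
    """One layerwise + dimensionwise adaptive update on a single client."""
    new_layers = []
    for i, (w, g) in enumerate(zip(layers, grads)):
        m, v = state.setdefault(i, (np.zeros_like(w), np.zeros_like(w)))
        m = beta1 * m + (1 - beta1) * g          # dimensionwise first moment
        v = beta2 * v + (1 - beta2) * g * g      # dimensionwise second moment
        state[i] = (m, v)
        update = m / (np.sqrt(v) + eps)          # Adam-like per-coordinate direction
        # Layerwise trust ratio (as in LAMB): scale the step by ||w|| / ||update||.
        trust = np.linalg.norm(w) / (np.linalg.norm(update) + eps)
        new_layers.append(w - lr * trust * update)
    return new_layers

def federated_round(global_layers, client_grad_fns, local_steps=5):
    """Each client runs LAMB-style local steps; the server averages the results."""
    client_models = []
    for grad_fn in client_grad_fns:
        layers, state = [w.copy() for w in global_layers], {}
        for _ in range(local_steps):
            layers = lamb_local_step(layers, grad_fn(layers), state)
        client_models.append(layers)
    # Simple unweighted FedAvg-style aggregation of the locally adapted models.
    return [np.mean([cm[i] for cm in client_models], axis=0)
            for i in range(len(global_layers))]

# Toy usage: two clients fitting a one-layer linear model to different local data.
rng = np.random.default_rng(0)
w0 = [rng.normal(size=(4, 1))]
data = [(rng.normal(size=(32, 4)), rng.normal(size=(32, 1))) for _ in range(2)]
grad_fns = [lambda layers, X=X, y=y: [2 * X.T @ (X @ layers[0] - y) / len(X)]
            for X, y in data]
w1 = federated_round(w0, grad_fns)
```

In this sketch each `grad_fn` returns per-layer stochastic gradients computed on one client's local data; whether momentum and variance states persist across rounds, and how the server weights clients, are design choices the paper itself specifies.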
Related papers
- Aiding Global Convergence in Federated Learning via Local Perturbation and Mutual Similarity Information [6.767885381740953]
Federated learning has emerged as a distributed optimization paradigm.
We propose a novel modified framework wherein each client locally performs a perturbed gradient step.
We show that our algorithm speeds up convergence by a margin of up to 30 global rounds compared with FedAvg.
arXiv Detail & Related papers (2024-10-07T23:14:05Z) - Heterogeneity-Aware Resource Allocation and Topology Design for Hierarchical Federated Edge Learning [9.900317349372383]
Federated Learning (FL) provides a privacy-preserving framework for training machine learning models on mobile edge devices.
Traditional FL algorithms, e.g., FedAvg, impose a heavy communication workload on these devices.
We propose a two-tier HFEL system, where edge devices are connected to edge servers and edge servers are interconnected through peer-to-peer (P2P) edge backhauls.
Our goal is to enhance the training efficiency of the HFEL system through strategic resource allocation and topology design.
arXiv Detail & Related papers (2024-09-29T01:48:04Z) - Hyperdimensional Computing Empowered Federated Foundation Model over Wireless Networks for Metaverse [56.384390765357004]
We propose an integrated federated split learning and hyperdimensional computing framework for emerging foundation models.
This novel approach reduces communication costs, computation load, and privacy risks, making it suitable for resource-constrained edge devices in the Metaverse.
arXiv Detail & Related papers (2024-08-26T17:03:14Z) - Gradient-Congruity Guided Federated Sparse Training [31.793271982853188]
Federated learning (FL) is a distributed machine learning technique that facilitates collaborative model training while preserving data privacy.
However, FL also faces challenges such as high computational and communication costs for resource-constrained devices.
We propose Gradient-Congruity Guided Federated Sparse Training (FedSGC), a novel method that integrates dynamic sparse training and gradient congruity inspection into the federated learning framework.
arXiv Detail & Related papers (2024-05-02T11:29:48Z) - Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z) - FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup
for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z) - Vertical Federated Learning over Cloud-RAN: Convergence Analysis and
System Optimization [82.12796238714589]
We propose a novel cloud radio access network (Cloud-RAN) based vertical FL system to enable fast and accurate model aggregation.
We characterize the convergence behavior of the vertical FL algorithm considering both uplink and downlink transmissions.
We establish a system optimization framework by joint transceiver and fronthaul quantization design, for which successive convex approximation and alternate convex search based system optimization algorithms are developed.
arXiv Detail & Related papers (2023-05-04T09:26:03Z) - Gradient Sparsification for Efficient Wireless Federated Learning with
Differential Privacy [25.763777765222358]
Federated learning (FL) enables distributed clients to collaboratively train a machine learning model without sharing raw data with each other.
As the model size grows, the training latency increases due to the limited transmission bandwidth, and model performance degrades while using differential privacy (DP) protection.
We propose a gradient-sparsification-empowered FL framework over wireless channels to improve training efficiency without sacrificing convergence performance; a minimal top-k sparsification sketch appears after this list.
arXiv Detail & Related papers (2023-04-09T05:21:15Z) - Predictive GAN-powered Multi-Objective Optimization for Hybrid Federated
Split Learning [56.125720497163684]
We propose a hybrid federated split learning framework in wireless networks.
We design a parallel computing scheme for model splitting without label sharing, and theoretically analyze the influence of the delayed gradient caused by the scheme on the convergence speed.
arXiv Detail & Related papers (2022-09-02T10:29:56Z) - Large Batch Training Does Not Need Warmup [111.07680619360528]
Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications.
In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training.
Based on our analysis, we bridge the gap and illustrate the theoretical insights for three popular large-batch training techniques.
arXiv Detail & Related papers (2020-02-04T23:03:12Z)