Towards Federated Learning Under Resource Constraints via Layer-wise
Training and Depth Dropout
- URL: http://arxiv.org/abs/2309.05213v1
- Date: Mon, 11 Sep 2023 03:17:45 GMT
- Title: Towards Federated Learning Under Resource Constraints via Layer-wise
Training and Depth Dropout
- Authors: Pengfei Guo, Warren Richard Morningstar, Raviteja Vemulapalli, Karan
Singhal, Vishal M. Patel, Philip Andrew Mansfield
- Abstract summary: Federated learning can be difficult to scale to large models when clients have limited resources.
We introduce Federated Layer-wise Learning to simultaneously reduce per-client memory, computation, and communication costs.
We also introduce Federated Depth Dropout, a complementary technique that randomly drops frozen layers during training, to further reduce resource usage.
- Score: 33.308067180286045
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large machine learning models trained on diverse data have recently seen
unprecedented success. Federated learning enables training on private data that
may otherwise be inaccessible, such as domain-specific datasets decentralized
across many clients. However, federated learning can be difficult to scale to
large models when clients have limited resources. This challenge often results
in a trade-off between model size and access to diverse data. To mitigate this
issue and facilitate training of large models on edge devices, we introduce a
simple yet effective strategy, Federated Layer-wise Learning, to simultaneously
reduce per-client memory, computation, and communication costs. Clients train
just a single layer each round, reducing resource costs considerably with
minimal performance degradation. We also introduce Federated Depth Dropout, a
complementary technique that randomly drops frozen layers during training, to
further reduce resource usage. Coupling these two techniques enables us to
effectively train significantly larger models on edge devices. Specifically, we
reduce training memory usage by 5x or more in federated self-supervised
representation learning and demonstrate that performance in downstream tasks is
comparable to conventional federated self-supervised learning.
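To make the two techniques concrete, below is a minimal sketch of a single client round, assuming a PyTorch-style MLP. Everything specific here (the layer widths, the 0.5 drop rate, the toy data, the inclusion of a small trainable output head, and names such as client_round) is an illustrative assumption rather than the authors' implementation; only the high-level scheme comes from the abstract: the client trains one designated layer per round, frozen layers may be randomly skipped in the forward pass, and only the trained layer (plus head) is communicated back.

```python
# Hypothetical sketch of Federated Layer-wise Learning + Federated Depth Dropout
# on one client, based only on the high-level description in the abstract.
import random
import torch
import torch.nn as nn

def client_round(layers, head, active_idx, data, drop_rate=0.5, lr=0.1, steps=10):
    """One simulated client round: only layers[active_idx] and the head train.

    Frozen layers (index < active_idx) are each skipped with probability
    drop_rate during the forward pass (the depth-dropout idea)."""
    for i, layer in enumerate(layers):
        for p in layer.parameters():
            p.requires_grad = i == active_idx
    opt = torch.optim.SGD(
        list(layers[active_idx].parameters()) + list(head.parameters()), lr=lr)

    x, y = data
    for _ in range(steps):
        # Randomly drop frozen layers; the active layer is always kept.
        kept = [layer for i, layer in enumerate(layers[: active_idx + 1])
                if i == active_idx or random.random() > drop_rate]
        h = x
        with torch.no_grad():               # frozen prefix: no activations stored
            for layer in kept[:-1]:
                h = torch.relu(layer(h))
        h = torch.relu(kept[-1](h))         # active layer: gradients flow here
        loss = nn.functional.cross_entropy(head(h), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Only the newly trained layer and the head would be sent to the server.
    return layers[active_idx].state_dict(), head.state_dict()

# Toy usage: a 4-layer MLP grown one trainable layer per federated round.
layers = nn.ModuleList([nn.Linear(16, 16) for _ in range(4)])
head = nn.Linear(16, 3)
data = (torch.randn(32, 16), torch.randint(0, 3, (32,)))
for rnd in range(len(layers)):
    client_round(layers, head, active_idx=rnd, data=data)
```

Running the frozen prefix without gradient tracking and skipping some frozen layers entirely is what shrinks per-client activation memory and compute, and sending back a single layer per round is what cuts communication, consistent with the savings claimed in the abstract.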
Related papers
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Small Dataset, Big Gains: Enhancing Reinforcement Learning by Offline Pre-Training with Model Based Augmentation [59.899714450049494]
Offline pre-training can produce sub-optimal policies and lead to degraded online reinforcement learning performance.
We propose a model-based data augmentation strategy to maximize the benefits of offline reinforcement learning pre-training and reduce the scale of data needed to be effective.
arXiv Detail & Related papers (2023-12-15T14:49:41Z)
- Toward efficient resource utilization at edge nodes in federated learning [0.6990493129893112]
Federated learning enables edge nodes to collaboratively contribute to constructing a global model without sharing their data.
Computational resource constraints and network communication can become a severe bottleneck for the larger model sizes typical of deep learning applications.
We propose and evaluate an FL strategy inspired by transfer learning in order to reduce resource utilization on devices.
arXiv Detail & Related papers (2023-09-19T07:04:50Z)
- Federated Pruning: Improving Neural Network Efficiency with Federated Learning [24.36174705715827]
We propose Federated Pruning to train a reduced model under the federated setting.
We explore different pruning schemes and provide empirical evidence of the effectiveness of our methods.
arXiv Detail & Related papers (2022-09-14T00:48:37Z)
- No One Left Behind: Inclusive Federated Learning over Heterogeneous Devices [79.16481453598266]
We propose InclusiveFL, a client-inclusive federated learning method to handle device heterogeneity.
The core idea of InclusiveFL is to assign models of different sizes to clients with different computing capabilities.
We also propose an effective method to share the knowledge among multiple local models with different sizes.
arXiv Detail & Related papers (2022-02-16T13:03:27Z)
- ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training [65.68511423300812]
We propose ProgFed, a progressive training framework for efficient and effective federated learning.
ProgFed inherently reduces computation and two-way communication costs while maintaining the strong performance of the final models.
Our results show that ProgFed converges at the same rate as standard training on full models.
arXiv Detail & Related papers (2021-10-11T14:45:00Z)
- Efficient and Private Federated Learning with Partially Trainable Networks [8.813191488656527]
We propose to leverage partially trainable neural networks, which freeze a portion of the model parameters during the entire training process.
We empirically show that Federated learning of Partially Trainable neural networks (FedPT) can result in superior communication-accuracy trade-offs.
Our approach also enables faster training, a smaller memory footprint, and better utility under strong differential privacy guarantees (a minimal parameter-freezing sketch follows after this list).
arXiv Detail & Related papers (2021-10-06T04:28:33Z)
- Federated Few-Shot Learning with Adversarial Learning [30.905239262227]
We propose a federated few-shot learning framework that learns a classification model able to recognize unseen data classes from only a few labeled samples.
We show our approaches outperform baselines by more than 10% on vision tasks and by 5% on language tasks.
arXiv Detail & Related papers (2021-04-01T09:44:57Z)
- Mixed-Privacy Forgetting in Deep Networks [114.3840147070712]
We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in a mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
arXiv Detail & Related papers (2020-12-24T19:34:56Z)
- CosSGD: Nonlinear Quantization for Communication-efficient Federated Learning [62.65937719264881]
Federated learning facilitates learning across clients without transferring their local data to a central server.
We propose a nonlinear quantization scheme for compressed gradient descent that can be easily utilized in federated learning.
Our system significantly reduces the communication cost by up to three orders of magnitude, while maintaining convergence and accuracy of the training process.
arXiv Detail & Related papers (2020-12-15T12:20:28Z)
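As referenced in the FedPT entry above, the following is a minimal parameter-freezing sketch in PyTorch. The toy model, the choice to freeze the leading half of the parameter tensors, and the helper name freeze_fraction are illustrative assumptions; the FedPT paper's actual partitioning of trainable versus frozen parameters may differ.

```python
# Hypothetical sketch of the "partially trainable network" idea: freeze a
# portion of the parameters for the entire training process and only optimize
# (and communicate) the rest.
import torch
import torch.nn as nn

def freeze_fraction(model: nn.Module, frozen_fraction: float = 0.5):
    """Freeze roughly the first `frozen_fraction` of the model's parameter tensors."""
    params = list(model.parameters())
    n_frozen = int(len(params) * frozen_fraction)
    for p in params[:n_frozen]:
        p.requires_grad = False            # frozen for the whole training run
    # Only the remaining parameters need gradients, optimizer state, and upload.
    return [p for p in params if p.requires_grad]

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))
trainable = freeze_fraction(model, frozen_fraction=0.5)
opt = torch.optim.SGD(trainable, lr=0.1)

x, y = torch.randn(8, 16), torch.randint(0, 3, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()
```

Frozen parameters require no gradients, no optimizer state, and no upload to the server, which is where the communication and memory savings described in that entry would come from.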