Federated Learning of Large Models at the Edge via Principal Sub-Model
Training
- URL: http://arxiv.org/abs/2208.13141v3
- Date: Tue, 10 Oct 2023 23:04:48 GMT
- Title: Federated Learning of Large Models at the Edge via Principal Sub-Model
Training
- Authors: Yue Niu, Saurav Prakash, Souvik Kundu, Sunwoo Lee, Salman Avestimehr
- Abstract summary: Federated Learning (FL) is emerging as a popular, promising decentralized learning framework that enables collaborative training among clients.
We develop a principal sub-model (PriSM) training methodology to collaboratively train a full large model, while assigning each client a small sub-model that is a probabilistic low-rank approximation to the full server model.
- Score: 22.54297471651581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is emerging as a popular, promising decentralized
learning framework that enables collaborative training among clients without
requiring them to share private data with each other or with a centralized
server. However, because many edge clients lack sufficient computing, memory,
or communication capabilities, federated learning of large models still faces
significant bottlenecks. To keep such weak but crucial clients in the loop,
prior works either consider a heterogeneous-client setting, where clients train
models of different sizes, or offload training to the server. However, the
heterogeneous-client setting still requires some clients to train the full
model, which conflicts with the resource-constrained setting, while offloading
breaks FL's privacy promises by sharing intermediate representations or
labels with the server. To overcome these limitations, in this work, we
formulate a realistic, but much less explored, cross-device FL setting in which
no client can train a full large model, nor is any client willing to share
intermediate information with the remote server. Under such a formulation, we develop a
principal sub-model (PriSM) training methodology to collaboratively train a
full large model, while assigning each client a small sub-model that is a
probabilistic low-rank approximation to the full server model. When creating
sub-models, PriSM first performs a principal kernel analysis in the orthogonal
kernel space to obtain the importance of each kernel. Then, PriSM adopts a novel
importance-aware sampling process to select a subset of kernels (i.e., a kernel
with higher importance is assigned a higher sampling probability). This
sampling process ensures that each sub-model remains a low-rank approximation
to the full model, while all sub-models together achieve nearly full coverage
of the principal kernels.
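
The two steps described above, principal kernel analysis followed by importance-aware sampling, can be illustrated with a short sketch. The snippet below is not taken from the paper: the function name, the use of an SVD of a flattened weight matrix as the orthogonal kernel space, and the choice of normalized singular values as sampling probabilities are simplifying assumptions made here for illustration.

```python
import numpy as np

def sample_sub_model(weight, keep_ratio=0.25, rng=None):
    """Hypothetical sketch of PriSM-style importance-aware kernel sampling.

    `weight` is a 2-D array (out_features x in_features); convolution kernels
    would first be reshaped to 2-D. Kernel importance is assumed here to be
    proportional to the singular values from an SVD (the orthogonal kernel space).
    """
    rng = np.random.default_rng() if rng is None else rng

    # Principal kernel analysis: decompose the weights into orthogonal kernels.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)

    # Importance-aware sampling: a larger singular value gives a kernel a
    # higher probability of being included in this client's sub-model.
    probs = s / s.sum()
    k = max(1, int(keep_ratio * len(s)))
    idx = rng.choice(len(s), size=k, replace=False, p=probs)

    # The resulting sub-model is a low-rank approximation built from the
    # sampled kernels only.
    sub_weight = (u[:, idx] * s[idx]) @ vt[idx, :]
    return idx, sub_weight
```

Because each client draws its own subset, the dominant kernels appear in most sub-models while the remaining kernels are still covered across clients, which matches the coverage property stated in the abstract.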
Related papers
- Multi-Level Additive Modeling for Structured Non-IID Federated Learning [54.53672323071204]
We train models organized in a multi-level structure, called Multi-level Additive Models (MAM), for better knowledge-sharing across heterogeneous clients.
In federated MAM (FeMAM), each client is assigned to at most one model per level and its personalized prediction sums up the outputs of models assigned to it across all levels.
Experiments show that FeMAM surpasses existing clustered FL and personalized FL methods in various non-IID settings.
arXiv Detail & Related papers (2024-05-26T07:54:53Z) - Towards Client Driven Federated Learning [7.528642177161784]
We introduce Client-Driven Federated Learning (CDFL), a novel FL framework that puts clients at the driving role.
In CDFL, each client independently and asynchronously updates its model by uploading the locally trained model to the server and receiving a customized model tailored to its local task.
arXiv Detail & Related papers (2024-05-24T10:17:49Z) - PFSL: Personalized & Fair Split Learning with Data & Label Privacy for
thin clients [0.5144809478361603]
PFSL is a new framework of distributed split learning where a large number of thin clients perform transfer learning in parallel.
We implement a lightweight step of personalization of client models to provide high performance for their respective data distributions.
Our accuracy far exceeds that of current SL algorithms and is very close to that of centralized learning on several real-life benchmarks.
arXiv Detail & Related papers (2023-03-19T10:38:29Z) - Complement Sparsification: Low-Overhead Model Pruning for Federated
Learning [2.0428960719376166]
Federated Learning (FL) is a privacy-preserving distributed deep learning paradigm that involves substantial communication and computation effort.
Existing model pruning/sparsification solutions cannot satisfy the requirements for low bidirectional communication overhead between the server and the clients.
We propose Complement Sparsification (CS), a pruning mechanism that satisfies all these requirements through a complementary and collaborative pruning done at the server and the clients.
arXiv Detail & Related papers (2023-03-10T23:07:02Z) - SplitGP: Achieving Both Generalization and Personalization in Federated
Learning [31.105681433459285]
SplitGP captures generalization and personalization capabilities for efficient inference across resource-constrained clients.
We analytically characterize the convergence behavior of SplitGP, revealing that all client models approach stationary points asymptotically.
Experimental results show that SplitGP outperforms existing baselines by wide margins in inference time and test accuracy for varying amounts of out-of-distribution samples.
arXiv Detail & Related papers (2022-12-16T08:37:24Z) - Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
arXiv Detail & Related papers (2022-11-20T10:49:22Z) - Optimizing Server-side Aggregation For Robust Federated Learning via
Subspace Training [80.03567604524268]
Non-IID data distribution across clients and poisoning attacks are two main challenges in real-world federated learning systems.
We propose SmartFL, a generic approach that optimizes the server-side aggregation process.
We provide theoretical analyses of the convergence and generalization capacity for SmartFL.
arXiv Detail & Related papers (2022-11-10T13:20:56Z) - An Expectation-Maximization Perspective on Federated Learning [75.67515842938299]
Federated learning describes the distributed training of models across multiple clients while keeping the data private on-device.
In this work, we view the server-orchestrated federated learning process as a hierarchical latent variable model where the server provides the parameters of a prior distribution over the client-specific model parameters.
We show that with simple Gaussian priors and a hard version of the well known Expectation-Maximization (EM) algorithm, learning in such a model corresponds to FedAvg, the most popular algorithm for the federated learning setting.
arXiv Detail & Related papers (2021-11-19T12:58:59Z) - Personalized Federated Learning using Hypernetworks [26.329820911200546]
We propose pFedHN for personalized Federated HyperNetworks.
In this approach, a central hypernetwork model is trained to generate a set of models, one model for each client.
We show that pFedHN can generalize better to new clients whose distributions differ from any client observed during training.
arXiv Detail & Related papers (2021-03-08T09:29:08Z) - A Bayesian Federated Learning Framework with Online Laplace
Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model.
We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side.
We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
arXiv Detail & Related papers (2021-02-03T08:36:58Z) - Ensemble Distillation for Robust Model Fusion in Federated Learning [72.61259487233214]
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model.
In most of the current training schemes the central model is refined by averaging the parameters of the server model and the updated parameters from the client side.
We propose ensemble distillation for model fusion, i.e. training the central classifier through unlabeled data on the outputs of the models from the clients.
arXiv Detail & Related papers (2020-06-12T14:49:47Z)
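
The ensemble-distillation entry directly above describes fusing client models by training the central classifier on their outputs over unlabeled data. A minimal sketch of that general idea follows; the averaging of softened client predictions, the KL-divergence objective, the temperature, and the optimizer choice are assumptions made for illustration rather than details taken from that paper.

```python
import torch
import torch.nn.functional as F

def distill_server_model(server_model, client_models, unlabeled_loader,
                         epochs=1, lr=1e-3, temperature=1.0):
    """Sketch of ensemble distillation for model fusion (generic, not the
    cited paper's exact procedure): the server model is trained on unlabeled
    data to match the averaged softened predictions of the client models."""
    optimizer = torch.optim.Adam(server_model.parameters(), lr=lr)
    server_model.train()
    for m in client_models:
        m.eval()

    for _ in range(epochs):
        for x in unlabeled_loader:  # each batch: a tensor of unlabeled inputs
            with torch.no_grad():
                # Ensemble "teacher": average the clients' softened predictions.
                teacher = torch.stack(
                    [F.softmax(m(x) / temperature, dim=-1) for m in client_models]
                ).mean(dim=0)

            # Student: the central model matches the teacher via KL divergence.
            student_log_probs = F.log_softmax(server_model(x) / temperature, dim=-1)
            loss = F.kl_div(student_log_probs, teacher, reduction="batchmean")

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    return server_model
```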