Federated Learning of Large Models at the Edge via Principal Sub-Model
Training
- URL: http://arxiv.org/abs/2208.13141v3
- Date: Tue, 10 Oct 2023 23:04:48 GMT
- Title: Federated Learning of Large Models at the Edge via Principal Sub-Model
Training
- Authors: Yue Niu, Saurav Prakash, Souvik Kundu, Sunwoo Lee, Salman Avestimehr
- Abstract summary: Federated Learning (FL) is emerging as a popular, promising decentralized learning framework that enables collaborative training among clients.
We develop a principal sub-model (PriSM) training methodology to collaboratively train a full large model, while assigning each client a small sub-model that is a probabilistic low-rank approximation to the full server model.
- Score: 22.54297471651581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is emerging as a popular, promising decentralized
learning framework that enables collaborative training among clients without
requiring them to share private data with each other or with a centralized
server. However, because many edge clients lack sufficient computing, memory,
or communication capabilities, federated learning of large models still faces
significant bottlenecks. To keep such weak but crucial clients in the loop,
prior works either consider a heterogeneous-client setting, where clients train
models of different sizes, or offload training to the server. However, the
heterogeneous-client setting still requires some clients to train the full
model, which conflicts with the resource-constrained setting, while offloading
breaks FL's privacy promises by sharing intermediate representations or
labels with the server. To overcome these limitations, in this work, we
formulate a realistic, but much less explored, cross-device FL setting in which
no client can train a full large model, nor is any client willing to share
intermediate information with the remote server. Under such a formulation, we develop a
principal sub-model (PriSM) training methodology to collaboratively train a
full large model, while assigning each client a small sub-model that is a
probabilistic low-rank approximation to the full server model. When creating
sub-models, PriSM first performs a principal kernel analysis in the orthogonal
kernel space to obtain the importance of each kernel. Then, PriSM adopts a novel
importance-aware sampling process to select a subset of kernels (i.e., a kernel
with higher importance is assigned a higher sampling probability). This
sampling process ensures that each sub-model remains a low-rank approximation
to the full model, while all sub-models together achieve nearly full coverage
of the principal kernels.
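
The two steps described above, principal kernel analysis followed by importance-aware sampling, can be illustrated with a short sketch. The snippet below is not taken from the paper: the function name, the use of an SVD of a flattened weight matrix as the orthogonal kernel space, and the choice of normalized singular values as sampling probabilities are simplifying assumptions made here for illustration.

```python
import numpy as np

def sample_sub_model(weight, keep_ratio=0.25, rng=None):
    """Hypothetical sketch of PriSM-style importance-aware kernel sampling.

    `weight` is a 2-D array (out_features x in_features); convolution kernels
    would first be reshaped to 2-D. Kernel importance is assumed here to be
    proportional to the singular values from an SVD (the orthogonal kernel space).
    """
    rng = np.random.default_rng() if rng is None else rng

    # Principal kernel analysis: decompose the weights into orthogonal kernels.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)

    # Importance-aware sampling: a larger singular value gives a kernel a
    # higher probability of being included in this client's sub-model.
    probs = s / s.sum()
    k = max(1, int(keep_ratio * len(s)))
    idx = rng.choice(len(s), size=k, replace=False, p=probs)

    # The resulting sub-model is a low-rank approximation built from the
    # sampled kernels only.
    sub_weight = (u[:, idx] * s[idx]) @ vt[idx, :]
    return idx, sub_weight
```

Because each client draws its own subset, the dominant kernels appear in most sub-models while the remaining kernels are still covered across clients, which matches the coverage property stated in the abstract.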
Related papers
- Multi-Level Additive Modeling for Structured Non-IID Federated Learning [54.53672323071204]
We train models organized in a multi-level structure, called Multi-level Additive Models (MAM), for better knowledge-sharing across heterogeneous clients.
In federated MAM (FeMAM), each client is assigned to at most one model per level and its personalized prediction sums up the outputs of models assigned to it across all levels.
Experiments show that FeMAM surpasses existing clustered FL and personalized FL methods in various non-IID settings.
arXiv Detail & Related papers (2024-05-26T07:54:53Z) - Towards Client Driven Federated Learning [7.528642177161784]
We introduce Client-Driven Federated Learning (CDFL), a novel FL framework that puts clients at the driving role.
In CDFL, each client independently and asynchronously updates its model by uploading the locally trained model to the server and receiving a customized model tailored to its local task.
arXiv Detail & Related papers (2024-05-24T10:17:49Z) - PFSL: Personalized & Fair Split Learning with Data & Label Privacy for
thin clients [0.5144809478361603]
PFSL is a new framework of distributed split learning where a large number of thin clients perform transfer learning in parallel.
We implement a lightweight step of personalization of client models to provide high performance for their respective data distributions.
Our accuracy far exceeds that of current SL algorithms and is very close to that of centralized learning on several real-life benchmarks.
arXiv Detail & Related papers (2023-03-19T10:38:29Z) - Complement Sparsification: Low-Overhead Model Pruning for Federated
Learning [2.0428960719376166]
Federated Learning (FL) is a privacy-preserving distributed deep learning paradigm that involves substantial communication and computation effort.
Existing model pruning/sparsification solutions cannot satisfy the requirements for low bidirectional communication overhead between the server and the clients.
We propose Complement Sparsification (CS), a pruning mechanism that satisfies all these requirements through a complementary and collaborative pruning done at the server and the clients.
arXiv Detail & Related papers (2023-03-10T23:07:02Z) - SplitGP: Achieving Both Generalization and Personalization in Federated
Learning [31.105681433459285]
SplitGP captures generalization and personalization capabilities for efficient inference across resource-constrained clients.
We analytically characterize the convergence behavior of SplitGP, revealing that all client models approach stationary points asymptotically.
Experimental results show that SplitGP outperforms existing baselines by wide margins in inference time and test accuracy for varying amounts of out-of-distribution samples.
arXiv Detail & Related papers (2022-12-16T08:37:24Z) - Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
arXiv Detail & Related papers (2022-11-20T10:49:22Z) - Optimizing Server-side Aggregation For Robust Federated Learning via
Subspace Training [80.03567604524268]
Non-IID data distribution across clients and poisoning attacks are two main challenges in real-world federated learning systems.
We propose SmartFL, a generic approach that optimizes the server-side aggregation process.
We provide theoretical analyses of the convergence and generalization capacity for SmartFL.
arXiv Detail & Related papers (2022-11-10T13:20:56Z) - An Expectation-Maximization Perspective on Federated Learning [75.67515842938299]
Federated learning describes the distributed training of models across multiple clients while keeping the data private on-device.
In this work, we view the server-orchestrated federated learning process as a hierarchical latent variable model where the server provides the parameters of a prior distribution over the client-specific model parameters.
We show that with simple Gaussian priors and a hard version of the well known Expectation-Maximization (EM) algorithm, learning in such a model corresponds to FedAvg, the most popular algorithm for the federated learning setting.
arXiv Detail & Related papers (2021-11-19T12:58:59Z) - Personalized Federated Learning using Hypernetworks [26.329820911200546]
We propose pFedHN for personalized Federated HyperNetworks.
In this approach, a central hypernetwork model is trained to generate a set of models, one model for each client.
We show that pFedHN can generalize better to new clients whose distributions differ from any client observed during training.
arXiv Detail & Related papers (2021-03-08T09:29:08Z) - A Bayesian Federated Learning Framework with Online Laplace
Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model.
We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side.
We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
arXiv Detail & Related papers (2021-02-03T08:36:58Z) - Ensemble Distillation for Robust Model Fusion in Federated Learning [72.61259487233214]
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model.
In most of the current training schemes the central model is refined by averaging the parameters of the server model and the updated parameters from the client side.
We propose ensemble distillation for model fusion, i.e. training the central classifier through unlabeled data on the outputs of the models from the clients.
arXiv Detail & Related papers (2020-06-12T14:49:47Z)
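
The ensemble-distillation entry directly above describes fusing client models by training the central classifier on their outputs over unlabeled data. A minimal sketch of that general idea follows; the averaging of softened client predictions, the KL-divergence objective, the temperature, and the optimizer choice are assumptions made for illustration rather than details taken from that paper.

```python
import torch
import torch.nn.functional as F

def distill_server_model(server_model, client_models, unlabeled_loader,
                         epochs=1, lr=1e-3, temperature=1.0):
    """Sketch of ensemble distillation for model fusion (generic, not the
    cited paper's exact procedure): the server model is trained on unlabeled
    data to match the averaged softened predictions of the client models."""
    optimizer = torch.optim.Adam(server_model.parameters(), lr=lr)
    server_model.train()
    for m in client_models:
        m.eval()

    for _ in range(epochs):
        for x in unlabeled_loader:  # each batch: a tensor of unlabeled inputs
            with torch.no_grad():
                # Ensemble "teacher": average the clients' softened predictions.
                teacher = torch.stack(
                    [F.softmax(m(x) / temperature, dim=-1) for m in client_models]
                ).mean(dim=0)

            # Student: the central model matches the teacher via KL divergence.
            student_log_probs = F.log_softmax(server_model(x) / temperature, dim=-1)
            loss = F.kl_div(student_log_probs, teacher, reduction="batchmean")

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    return server_model
```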