Complement Sparsification: Low-Overhead Model Pruning for Federated Learning
- URL: http://arxiv.org/abs/2303.06237v1
- Date: Fri, 10 Mar 2023 23:07:02 GMT
- Title: Complement Sparsification: Low-Overhead Model Pruning for Federated Learning
- Authors: Xiaopeng Jiang, Cristian Borcea
- Abstract summary: Federated Learning (FL) is a privacy-preserving distributed deep learning paradigm that involves substantial communication and computation effort.
Existing model pruning/sparsification solutions cannot simultaneously satisfy the requirements for low bidirectional communication overhead between the server and the clients, low client computation overhead, and good model accuracy.
We propose Complement Sparsification (CS), a pruning mechanism that satisfies all these requirements through a complementary and collaborative pruning done at the server and the clients.
- Score: 2.0428960719376166
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated Learning (FL) is a privacy-preserving distributed deep learning
paradigm that involves substantial communication and computation effort, which
is a problem for resource-constrained mobile and IoT devices. Model
pruning/sparsification develops sparse models that could solve this problem,
but existing sparsification solutions cannot simultaneously satisfy the
requirements for low bidirectional communication overhead between the server
and the clients, low computation overhead at the clients, and good model
accuracy, under the FL assumption that the server does not have access to raw
data to fine-tune the pruned models. We propose Complement Sparsification (CS),
a pruning mechanism that satisfies all these requirements through a
complementary and collaborative pruning done at the server and the clients. At
each round, CS creates a global sparse model that contains the weights that
capture the general data distribution of all clients, while the clients create
local sparse models with the weights pruned from the global model to capture
the local trends. For improved model performance, these two types of
complementary sparse models are aggregated into a dense model in each round,
which is subsequently pruned in an iterative process. CS requires little
computation overhead on top of vanilla FL for both the server and the
clients. We demonstrate that CS is an approximation of vanilla FL and, thus,
its models perform well. We evaluate CS experimentally with two popular FL
benchmark datasets. CS achieves substantial reduction in bidirectional
communication, while achieving performance comparable with vanilla FL. In
addition, CS outperforms baseline pruning mechanisms for FL.
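The round structure described in the abstract can be illustrated with a minimal Python sketch. The magnitude-based pruning criterion, the fixed sparsity level, the placeholder local training step, and the FedAvg-style averaging of the client complements are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def magnitude_mask(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Boolean mask keeping the largest-magnitude (1 - sparsity) fraction of weights."""
    k = max(1, int(weights.size * (1.0 - sparsity)))
    threshold = np.sort(np.abs(weights).ravel())[::-1][k - 1]
    return np.abs(weights) >= threshold

def local_train(weights: np.ndarray, client_data) -> np.ndarray:
    """Placeholder for a client's local training pass; returns updated weights."""
    return weights + 0.01 * np.random.randn(*weights.shape)

def cs_round(dense_global: np.ndarray, clients_data, sparsity: float = 0.8) -> np.ndarray:
    # 1. Server prunes the dense model into a sparse global model whose weights
    #    capture the general data distribution of all clients.
    server_mask = magnitude_mask(dense_global, sparsity)
    sparse_global = dense_global * server_mask

    # 2. Each client trains the sparse global model on its own data and keeps only
    #    the complement: weights at positions the server pruned, capturing local trends.
    client_complements = []
    for data in clients_data:
        updated = local_train(sparse_global, data)
        client_complements.append(updated * ~server_mask)

    # 3. The two types of complementary sparse models are aggregated into a dense
    #    model, which is pruned again in the next round (iterative process).
    return sparse_global + np.mean(client_complements, axis=0)

if __name__ == "__main__":
    w = np.random.randn(1_000)
    for _ in range(3):          # a few iterative CS rounds
        w = cs_round(w, clients_data=[None] * 5)
```

In each round the server mask and its complement partition the weights, so only the sparse global model travels downstream and only the sparse complements travel upstream, which is what yields the reduction in bidirectional communication.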
Related papers
- Efficient Model Compression for Hierarchical Federated Learning [10.37403547348343]
Federated learning (FL) has garnered significant attention due to its capacity to preserve privacy within distributed learning systems.
This paper introduces a novel hierarchical FL framework that integrates the benefits of clustered FL and model compression.
arXiv Detail & Related papers (2024-05-27T12:17:47Z) - Towards Client Driven Federated Learning [7.528642177161784]
We introduce Client-Driven Federated Learning (CDFL), a novel FL framework that puts clients in the driving role.
In CDFL, each client independently and asynchronously updates its model by uploading the locally trained model to the server and receiving a customized model tailored to its local task.
arXiv Detail & Related papers (2024-05-24T10:17:49Z)
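A minimal sketch of such a client-driven, asynchronous exchange is given below; the interpolation used as the "customized model" step is an assumption for illustration, since this snippet does not specify CDFL's actual customization rule:

```python
import numpy as np

class Server:
    """Toy server for a client-driven, asynchronous upload/customize exchange."""

    def __init__(self, global_weights: np.ndarray):
        self.global_weights = global_weights

    def handle_upload(self, client_weights: np.ndarray, mix: float = 0.5) -> np.ndarray:
        # Fold the client's update into the global model asynchronously...
        self.global_weights = 0.9 * self.global_weights + 0.1 * client_weights
        # ...and send back a model biased toward the client's own weights
        # (stand-in for the task-specific customization).
        return mix * self.global_weights + (1.0 - mix) * client_weights

if __name__ == "__main__":
    server = Server(global_weights=np.zeros(4))
    # Clients act independently, in whatever order they finish local training.
    for local_weights in [np.ones(4), -np.ones(4), 2 * np.ones(4)]:
        customized = server.handle_upload(local_weights)
        print(customized)
```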
arXiv Detail & Related papers (2024-05-24T10:17:49Z) - An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z) - Enhancing One-Shot Federated Learning Through Data and Ensemble
Co-Boosting [76.64235084279292]
One-shot Federated Learning (OFL) has become a promising learning paradigm, enabling the training of a global server model via a single communication round.
We introduce a novel framework, Co-Boosting, in which synthesized data and the ensemble model mutually enhance each other progressively.
arXiv Detail & Related papers (2024-02-23T03:15:10Z) - Fed-CVLC: Compressing Federated Learning Communications with
Variable-Length Codes [54.18186259484828]
In the Federated Learning (FL) paradigm, a parameter server (PS) concurrently communicates with distributed participating clients for model collection, update aggregation, and model distribution over multiple rounds.
We show strong evidence that variable-length coding is beneficial for compression in FL.
We present Fed-CVLC (Federated Learning Compression with Variable-Length Codes), which fine-tunes the code length in response to the dynamics of model updates.
arXiv Detail & Related papers (2024-02-06T07:25:21Z)
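To illustrate why variable-length codes help, the hedged sketch below quantizes a model update and encodes it with a simple Elias gamma code, so the (typically frequent) small magnitudes get short codewords; Fed-CVLC's actual code construction and its per-round code-length tuning are not shown:

```python
import numpy as np

def elias_gamma(n: int) -> str:
    """Elias gamma codeword for a positive integer n, as a bit string."""
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary

def encode_update(update: np.ndarray, step: float = 0.01) -> str:
    """Uniformly quantize an update and concatenate variable-length codewords."""
    levels = np.round(update / step).astype(int)
    bits = []
    for v in levels:
        sign = "0" if v >= 0 else "1"
        bits.append(sign + elias_gamma(abs(int(v)) + 1))  # +1 so zero is encodable
    return "".join(bits)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    update = rng.normal(scale=0.02, size=10_000)   # small-magnitude model update
    compressed_bits = len(encode_update(update))
    print(f"{compressed_bits / update.size:.2f} bits/weight vs. 32 bits/weight uncompressed")
```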
arXiv Detail & Related papers (2024-02-06T07:25:21Z) - Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z) - FedCliP: Federated Learning with Client Pruning [3.796320380104124]
Federated learning (FL) is a newly emerging distributed learning paradigm.
One fundamental bottleneck in FL is the heavy communication overheads between the distributed clients and the central server.
We propose FedCliP, the first communication efficient FL training framework from a macro perspective.
arXiv Detail & Related papers (2023-01-17T09:15:37Z) - Latency Aware Semi-synchronous Client Selection and Model Aggregation
for Wireless Federated Learning [0.6882042556551609]
Federated learning (FL) is a collaborative machine learning framework that requires different clients (e.g., Internet of Things devices) to participate in the machine learning model training process.
The traditional FL process may suffer from the straggler problem in heterogeneous client settings.
We propose a latency-aware semi-synchronous client selection and model aggregation for federated learning (LESSON) method that allows all the clients to participate in the whole FL process, but with different frequencies.
arXiv Detail & Related papers (2022-10-19T05:59:22Z)
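A minimal sketch of latency-dependent participation frequencies is shown below; grouping clients into tiers by a fixed deadline is an illustrative assumption rather than LESSON's actual selection and aggregation rules:

```python
# Clients are grouped into latency tiers; slower tiers are polled less often,
# so every client still participates in the whole FL process, just less frequently.

def participation_schedule(latencies_ms, num_rounds, deadline_ms=1000):
    """Return, per round, the clients whose latency tier participates that round."""
    # Tier 0 fits within one deadline, tier 1 needs two deadlines, and so on.
    tiers = {cid: int(lat // deadline_ms) for cid, lat in latencies_ms.items()}
    schedule = []
    for r in range(num_rounds):
        # A tier-k client joins every (k + 1)-th round.
        selected = [cid for cid, k in tiers.items() if r % (k + 1) == 0]
        schedule.append(selected)
    return schedule

if __name__ == "__main__":
    latencies = {"phone": 300, "laptop": 800, "gateway": 1500, "sensor": 2400}
    for r, clients in enumerate(participation_schedule(latencies, num_rounds=4)):
        print(f"round {r}: {clients}")
```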
arXiv Detail & Related papers (2022-10-19T05:59:22Z) - Federated Learning of Large Models at the Edge via Principal Sub-Model
Training [22.54297471651581]
Federated Learning (FL) is emerging as a popular, promising decentralized learning framework that enables collaborative training among clients.
We develop a principal sub-model (PriSM) training methodology to collaboratively train a full large model, while assigning each client a small sub-model that is a probabilistic low-rank approximation to the full server model.
arXiv Detail & Related papers (2022-08-28T05:17:03Z)
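The idea of a probabilistic low-rank sub-model can be sketched with an SVD of a server weight matrix and sampling of principal directions, as below; the sampling distribution and the way PriSM merges sub-models back into the full model are assumptions for illustration:

```python
import numpy as np

def sample_sub_model(W: np.ndarray, keep_ratio: float, rng=None) -> np.ndarray:
    """Sample a low-rank sub-model of W, biased toward large singular values."""
    rng = np.random.default_rng() if rng is None else rng
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    k = max(1, int(keep_ratio * s.size))
    probs = s / s.sum()                      # favor the principal directions
    idx = rng.choice(s.size, size=k, replace=False, p=probs)
    # The client trains only these k rank-1 components instead of the full matrix.
    return U[:, idx] @ np.diag(s[idx]) @ Vt[idx, :]

if __name__ == "__main__":
    W = np.random.randn(256, 128)
    sub = sample_sub_model(W, keep_ratio=0.25)
    print(sub.shape, np.linalg.matrix_rank(sub))   # same shape, rank about 32
```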
arXiv Detail & Related papers (2022-08-28T05:17:03Z) - A Bayesian Federated Learning Framework with Online Laplace
Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model.
We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side.
We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
arXiv Detail & Related papers (2021-02-03T08:36:58Z)
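A toy sketch of the posterior-fusion step under a diagonal Laplace (Gaussian) approximation is given below; the diagonal precision estimates and the simple product-of-Gaussians fusion are illustrative assumptions, not the paper's online update:

```python
import numpy as np

def fuse_gaussian_posteriors(means, precisions):
    """Product of diagonal Gaussians: summed precision and precision-weighted mean."""
    precisions = np.asarray(precisions)
    means = np.asarray(means)
    fused_precision = precisions.sum(axis=0)
    fused_mean = (precisions * means).sum(axis=0) / fused_precision
    return fused_mean, fused_precision

if __name__ == "__main__":
    # Two clients, each reporting a Laplace-approximated posterior over 3 parameters.
    means = [np.array([0.2, -1.0, 0.5]), np.array([0.4, -0.8, 0.1])]
    precisions = [np.array([10.0, 2.0, 5.0]), np.array([1.0, 8.0, 5.0])]
    mu, lam = fuse_gaussian_posteriors(means, precisions)
    print("fused mean:", mu, "fused precision:", lam)
```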
arXiv Detail & Related papers (2021-02-03T08:36:58Z) - Ensemble Distillation for Robust Model Fusion in Federated Learning [72.61259487233214]
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model.
In most current training schemes, the central model is refined by averaging the parameters of the server model and the updated parameters from the client side.
We propose ensemble distillation for model fusion, i.e., training the central classifier on unlabeled data using the outputs of the client models.
arXiv Detail & Related papers (2020-06-12T14:49:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.