HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign
Supermask
- URL: http://arxiv.org/abs/2206.04385v1
- Date: Thu, 9 Jun 2022 09:55:31 GMT
- Title: HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign
Supermask
- Authors: Anish K. Vallapuram, Pengyuan Zhou, Young D. Kwon, Lik Hang Lee,
Hengwei Xu and Pan Hui
- Abstract summary: Federated learning alleviates the privacy risk in distributed learning by transmitting only the local model updates to the central server.
Previous works have tackled these challenges by combining personalization with model compression schemes including quantization and pruning.
We propose HideNseek which employs one-shot data-agnostic pruning to get a subnetwork based on weights' synaptic saliency.
- Score: 11.067931264340931
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning alleviates the privacy risk in distributed learning by
transmitting only the local model updates to the central server. However, it
faces challenges including statistical heterogeneity of clients' datasets and
resource constraints of client devices, which severely impact the training
performance and user experience. Prior works have tackled these challenges by
combining personalization with model compression schemes including quantization
and pruning. However, the pruning is data-dependent and thus must be done on
the client side which requires considerable computation cost. Moreover, the
pruning normally trains a binary supermask $\in \{0, 1\}$ which significantly
limits the model capacity yet with no computation benefit. Consequently, the
training requires high computation cost and a long time to converge while the
model performance does not pay off. In this work, we propose HideNseek which
employs one-shot data-agnostic pruning at initialization to get a subnetwork
based on weights' synaptic saliency. Each client then optimizes a sign
supermask $\in \{-1, +1\}$ multiplied by the unpruned weights to allow faster
convergence with the same compression rates as state-of-the-art. Empirical
results from three datasets demonstrate that compared to state-of-the-art,
HideNseek improves inference accuracies by up to 40.6\% while reducing the
communication cost and training time by up to 39.7\% and 46.8\% respectively.
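To make the method's two stages concrete, below is a minimal sketch (not the authors' code) of what they could look like in PyTorch: a SynFlow-style synaptic-saliency score for one-shot, data-agnostic pruning at initialization, and a client-side layer that freezes the unpruned weights and learns only a sign supermask in $\{-1, +1\}$ through real-valued scores with a straight-through estimator. The scoring rule, per-layer top-k threshold, straight-through trick, layer sizes, and sparsity level are illustrative assumptions rather than details taken from the paper.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def synflow_style_masks(model, sparsity, input_shape):
    """One-shot, data-agnostic pruning: score each parameter by its synaptic
    saliency |w * dR/dw|, where R is the output sum of the network run on an
    all-ones input with absolute-valued weights, then keep the
    top-(1 - sparsity) fraction of entries per parameter tensor."""
    signs = {name: p.data.sign() for name, p in model.named_parameters()}
    for p in model.parameters():
        p.data.abs_()                      # data-agnostic: no training data used
    r = model(torch.ones(1, *input_shape)).sum()
    grads = torch.autograd.grad(r, list(model.parameters()))
    masks = {}
    for (name, p), g in zip(model.named_parameters(), grads):
        score = (p.data * g).abs().flatten()
        k = max(int((1.0 - sparsity) * score.numel()), 1)
        threshold = torch.topk(score, k).values.min()
        masks[name] = (score >= threshold).float().view_as(p)
        p.data.mul_(signs[name])           # restore the original signed weights
    return masks


class SignSupermaskLinear(nn.Module):
    """A linear layer whose weights stay frozen; the client learns only a
    sign in {-1, +1} per unpruned weight via real-valued scores."""

    def __init__(self, weight, bias, mask):
        super().__init__()
        self.register_buffer("weight", weight.detach().clone())
        self.register_buffer("bias", bias.detach().clone())
        self.register_buffer("mask", mask)
        # Real-valued scores; only their signs enter the forward pass.
        self.scores = nn.Parameter(torch.randn_like(weight) * 0.01)

    def forward(self, x):
        # Straight-through estimator (an assumption): forward uses
        # sign(scores), backward passes gradients to the real-valued scores.
        sign = torch.sign(self.scores).detach() + self.scores - self.scores.detach()
        return F.linear(x, self.weight * sign * self.mask, self.bias)


# Example wiring with hypothetical layer sizes: the server prunes once at
# initialization, each client then optimizes only the sign scores.
base = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
masks = synflow_style_masks(base, sparsity=0.9, input_shape=(784,))
client_layer = SignSupermaskLinear(base[0].weight, base[0].bias, masks["0.weight"])
```
In a federated deployment, each client would presumably exchange only the sign scores (rather than full weight tensors) with the server, which is consistent with the communication savings reported in the abstract.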
Related papers
- Unity is Power: Semi-Asynchronous Collaborative Training of Large-Scale Models with Structured Pruning in Resource-Limited Clients [21.59433932637253]
In this work, we study how to unleash the potential of massive heterogeneous weak computing power to collaboratively train large-scale models on dispersed datasets.
We propose a novel semi-asynchronous collaborative training framework, namely $Co\text{-}S2P$, with data distribution-aware structured pruning and a cross-block knowledge transfer mechanism.
Experiments demonstrate that $Co\text{-}S2P$ improves accuracy by up to 8.8% and resource utilization by up to 1.2$\times$ compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-10-11T02:17:50Z) - Fair Federated Data Clustering through Personalization: Bridging the Gap between Diverse Data Distributions [2.7905216619150344]
We introduce the idea of personalization in federated clustering. The goal is to balance achieving a lower clustering cost with achieving a uniform cost across clients.
We propose p-FClus, which addresses these goals in a single round of communication between the server and clients.
arXiv Detail & Related papers (2024-07-05T07:10:26Z) - Towards Communication-efficient Federated Learning via Sparse and Aligned Adaptive Optimization [65.85963235502322]
Federated Adam (FedAdam) algorithms suffer from a threefold increase in uplink communication overhead.
We propose a novel sparse FedAdam algorithm called FedAdam-SSM, wherein distributed devices sparsify the updates of local model parameters and moment estimates.
By minimizing the divergence bound between the model trained by FedAdam-SSM and centralized Adam, we optimize the SSM to mitigate the learning performance degradation caused by sparsification error.
arXiv Detail & Related papers (2024-05-28T07:56:49Z) - Effective pruning of web-scale datasets based on complexity of concept
clusters [48.125618324485195]
We present a method for pruning large-scale multimodal datasets for training CLIP-style models on ImageNet.
We find that training on a smaller set of high-quality data can lead to higher performance with significantly lower training costs.
We achieve a new state-of-the-art ImageNet zero-shot accuracy and a competitive average zero-shot accuracy on 38 evaluation tasks.
arXiv Detail & Related papers (2024-01-09T14:32:24Z) - Faster Federated Learning with Decaying Number of Local SGD Steps [23.447883712141422]
In Federated Learning (FL), devices collaboratively train a machine learning model without sharing their private data with a central server or with other clients.
In this work we propose decaying the number of local SGD steps $K$ as training progresses, which can jointly improve the final performance of the FL model and reduce training time.
arXiv Detail & Related papers (2023-05-16T17:36:34Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local
Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training [65.68511423300812]
We propose ProgFed, a progressive training framework for efficient and effective federated learning.
ProgFed inherently reduces computation and two-way communication costs while maintaining the strong performance of the final models.
Our results show that ProgFed converges at the same rate as standard training on full models.
arXiv Detail & Related papers (2021-10-11T14:45:00Z) - Training Recommender Systems at Scale: Communication-Efficient Model and
Data Parallelism [56.78673028601739]
We propose a compression framework called Dynamic Communication Thresholding (DCT) for communication-efficient hybrid training.
DCT reduces communication by at least $100\times$ and $20\times$ during DP and MP, respectively.
It improves end-to-end training time for a state-of-the-art industrial recommender model by 37%, without any loss in performance.
arXiv Detail & Related papers (2020-10-18T01:44:42Z) - Corella: A Private Multi Server Learning Approach based on Correlated
Queries [30.3330177204504]
We propose $\textit{Corella}$ as an alternative approach to protect the privacy of data.
The proposed scheme relies on a cluster of servers, where at most $T \in \mathbb{N}$ of them may collude, each running a learning model.
The variance of the noise is set to be large enough to make the information leakage to any subset of up to $T$ servers information-theoretically negligible.
arXiv Detail & Related papers (2020-03-26T17:44:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.