HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign
Supermask
- URL: http://arxiv.org/abs/2206.04385v1
- Date: Thu, 9 Jun 2022 09:55:31 GMT
- Title: HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign
Supermask
- Authors: Anish K. Vallapuram, Pengyuan Zhou, Young D. Kwon, Lik Hang Lee,
Hengwei Xu and Pan Hui
- Abstract summary: Federated learning alleviates the privacy risk in distributed learning by transmitting only the local model updates to the central server.
Previous works have tackled these challenges by combining personalization with model compression schemes including quantization and pruning.
We propose HideNseek which employs one-shot data-agnostic pruning to get a subnetwork based on weights' synaptic saliency.
- Score: 11.067931264340931
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning alleviates the privacy risk in distributed learning by
transmitting only the local model updates to the central server. However, it
faces challenges including statistical heterogeneity of clients' datasets and
resource constraints of client devices, which severely impact the training
performance and user experience. Prior works have tackled these challenges by
combining personalization with model compression schemes including quantization
and pruning. However, the pruning is data-dependent and thus must be done on
the client side which requires considerable computation cost. Moreover, the
pruning normally trains a binary supermask $\in \{0, 1\}$ which significantly
limits the model capacity yet with no computation benefit. Consequently, the
training requires high computation cost and a long time to converge while the
model performance does not pay off. In this work, we propose HideNseek which
employs one-shot data-agnostic pruning at initialization to get a subnetwork
based on weights' synaptic saliency. Each client then optimizes a sign
supermask $\in \{-1, +1\}$ multiplied by the unpruned weights to allow faster
convergence with the same compression rates as state-of-the-art. Empirical
results from three datasets demonstrate that compared to state-of-the-art,
HideNseek improves inference accuracies by up to 40.6\% while reducing the
communication cost and training time by up to 39.7\% and 46.8\% respectively.
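To make the method's two stages concrete, below is a minimal sketch (not the authors' code) of what they could look like in PyTorch: a SynFlow-style synaptic-saliency score for one-shot, data-agnostic pruning at initialization, and a client-side layer that freezes the unpruned weights and learns only a sign supermask in $\{-1, +1\}$ through real-valued scores with a straight-through estimator. The scoring rule, per-layer top-k threshold, straight-through trick, layer sizes, and sparsity level are illustrative assumptions rather than details taken from the paper.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def synflow_style_masks(model, sparsity, input_shape):
    """One-shot, data-agnostic pruning: score each parameter by its synaptic
    saliency |w * dR/dw|, where R is the output sum of the network run on an
    all-ones input with absolute-valued weights, then keep the
    top-(1 - sparsity) fraction of entries per parameter tensor."""
    signs = {name: p.data.sign() for name, p in model.named_parameters()}
    for p in model.parameters():
        p.data.abs_()                      # data-agnostic: no training data used
    r = model(torch.ones(1, *input_shape)).sum()
    grads = torch.autograd.grad(r, list(model.parameters()))
    masks = {}
    for (name, p), g in zip(model.named_parameters(), grads):
        score = (p.data * g).abs().flatten()
        k = max(int((1.0 - sparsity) * score.numel()), 1)
        threshold = torch.topk(score, k).values.min()
        masks[name] = (score >= threshold).float().view_as(p)
        p.data.mul_(signs[name])           # restore the original signed weights
    return masks


class SignSupermaskLinear(nn.Module):
    """A linear layer whose weights stay frozen; the client learns only a
    sign in {-1, +1} per unpruned weight via real-valued scores."""

    def __init__(self, weight, bias, mask):
        super().__init__()
        self.register_buffer("weight", weight.detach().clone())
        self.register_buffer("bias", bias.detach().clone())
        self.register_buffer("mask", mask)
        # Real-valued scores; only their signs enter the forward pass.
        self.scores = nn.Parameter(torch.randn_like(weight) * 0.01)

    def forward(self, x):
        # Straight-through estimator (an assumption): forward uses
        # sign(scores), backward passes gradients to the real-valued scores.
        sign = torch.sign(self.scores).detach() + self.scores - self.scores.detach()
        return F.linear(x, self.weight * sign * self.mask, self.bias)


# Example wiring with hypothetical layer sizes: the server prunes once at
# initialization, each client then optimizes only the sign scores.
base = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
masks = synflow_style_masks(base, sparsity=0.9, input_shape=(784,))
client_layer = SignSupermaskLinear(base[0].weight, base[0].bias, masks["0.weight"])
```
In a federated deployment, each client would presumably exchange only the sign scores (rather than full weight tensors) with the server, which is consistent with the communication savings reported in the abstract.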
Related papers
- Unity is Power: Semi-Asynchronous Collaborative Training of Large-Scale Models with Structured Pruning in Resource-Limited Clients [21.59433932637253]
In this work, we study how to unleash the potential of massive heterogeneous weak computing power to collaboratively train large-scale models on dispersed datasets.
We propose a novel semi-asynchronous collaborative training framework, namely $Co\text{-}S2P$, with data distribution-aware structured pruning and a cross-block knowledge transfer mechanism.
Experiments demonstrate that $Co\text{-}S2P$ improves accuracy by up to 8.8% and resource utilization by up to 1.2$\times$ compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-10-11T02:17:50Z) - Fair Federated Data Clustering through Personalization: Bridging the Gap between Diverse Data Distributions [2.7905216619150344]
We introduce the idea of personalization in federated clustering. The goal is to balance achieving a lower clustering cost with achieving a uniform cost across clients.
We propose p-FClus, which addresses these goals in a single round of communication between the server and clients.
arXiv Detail & Related papers (2024-07-05T07:10:26Z) - Towards Communication-efficient Federated Learning via Sparse and Aligned Adaptive Optimization [65.85963235502322]
Federated Adam (FedAdam) algorithms suffer from a threefold increase in uplink communication overhead.
We propose a novel sparse FedAdam algorithm called FedAdam-SSM, wherein distributed devices sparsify the updates of local model parameters and moment estimates.
By minimizing the divergence bound between the model trained by FedAdam-SSM and centralized Adam, we optimize the SSM to mitigate the learning performance degradation caused by sparsification error.
arXiv Detail & Related papers (2024-05-28T07:56:49Z) - Effective pruning of web-scale datasets based on complexity of concept
clusters [48.125618324485195]
We present a method for pruning large-scale multimodal datasets for training CLIP-style models on ImageNet.
We find that training on a smaller set of high-quality data can lead to higher performance with significantly lower training costs.
We achieve a new state-of-the-art ImageNet zero-shot accuracy and a competitive average zero-shot accuracy on 38 evaluation tasks.
arXiv Detail & Related papers (2024-01-09T14:32:24Z) - Faster Federated Learning with Decaying Number of Local SGD Steps [23.447883712141422]
In Federated Learning (FL), devices collaboratively train a machine learning model without sharing their private data with a central server or with other clients.
In this work we propose decaying the number of local SGD steps $K$ as training progresses, which can jointly improve the final performance of the FL model and reduce training time.
arXiv Detail & Related papers (2023-05-16T17:36:34Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local
Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training [65.68511423300812]
We propose ProgFed, a progressive training framework for efficient and effective federated learning.
ProgFed inherently reduces computation and two-way communication costs while maintaining the strong performance of the final models.
Our results show that ProgFed converges at the same rate as standard training on full models.
arXiv Detail & Related papers (2021-10-11T14:45:00Z) - Training Recommender Systems at Scale: Communication-Efficient Model and
Data Parallelism [56.78673028601739]
We propose a compression framework called Dynamic Communication Thresholding (DCT) for communication-efficient hybrid training.
DCT reduces communication by at least $100\times$ and $20\times$ during DP and MP, respectively.
It improves end-to-end training time for a state-of-the-art industrial recommender model by 37%, without any loss in performance.
arXiv Detail & Related papers (2020-10-18T01:44:42Z) - Corella: A Private Multi Server Learning Approach based on Correlated
Queries [30.3330177204504]
We propose $\textit{Corella}$ as an alternative approach to protect the privacy of data.
The proposed scheme relies on a cluster of servers, where at most $T \in \mathbb{N}$ of them may collude, each running a learning model.
The variance of the noise is set to be large enough to make the information leakage to any subset of up to $T$ servers information-theoretically negligible.
arXiv Detail & Related papers (2020-03-26T17:44:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.