Sparse Random Networks for Communication-Efficient Federated Learning
- URL: http://arxiv.org/abs/2209.15328v1
- Date: Fri, 30 Sep 2022 09:11:09 GMT
- Title: Sparse Random Networks for Communication-Efficient Federated Learning
- Authors: Berivan Isik, Francesco Pase, Deniz Gunduz, Tsachy Weissman, Michele
Zorzi
- Abstract summary: One main challenge in federated learning is the large communication cost of exchanging weight updates from clients to the server at each round.
We propose a radically different approach that does not update the weights at all.
Instead, our method freezes the weights at their initial \emph{random} values and learns how to sparsify the random network for the best performance.
- Score: 23.614934319624826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One main challenge in federated learning is the large communication cost of
exchanging weight updates from clients to the server at each round. While prior
work has made great progress in compressing the weight updates through gradient
compression methods, we propose a radically different approach that does not
update the weights at all. Instead, our method freezes the weights at their
initial \emph{random} values and learns how to sparsify the random network for
the best performance. To this end, the clients collaborate in training a
\emph{stochastic} binary mask to find the optimal sparse random network within
the original one. At the end of the training, the final model is a sparse
network with random weights -- or a subnetwork inside the dense random network.
We show improvements in accuracy, communication (less than $1$ bit per
parameter (bpp)), convergence speed, and final model size (less than $1$ bpp)
over relevant baselines on MNIST, EMNIST, CIFAR-10, and CIFAR-100 datasets, in
the low bitrate regime under various system configurations.
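As a rough, hedged illustration of the approach described in the abstract, the sketch below shows a single masked layer in PyTorch-style code: the random weights are frozen at initialization, and only per-weight scores that parameterize a stochastic binary mask are trained. The class name `MaskedLinear`, the sigmoid parameterization, and the straight-through gradient are assumptions for illustration, not the authors' exact formulation.

```python
# Minimal sketch (assumed PyTorch-style), not the authors' implementation:
# weights stay at their random init; only stochastic-mask scores are learned.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # Random weights, frozen for the entire training run.
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features) / in_features**0.5,
            requires_grad=False)
        # Learnable scores; sigmoid(scores) gives per-weight keep probabilities.
        self.scores = nn.Parameter(torch.zeros(out_features, in_features))

    def forward(self, x):
        probs = torch.sigmoid(self.scores)
        mask = torch.bernoulli(probs)                 # sample a stochastic binary mask
        mask = mask + probs - probs.detach()          # straight-through gradient to the scores
        return F.linear(x, self.weight * mask)
```

In a federated round, a client would then send only the sampled binary mask (or its probabilities) to the server instead of dense weight updates, which is where the reported sub-1-bit-per-parameter communication cost comes from.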
Related papers
- Stochastic Approximation Approach to Federated Machine Learning [0.0]
This paper examines Federated learning (FL) in a Stochastic Approximation (SA) framework.
FL is a collaborative way to train neural network models across various participants or clients.
It is observed that the proposed algorithm is robust and gives more reliable estimates of the weights.
arXiv Detail & Related papers (2024-02-20T12:00:25Z) - Learning to Compose SuperWeights for Neural Parameter Allocation Search [61.078949532440724]
We show that our approach can generate parameters for many networks using the same set of weights.
This enables us to support tasks like efficient ensembling and anytime prediction.
arXiv Detail & Related papers (2023-12-03T04:20:02Z) - Random Weights Networks Work as Loss Prior Constraint for Image
Restoration [50.80507007507757]
We present our belief that ``Random Weights Networks can act as a Loss Prior Constraint for Image Restoration''.
Our belief can be directly inserted into existing networks without any training and testing computational cost.
To emphasize, our main focus is to spark interest in the design of loss functions and to remedy their currently neglected status.
arXiv Detail & Related papers (2023-03-29T03:43:51Z) - Training Your Sparse Neural Network Better with Any Mask [106.134361318518]
Pruning large neural networks to create high-quality, independently trainable sparse masks is desirable.
In this paper we demonstrate an alternative opportunity: one can customize the sparse training techniques to deviate from the default dense network training protocols.
Our new sparse training recipe is generally applicable to improving training from scratch with various sparse masks.
arXiv Detail & Related papers (2022-06-26T00:37:33Z) - Fast Conditional Network Compression Using Bayesian HyperNetworks [54.06346724244786]
We introduce a conditional compression problem and propose a fast framework for tackling it.
The problem is how to quickly compress a pretrained large neural network into optimal smaller networks given target contexts.
Our methods can quickly generate compressed networks with significantly smaller sizes than baseline methods.
arXiv Detail & Related papers (2022-05-13T00:28:35Z) - An Expectation-Maximization Perspective on Federated Learning [75.67515842938299]
Federated learning describes the distributed training of models across multiple clients while keeping the data private on-device.
In this work, we view the server-orchestrated federated learning process as a hierarchical latent variable model where the server provides the parameters of a prior distribution over the client-specific model parameters.
We show that with simple Gaussian priors and a hard version of the well known Expectation-Maximization (EM) algorithm, learning in such a model corresponds to FedAvg, the most popular algorithm for the federated learning setting.
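To make the stated correspondence concrete, here is a short sketch of the argument under an isotropic Gaussian prior with the hard (MAP) E-step; the notation is assumed for illustration rather than taken from the paper.

```latex
% Hard E-step: each client k fits its local parameters around the server prior \theta
\phi_k \;=\; \arg\max_{\phi}\;\Big[\log p(\mathcal{D}_k \mid \phi) \;-\; \tfrac{1}{2\sigma^2}\,\lVert \phi - \theta \rVert_2^2\Big],
\qquad k = 1,\dots,K
% M-step: the server refits the Gaussian prior mean to the client solutions
\theta^{\text{new}} \;=\; \arg\max_{\theta}\;\sum_{k=1}^{K}\log \mathcal{N}\!\big(\phi_k \mid \theta,\, \sigma^2 I\big)
\;=\; \frac{1}{K}\sum_{k=1}^{K}\phi_k
```

The server-side M-step thus reduces to averaging the client parameters, which matches the FedAvg aggregation step (FedAvg additionally weights clients by their dataset sizes).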
arXiv Detail & Related papers (2021-11-19T12:58:59Z) - Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural
Networks by Pruning A Randomly Weighted Network [13.193734014710582]
We propose an algorithm for finding multi-prize tickets (MPTs) and test it by performing a series of experiments on CIFAR-10 and ImageNet datasets.
Our MPTs-1/32 not only set new binary weight network state-of-the-art (SOTA) Top-1 accuracy -- 94.8% on CIFAR-10 and 74.03% on ImageNet -- but also outperform their full-precision counterparts by 1.78% and 0.76%, respectively.
arXiv Detail & Related papers (2021-03-17T00:31:24Z) - Slot Machines: Discovering Winning Combinations of Random Weights in
Neural Networks [40.43730385915566]
We show the existence of effective random networks whose weights are never updated.
We refer to our networks as "slot machines", where each reel (connection) contains a fixed set of symbols (random values).
We find that allocating just a few random values to each connection yields highly competitive combinations.
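As a toy, assumption-laden sketch of the mechanism summarized above (not the authors' implementation): each connection stores K fixed random candidate values plus a learnable quality score per candidate, and the forward pass selects the highest-scoring candidate with a straight-through gradient for the scores.

```python
# Sketch (assumed PyTorch-style): K fixed random "symbols" per connection, learned selection scores.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlotMachineLinear(nn.Module):
    def __init__(self, in_features, out_features, k=8):
        super().__init__()
        # K fixed random candidate values per connection -- never updated.
        self.candidates = nn.Parameter(
            torch.randn(out_features, in_features, k) / in_features**0.5,
            requires_grad=False)
        # One learnable score per candidate value.
        self.scores = nn.Parameter(0.01 * torch.randn(out_features, in_features, k))

    def forward(self, x):
        soft = torch.softmax(self.scores, dim=-1)
        hard = F.one_hot(soft.argmax(dim=-1), soft.shape[-1]).float()
        sel = hard + soft - soft.detach()             # hard selection, soft gradient
        weight = (self.candidates * sel).sum(dim=-1)  # pick one symbol per connection
        return F.linear(x, weight)
```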
arXiv Detail & Related papers (2021-01-16T16:56:48Z) - Training Sparse Neural Networks using Compressed Sensing [13.84396596420605]
We develop and test a novel method based on compressed sensing which combines the pruning and training into a single step.
Specifically, we utilize an adaptively weighted $\ell_1$ penalty on the weights during training, which we combine with a generalization of the regularized dual averaging (RDA) algorithm in order to train sparse neural networks.
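For context, a plain (non-adaptive) $\ell_1$-regularized dual averaging step has the closed form sketched below; the paper's adaptive per-weight penalty generalizes this, and the code here is an illustrative assumption rather than the authors' algorithm.

```python
# Sketch of one standard l1-regularized dual averaging (RDA) step.
import numpy as np

def rda_l1_step(grad_sum, t, lam, gamma):
    """grad_sum: running sum of (sub)gradients g_1 + ... + g_t;
    t: iteration counter (t >= 1); lam: l1 penalty weight; gamma: step-size constant."""
    g_bar = grad_sum / t                              # dual average of the gradients
    shrink = np.maximum(np.abs(g_bar) - lam, 0.0)     # soft thresholding -> exact zeros
    # Closed-form minimizer of <g_bar, w> + lam*||w||_1 + (gamma / (2*sqrt(t))) * ||w||^2
    return -(np.sqrt(t) / gamma) * np.sign(g_bar) * shrink
```

The soft-thresholding step is what drives many weights exactly to zero, so sparsification happens during training rather than as a separate post-processing step.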
arXiv Detail & Related papers (2020-08-21T19:35:54Z) - Training highly effective connectivities within neural networks with
randomly initialized, fixed weights [4.56877715768796]
We introduce a novel way of training a network by flipping the signs of the weights.
We obtain good results even when the weights have constant magnitude, or when they are drawn from highly asymmetric distributions.
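A rough sketch of the sign-flipping idea described above, with details assumed for illustration (fixed magnitudes, a latent real-valued parameter per weight, and a straight-through gradient through the sign function):

```python
# Sketch (assumed PyTorch-style): fixed random magnitudes, learnable signs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SignFlipLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # Fixed, constant weight magnitudes -- never updated.
        self.magnitude = nn.Parameter(
            torch.full((out_features, in_features), 1.0 / in_features**0.5),
            requires_grad=False)
        # Latent real-valued parameters; only their signs enter the forward pass.
        self.latent = nn.Parameter(0.1 * torch.randn(out_features, in_features))

    def forward(self, x):
        sign = torch.sign(self.latent)
        sign = sign + self.latent - self.latent.detach()   # straight-through gradient
        return F.linear(x, self.magnitude * sign)
```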
arXiv Detail & Related papers (2020-06-30T09:41:18Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
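As a simplified, hedged illustration of layer-wise fusion for a single hidden layer of two MLPs, the sketch below aligns neurons with a hard assignment (a special case of optimal transport) before averaging; the matching cost and function signature are assumptions for illustration.

```python
# Sketch: align hidden neurons of model B to model A via an assignment problem,
# then average the aligned weights (a hard-matching special case of OT fusion).
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_hidden_layer(W1_a, W2_a, W1_b, W2_b):
    """W1_*: (hidden, in) first-layer weights; W2_*: (out, hidden) second-layer weights."""
    # Cost of matching neuron i of A with neuron j of B: distance between incoming weights.
    cost = np.linalg.norm(W1_a[:, None, :] - W1_b[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)   # rows come back as 0..hidden-1
    W1_b_aligned = W1_b[cols, :]               # reorder B's neurons to A's ordering
    W2_b_aligned = W2_b[:, cols]               # permute the outgoing weights consistently
    return 0.5 * (W1_a + W1_b_aligned), 0.5 * (W2_a + W2_b_aligned)
```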
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.