Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks
- URL: http://arxiv.org/abs/2101.06475v1
- Date: Sat, 16 Jan 2021 16:56:48 GMT
- Title: Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks
- Authors: Maxwell Mbabilla Aladago and Lorenzo Torresani
- Abstract summary: We show the existence of effective random networks whose weights are never updated.
We refer to our networks as "slot machines", where each reel (connection) contains a fixed set of symbols (random values).
We find that allocating just a few random values to each connection yields highly competitive combinations.
- Score: 40.43730385915566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In contrast to traditional weight optimization in a continuous space, we
demonstrate the existence of effective random networks whose weights are never
updated. By selecting a weight among a fixed set of random values for each
individual connection, our method uncovers combinations of random weights that
match the performance of traditionally-trained networks of the same capacity.
We refer to our networks as "slot machines" where each reel (connection)
contains a fixed set of symbols (random values). Our backpropagation algorithm
"spins" the reels to seek "winning" combinations, i.e., selections of random
weight values that minimize the given loss. Quite surprisingly, we find that
allocating just a few random values to each connection (e.g., 8 values per
connection) yields highly competitive combinations despite being dramatically
more constrained compared to traditionally learned weights. Moreover,
finetuning these combinations often improves performance over the trained
baselines. A randomly initialized VGG-19 with 8 values per connection contains
a combination that achieves 90% test accuracy on CIFAR-10. Our method also
achieves an impressive performance of 98.1% on MNIST for neural networks
containing only random weights.
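As an illustration of the selection mechanism described above, the sketch below is a minimal PyTorch-style re-implementation, not the authors' released code: each connection in a linear layer holds a fixed reel of frozen random candidate values, only a quality score per candidate is trained, and the highest-scoring candidate is selected in the forward pass with a straight-through estimator so that gradients reach the scores. The names SlotLinear and num_options are illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SlotLinear(nn.Module):
        """Linear layer whose weights are picked from fixed random candidates.

        Illustrative sketch: each connection stores `num_options` frozen random
        values ("symbols"); only the per-candidate scores are trained. The
        forward pass selects the highest-scoring value per connection, and a
        straight-through estimator routes gradients to the scores.
        """

        def __init__(self, in_features, out_features, num_options=8):
            super().__init__()
            # Fixed reel of candidate weights: never updated.
            self.candidates = nn.Parameter(
                torch.randn(out_features, in_features, num_options) * 0.1,
                requires_grad=False,
            )
            # Learned quality score for every candidate value.
            self.scores = nn.Parameter(
                torch.randn(out_features, in_features, num_options) * 0.01
            )

        def forward(self, x):
            # Hard selection: one-hot over the candidate axis.
            hard = F.one_hot(self.scores.argmax(dim=-1), self.scores.size(-1)).to(x.dtype)
            # Straight-through: forward uses the hard one-hot, backward sees the softmax.
            soft = F.softmax(self.scores, dim=-1)
            selection = hard + soft - soft.detach()
            weight = (self.candidates * selection).sum(dim=-1)
            return F.linear(x, weight)

    # Usage: the only trainable tensors are the selection scores.
    layer = SlotLinear(784, 10, num_options=8)
    logits = layer(torch.randn(32, 784))

Finetuning, as mentioned in the abstract, would then correspond to unfreezing the selected weight values and continuing training in the usual continuous fashion.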
Related papers
- Learning to Compose SuperWeights for Neural Parameter Allocation Search [61.078949532440724]
We show that our approach can generate parameters for many networks using the same set of weights.
This enables us to support tasks like efficient ensembling and anytime prediction.
arXiv Detail & Related papers (2023-12-03T04:20:02Z)
- Sparse Random Networks for Communication-Efficient Federated Learning [23.614934319624826]
One main challenge in federated learning is the large communication cost of exchanging weight updates from clients to the server at each round.
We propose a radically different approach that does not update the weights at all.
Instead, our method freezes the weights at their initial random values and learns how to sparsify the random network for the best performance.
arXiv Detail & Related papers (2022-09-30T09:11:09Z)
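A minimal sketch of the freeze-and-sparsify idea above, assuming a per-weight score trained with a straight-through estimator (the module name MaskedLinear and the keep_ratio argument are illustrative, not from the paper): the random weights never change and only the mask scores are learned, so in a federated setting only the resulting binary mask would need to be exchanged.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MaskedLinear(nn.Module):
        """Frozen random weights with a trainable sparsification mask (sketch)."""

        def __init__(self, in_features, out_features, keep_ratio=0.5):
            super().__init__()
            # Weights stay at their initial random values.
            self.weight = nn.Parameter(
                torch.randn(out_features, in_features) * 0.05, requires_grad=False
            )
            # Only these scores are optimized.
            self.scores = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
            self.keep_ratio = keep_ratio

        def forward(self, x):
            # Keep the top `keep_ratio` fraction of scores as a binary mask.
            k = int(self.scores.numel() * self.keep_ratio)
            threshold = self.scores.flatten().kthvalue(self.scores.numel() - k + 1).values
            hard_mask = (self.scores >= threshold).to(x.dtype)
            # Straight-through: binary mask forward, identity gradient into the scores.
            mask = hard_mask + self.scores - self.scores.detach()
            return F.linear(x, self.weight * mask)

    # Usage: a client would train the scores locally and transmit only the mask bits.
    layer = MaskedLinear(784, 10, keep_ratio=0.5)
    out = layer(torch.randn(4, 784))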
- Dual Lottery Ticket Hypothesis [71.95937879869334]
The Lottery Ticket Hypothesis (LTH) provides a novel view for investigating sparse network training while maintaining network capacity.
In this work, we regard the winning ticket from LTH as the subnetwork that is in a trainable condition, and we take its performance as our benchmark.
We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our Dual Lottery Ticket Hypothesis (DLTH).
arXiv Detail & Related papers (2022-03-08T18:06:26Z)
- Bit-wise Training of Neural Network Weights [4.56877715768796]
We introduce an algorithm where the individual bits representing the weights of a neural network are learned.
This method allows training weights with integer values on arbitrary bit-depths and naturally uncovers sparse networks.
For fully connected networks, we show better results than the standard training technique, and we obtain similar performance to standard training for convolutional and residual networks.
arXiv Detail & Related papers (2022-02-19T10:46:54Z)
- Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network [13.193734014710582]
We propose an algorithm for finding multi-prize tickets (MPTs) and test it by performing a series of experiments on CIFAR-10 and ImageNet datasets.
Our MPTs-1/32 not only set new binary weight network state-of-the-art (SOTA) Top-1 accuracy -- 94.8% on CIFAR-10 and 74.03% on ImageNet -- but also outperform their full-precision counterparts by 1.78% and 0.76%, respectively.
arXiv Detail & Related papers (2021-03-17T00:31:24Z)
- Searching for Low-Bit Weights in Quantized Neural Networks [129.8319019563356]
Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators.
We propose to regard the discrete weights in an arbitrary quantized neural network as searchable variables, and we utilize a differentiable method to search for them accurately.
arXiv Detail & Related papers (2020-09-18T09:13:26Z)
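As a hedged illustration of the search idea above (not the paper's exact formulation), the sketch below treats each weight's discrete value as a choice over a fixed set of quantization levels and relaxes that choice with a softmax over learned logits, so the search is differentiable end to end; the name QuantSearchLinear and the particular levels are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class QuantSearchLinear(nn.Module):
        """Differentiable search over discrete weight values (illustrative sketch)."""

        def __init__(self, in_features, out_features, levels=(-1.0, -0.5, 0.0, 0.5, 1.0)):
            super().__init__()
            # Fixed set of low-bit levels each weight can take.
            self.register_buffer("levels", torch.tensor(levels))
            # Learned logits over the levels for every connection.
            self.logits = nn.Parameter(torch.zeros(out_features, in_features, len(levels)))

        def forward(self, x, hard=False):
            if hard:
                # Deployment: commit every weight to its most likely level.
                weight = self.levels[self.logits.argmax(dim=-1)]
            else:
                # Search phase: soft mixture of levels, trainable by gradient descent.
                probs = F.softmax(self.logits, dim=-1)
                weight = (probs * self.levels).sum(dim=-1)
            return F.linear(x, weight)

    # Usage: train with soft weights, then evaluate with the discrete (hard) weights.
    layer = QuantSearchLinear(784, 10)
    soft_out = layer(torch.randn(4, 784))
    hard_out = layer(torch.randn(4, 784), hard=True)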
- Training highly effective connectivities within neural networks with randomly initialized, fixed weights [4.56877715768796]
We introduce a novel way of training a network by flipping the signs of its weights.
We obtain good results even when the weights have constant magnitude, or even when the weights are drawn from highly asymmetric distributions.
arXiv Detail & Related papers (2020-06-30T09:41:18Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
- Proving the Lottery Ticket Hypothesis: Pruning is All You Need [56.25432563818297]
The lottery ticket hypothesis states that a randomly-initialized network contains a small subnetwork which, when trained in isolation, can compete with the performance of the original network.
We prove an even stronger hypothesis, showing that for every bounded distribution and every target network with bounded weights, a sufficiently over-parameterized neural network with random weights contains a subnetwork with roughly the same accuracy as the target network, without any further training.
arXiv Detail & Related papers (2020-02-03T07:23:11Z)
- RPR: Random Partition Relaxation for Training; Binary and Ternary Weight Neural Networks [23.45606380793965]
We present Random Partition Relaxation (RPR), a method for strong quantization of neural network weights to binary (+1/-1) and ternary (+1/0/-1) values.
We demonstrate binary and ternary-weight networks with accuracies beyond the state-of-the-art for GoogLeNet and competitive performance for ResNet-18 and ResNet-50.
arXiv Detail & Related papers (2020-01-04T15:56:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.