Logarithmic Pruning is All You Need
- URL: http://arxiv.org/abs/2006.12156v2
- Date: Sun, 25 Oct 2020 18:45:05 GMT
- Title: Logarithmic Pruning is All You Need
- Authors: Laurent Orseau, Marcus Hutter, Omar Rivasplata
- Abstract summary: Lottery Ticket Hypothesis: Every large neural network contains a subnetwork that, when trained in isolation, achieves comparable performance to the large network.
An even stronger result has been proven recently: a sufficiently overparameterized network contains such a subnetwork at random initialization, without any training. This latter result, however, relies on a number of strong assumptions and only guarantees a polynomial factor on the size of the large network compared to the target function.
- Score: 30.330326149079326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Lottery Ticket Hypothesis is a conjecture that every large neural network
contains a subnetwork that, when trained in isolation, achieves comparable
performance to the large network. An even stronger conjecture has been proven
recently: Every sufficiently overparameterized network contains a subnetwork
that, at random initialization, but without training, achieves comparable
accuracy to the trained large network. This latter result, however, relies on a
number of strong assumptions and guarantees a polynomial factor on the size of
the large network compared to the target function. In this work, we remove the
most limiting assumptions of this previous work while providing significantly
tighter bounds: the overparameterized network only needs a number of neurons per
weight of the target subnetwork that is logarithmic in all variables but depth.
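The key primitive behind bounds of this kind is weight matching by selection alone: a fixed target weight can be approximated to accuracy $\epsilon$ by summing a chosen subset of only about $\log(1/\epsilon)$ random, untrained candidate weights. The following is a minimal sketch of that primitive in Python, assuming numpy is available; the uniform candidate distribution, the brute-force subset-sum search, and the function name `best_subset_sum` are illustrative choices, not the paper's actual construction.

```python
# Illustrative sketch: approximate a target weight by *selecting* (pruning down to)
# a subset of random, untrained candidate weights. The candidate distribution and
# brute-force search are chosen for clarity, not taken from the paper.
import itertools
import numpy as np

def best_subset_sum(candidates, target):
    """Return the smallest error achievable by summing any subset of `candidates`."""
    best_err = abs(target)  # empty subset
    for r in range(1, len(candidates) + 1):
        for idx in itertools.combinations(range(len(candidates)), r):
            err = abs(candidates[list(idx)].sum() - target)
            best_err = min(best_err, err)
    return best_err

rng = np.random.default_rng(0)
target = 0.31415
for n in (4, 8, 12, 16):                     # number of random candidates per weight
    candidates = rng.uniform(-1, 1, size=n)  # random, untrained weights
    err = best_subset_sum(candidates, target)
    print(f"n={n:2d} candidates -> approximation error {err:.2e}")
```

Empirically the error shrinks roughly geometrically in the number of candidates, i.e. about $\log(1/\epsilon)$ random weights per target weight suffice, which is the logarithmic-factor message of the paper in its simplest form.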
Related papers
- Network Degeneracy as an Indicator of Training Performance: Comparing
Finite and Infinite Width Angle Predictions [3.04585143845864]
We show that as networks get deeper, they become more susceptible to degeneracy.
We use a simple algorithm that can accurately predict the level of degeneracy for any given fully connected ReLU network architecture.
arXiv Detail & Related papers (2023-06-02T13:02:52Z) - On the High Symmetry of Neural Network Functions [0.0]
Training neural networks means solving a high-dimensional optimization problem.
This paper shows that, due to how neural networks are designed, the neural network function presents a very large symmetry in the parameter space.
arXiv Detail & Related papers (2022-11-12T07:51:14Z) - Robust Training and Verification of Implicit Neural Networks: A
Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z) - On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z) - Finding Everything within Random Binary Networks [11.689913953698081]
We prove that any target network can be approximated up to arbitrary accuracy by simply pruning a random network of binary $\pm 1$ weights that is only a polylogarithmic factor wider and deeper than the target network.
arXiv Detail & Related papers (2021-10-18T03:19:25Z) - It's Hard for Neural Networks To Learn the Game of Life [4.061135251278187]
Recent findings suggest that neural networks rely on lucky random initial weights of "lottery tickets" that converge quickly to a solution.
We examine small convolutional networks that are trained to predict n steps of the two-dimensional cellular automaton Conway's Game of Life.
We find that networks of this architecture trained on this task rarely converge (see the Game of Life sketch after this list).
arXiv Detail & Related papers (2020-09-03T00:47:08Z) - Finite Versus Infinite Neural Networks: an Empirical Study [69.07049353209463]
Kernel methods outperform fully-connected finite-width networks.
Centered and ensembled finite networks have reduced posterior variance.
Weight decay and the use of a large learning rate break the correspondence between finite and infinite networks.
arXiv Detail & Related papers (2020-07-31T01:57:47Z) - ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z) - Fitting the Search Space of Weight-sharing NAS with Graph Convolutional
Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z) - Proving the Lottery Ticket Hypothesis: Pruning is All You Need [56.25432563818297]
The lottery ticket hypothesis states that a randomly-initialized network contains a small subnetwork that, when trained in isolation, can compete with the performance of the original network.
We prove an even stronger hypothesis, showing that for every bounded distribution and every target network with bounded weights, a sufficiently over-parameterized neural network with random weights contains a subnetwork with roughly the same accuracy as the target network, without any further training.
arXiv Detail & Related papers (2020-02-03T07:23:11Z)
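For reference, the task in "It's Hard for Neural Networks To Learn the Game of Life" above is the standard two-state, two-dimensional cellular automaton. A minimal one-step update is sketched below, assuming numpy and scipy are available; this is only the textbook rule, not the convolutional networks or training setup studied in that paper.

```python
# One step of Conway's Game of Life on a 0/1 grid (textbook rule, for reference).
import numpy as np
from scipy.signal import convolve2d

def life_step(grid):
    """Apply one Game of Life update to a 0/1 numpy array (dead border)."""
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])
    # Count the 8 neighbours of every cell; the border is treated as dead.
    neighbours = convolve2d(grid, kernel, mode="same", boundary="fill", fillvalue=0)
    # A cell is alive next step iff it has exactly 3 live neighbours,
    # or it is currently alive and has exactly 2 live neighbours.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(int)

# Example: a glider translates one cell diagonally every 4 steps.
state = np.zeros((8, 8), dtype=int)
state[1, 2] = state[2, 3] = state[3, 1] = state[3, 2] = state[3, 3] = 1
for _ in range(4):
    state = life_step(state)
print(state)
```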