On the Existence of Universal Lottery Tickets
- URL: http://arxiv.org/abs/2111.11146v1
- Date: Mon, 22 Nov 2021 12:12:00 GMT
- Title: On the Existence of Universal Lottery Tickets
- Authors: Rebekka Burkholz, Nilanjana Laha, Rajarshi Mukherjee, Alkis Gotovos
- Abstract summary: The lottery ticket hypothesis conjectures the existence of sparse subnetworks of large, randomly initialized deep neural networks that can be successfully trained in isolation.
Recent work has experimentally observed that some of these tickets can be practically reused across a variety of tasks, hinting at some form of universality.
We formalize this concept and theoretically prove that not only do such universal tickets exist but they also do not require further training.
- Score: 2.5234156040689237
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The lottery ticket hypothesis conjectures the existence of sparse subnetworks
of large randomly initialized deep neural networks that can be successfully
trained in isolation. Recent work has experimentally observed that some of
these tickets can be practically reused across a variety of tasks, hinting at
some form of universality. We formalize this concept and theoretically prove
that not only do such universal tickets exist but they also do not require
further training. Our proofs introduce a couple of technical innovations
related to pruning for strong lottery tickets, including extensions of subset
sum results and a strategy to leverage higher amounts of depth. Our explicit
sparse constructions of universal function families might be of independent
interest, as they highlight representational benefits induced by univariate
convolutional architectures.
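The subset sum extensions mentioned in the abstract are the core approximation tool in strong lottery ticket existence proofs: a target weight is matched by the sum of a small subset of random weights, so the pruned subnetwork reproduces the target function without further training. The snippet below is not the paper's construction, only a minimal illustrative sketch of that subset-sum step; the sample count, tolerance, and brute-force search are assumptions made for the example.

```python
import itertools
import random

def subset_sum_approx(samples, target, eps):
    """Brute-force search for a subset of `samples` whose sum lies within
    `eps` of `target`; returns the chosen indices, or None if none exists."""
    for r in range(len(samples) + 1):
        for combo in itertools.combinations(range(len(samples)), r):
            if abs(sum(samples[i] for i in combo) - target) <= eps:
                return combo
    return None

random.seed(0)
# Illustrative setup: with enough uniform(-1, 1) samples, a subset summing
# eps-close to any target in [-1, 1] exists with high probability.
samples = [random.uniform(-1.0, 1.0) for _ in range(15)]
indices = subset_sum_approx(samples, target=0.37, eps=0.01)
print("chosen indices:", indices)
if indices is not None:
    print("approximation:", sum(samples[i] for i in indices))
```

In the strong lottery ticket setting, each approximated target corresponds to one weight of the function being represented, which is why such tickets need pruning but no additional training.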
Related papers
- Iterative Magnitude Pruning as a Renormalisation Group: A Study in The
Context of The Lottery Ticket Hypothesis [0.0]
This thesis focuses on the Lottery Ticket Hypothesis (LTH).
The LTH posits that within extensive Deep Neural Networks (DNNs), smaller, trainable "winning tickets" can achieve performance comparable to the full model.
A key process in LTH, Iterative Magnitude Pruning (IMP), incrementally eliminates the smallest-magnitude weights, emulating stepwise learning in DNNs (a minimal sketch of this loop appears after the related-papers list).
The thesis also examines whether a winning ticket that works well for one specific problem could work well for other, similar problems.
arXiv Detail & Related papers (2023-08-06T14:36:57Z) - Synergies between Disentanglement and Sparsity: Generalization and
Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem.
arXiv Detail & Related papers (2022-11-26T21:02:09Z) - Convolutional and Residual Networks Provably Contain Lottery Tickets [6.68999512375737]
The Lottery Ticket Hypothesis continues to have a profound practical impact on the quest for deep neural networks that solve modern deep learning tasks at competitive performance.
We prove that modern architectures consisting of convolutional and residual layers, which can be equipped with almost arbitrary activation functions, also contain lottery tickets with high probability.
arXiv Detail & Related papers (2022-05-04T22:20:01Z) - Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets [127.56361320894861]
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match the accuracy of the full model.
In this paper, we demonstrate the first positive result that a structurally sparse winning ticket can be effectively found in general.
Specifically, we first "re-fill" pruned elements back in some channels deemed to be important, and then "re-group" non-zero elements to create flexible group-wise structural patterns.
arXiv Detail & Related papers (2022-02-09T21:33:51Z) - Towards strong pruning for lottery tickets with non-zero biases [6.85316573653194]
The lottery ticket hypothesis holds the promise that pruning randomly initialized deep neural networks could offer an efficient alternative to deep learning.
Common parameter initialization schemes and existence proofs, however, are focused on networks with zero biases.
We extend these schemes and existence proofs to non-zero biases, including explicit 'looks-linear' approaches for ReLU activation functions.
arXiv Detail & Related papers (2021-10-21T13:56:04Z) - Universality of Deep Neural Network Lottery Tickets: A Renormalization
Group Perspective [89.19516919095904]
Winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures.
We make use of renormalization group theory, one of the most successful tools in theoretical physics.
We leverage it here to examine winning ticket universality in large-scale lottery ticket experiments, and to shed new light on the success that iterative magnitude pruning has found in the field of sparse machine learning.
arXiv Detail & Related papers (2021-10-07T06:50:16Z) - Towards Understanding Iterative Magnitude Pruning: Why Lottery Tickets
Win [20.97456178983006]
The lottery ticket hypothesis states that sparse subnetworks exist in randomly initialized dense networks that can be trained to the same accuracy as the dense network they reside in.
We show that by using a training method that is stable with respect to linear mode connectivity, large networks can also be entirely rewound to initialization.
arXiv Detail & Related papers (2021-06-13T10:06:06Z) - The Elastic Lottery Ticket Hypothesis [106.79387235014379]
The Lottery Ticket Hypothesis has drawn keen attention to identifying sparse, trainable subnetworks, or winning tickets.
The most effective method to identify such winning tickets is still Iterative Magnitude-based Pruning.
We propose a variety of strategies to tweak the winning tickets found from different networks of the same model family.
arXiv Detail & Related papers (2021-03-30T17:53:45Z) - Coupling-based Invertible Neural Networks Are Universal Diffeomorphism
Approximators [72.62940905965267]
Invertible neural networks based on coupling flows (CF-INNs) have various machine learning applications such as image synthesis and representation learning.
Are CF-INNs universal approximators for invertible functions?
We prove a general theorem to show the equivalence of the universality for certain diffeomorphism classes.
arXiv Detail & Related papers (2020-06-20T02:07:37Z) - Towards Practical Lottery Ticket Hypothesis for Adversarial Training [78.30684998080346]
We show that there exists a subset of such winning-ticket sub-networks that converges significantly faster during the training process.
As a practical application of our findings, we demonstrate that such sub-networks can help in cutting down the total time of adversarial training.
arXiv Detail & Related papers (2020-03-06T03:11:52Z)
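As a companion to the Iterative Magnitude Pruning entry above, here is a minimal sketch of the IMP loop: train, prune the smallest-magnitude surviving weights, rewind the survivors to their initial values, and repeat. This is not any paper's exact procedure; the `train` function is only a placeholder and the per-round pruning fraction is an illustrative assumption.

```python
import numpy as np

def train(weights, mask, noise_scale=0.1):
    """Stand-in for SGD training; it merely perturbs the unpruned weights.
    Replace with a real training loop in practice."""
    return weights + noise_scale * np.random.randn(*weights.shape) * mask

def iterative_magnitude_pruning(init_weights, rounds=3, prune_frac=0.2):
    """Minimal IMP sketch: train, prune the smallest-magnitude surviving
    weights, rewind survivors to their initial values, and repeat."""
    mask = np.ones_like(init_weights)
    weights = init_weights.copy()
    for _ in range(rounds):
        weights = train(weights, mask)
        surviving = np.abs(weights[mask == 1])
        threshold = np.quantile(surviving, prune_frac)
        mask = np.where((np.abs(weights) <= threshold) & (mask == 1), 0.0, mask)
        # Rewind: the candidate winning ticket is the surviving weights
        # reset to their values at initialization.
        weights = init_weights * mask
    return weights, mask

np.random.seed(0)
init = np.random.randn(4, 4)
ticket, mask = iterative_magnitude_pruning(init)
print("sparsity:", 1.0 - mask.mean())
```

The mask together with the rewound initial weights forms the candidate winning ticket whose reuse across related tasks is the transfer question studied in the papers above.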