Sifting out the features by pruning: Are convolutional networks the
winning lottery ticket of fully connected ones?
- URL: http://arxiv.org/abs/2104.13343v1
- Date: Tue, 27 Apr 2021 17:25:54 GMT
- Title: Sifting out the features by pruning: Are convolutional networks the
winning lottery ticket of fully connected ones?
- Authors: Franco Pellegrini, Giulio Biroli
- Abstract summary: We study the inductive bias that pruning imprints in such "winning lottery tickets".
We show that the surviving node connectivity is local in input space, and organized in patterns reminiscent of the ones found in convolutional networks (CNNs).
- Score: 16.5745082442791
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pruning methods can considerably reduce the size of artificial neural
networks without harming their performance. In some cases, they can even
uncover sub-networks that, when trained in isolation, match or surpass the test
accuracy of their dense counterparts. Here we study the inductive bias that
pruning imprints in such "winning lottery tickets". Focusing on visual tasks,
we analyze the architecture resulting from iterative magnitude pruning of a
simple fully connected network (FCN). We show that the surviving node
connectivity is local in input space, and organized in patterns reminiscent of
the ones found in convolutional networks (CNNs). We investigate the role played
by data and tasks in shaping the pruned sub-networks. Our results show that the
winning lottery tickets of FCNs display the key features of CNNs. The ability
of such an automatic network-simplifying procedure to recover the key features
"hand-crafted" in the design of CNNs suggests interesting applications to other
datasets and tasks, in order to discover new and efficient architectural
inductive biases.
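The procedure at the heart of this analysis is iterative magnitude pruning (IMP): train the FCN, prune the smallest-magnitude surviving weights, rewind the remaining weights to their initialization, and repeat. Below is a minimal PyTorch sketch of IMP followed by a crude locality check on the first layer's surviving connections; the network size, prune fraction (20% per round), number of rounds, toy random data, and the spread metric are all illustrative assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

def make_fcn():
    # small fully connected network on flattened 28x28 inputs (illustrative size)
    return nn.Sequential(nn.Flatten(),
                         nn.Linear(784, 256), nn.ReLU(),
                         nn.Linear(256, 10))

def train(model, x, y, masks, epochs=10):
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        for layer, mask in masks.items():
            layer.weight.grad *= mask       # keep pruned weights frozen at zero
        opt.step()

# toy stand-in for the visual task (assumption: the real experiments use image data)
x = torch.randn(64, 1, 28, 28)
y = torch.randint(0, 10, (64,))

model = make_fcn()
init_state = {k: v.clone() for k, v in model.state_dict().items()}
linear_layers = [m for m in model.modules() if isinstance(m, nn.Linear)]
masks = {m: torch.ones_like(m.weight) for m in linear_layers}

for _ in range(5):                          # IMP rounds (illustrative count)
    train(model, x, y, masks)
    for layer in linear_layers:             # drop ~20% of the surviving weights
        mag = (layer.weight * masks[layer]).abs()
        kept = mag[masks[layer].bool()]
        k = max(1, int(0.2 * kept.numel()))
        thresh = kept.kthvalue(k).values
        masks[layer] = (mag > thresh).float()
    model.load_state_dict(init_state)       # rewind to the original initialization
    with torch.no_grad():
        for layer in linear_layers:
            layer.weight *= masks[layer]    # masked-at-init network: the candidate ticket

# Crude locality check (an assumption, not the paper's exact metric): how spatially
# spread out are the surviving input pixels feeding each first-layer hidden unit?
first_mask = masks[linear_layers[0]].reshape(-1, 28, 28)
ys, xs = torch.meshgrid(torch.arange(28.0), torch.arange(28.0), indexing="ij")
for unit in first_mask[:3]:
    n = unit.sum()
    if n > 0:
        cy, cx = (ys * unit).sum() / n, (xs * unit).sum() / n
        spread = ((((ys - cy) ** 2 + (xs - cx) ** 2) * unit).sum() / n).sqrt()
        print(f"surviving inputs: {int(n)}, spatial spread (pixels): {spread:.2f}")
```

In the paper's setting, the masked-at-initialization network would then be retrained in isolation, and the surviving first-layer connectivity examined for the local, CNN-like patterns the abstract describes.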
Related papers
- Investigating Sparsity in Recurrent Neural Networks [0.0]
This thesis investigates the effects of pruning and of Sparse Recurrent Neural Networks on the performance of RNNs.
We first describe the pruning of RNNs, its impact on the performance of RNNs, and the number of training epochs required to regain accuracy after the pruning is performed.
Next, we continue with the creation and training of Sparse Recurrent Neural Networks and identify the relation between the performance and the graph properties of its underlying arbitrary structure.
arXiv Detail & Related papers (2024-07-30T07:24:58Z) - Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z) - You Can Have Better Graph Neural Networks by Not Training Weights at
All: Finding Untrained GNNs Tickets [105.24703398193843]
Untrained subnetworks in graph neural networks (GNNs) still remain mysterious.
We show that the found untrained subnetworks can substantially mitigate the GNN over-smoothing problem.
We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness to input perturbations.
arXiv Detail & Related papers (2022-11-28T14:17:36Z) - What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z) - The Lottery Ticket Hypothesis for Self-attention in Convolutional Neural
Network [69.54809052377189]
Recently, many plug-and-play self-attention modules (SAMs) have been proposed to enhance model generalization by exploiting the internal information of deep convolutional neural networks (CNNs).
We empirically find and verify some counterintuitive phenomena: (a) connecting the SAMs to all the blocks may not always bring the largest performance boost, and connecting them to only some of the blocks can be even better; (b) adding SAMs to a CNN may not always bring a performance boost, and may even harm the performance of the original CNN backbone.
arXiv Detail & Related papers (2022-07-16T07:08:59Z) - SAR Despeckling Using Overcomplete Convolutional Networks [53.99620005035804]
Despeckling is an important problem in remote sensing, as speckle degrades SAR images.
Recent studies show that convolutional neural networks (CNNs) outperform classical despeckling methods.
This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field.
We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.
arXiv Detail & Related papers (2022-05-31T15:55:37Z) - Entangled q-Convolutional Neural Nets [0.0]
We introduce a machine learning model, the q-CNN model, sharing key features with convolutional neural networks and admitting a tensor network description.
As examples, we apply q-CNN to the MNIST and Fashion MNIST classification tasks.
We explain how the network associates a quantum state to each classification label, and study the entanglement structure of these network states.
arXiv Detail & Related papers (2021-03-06T02:35:52Z) - Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embedding of a CNN using anti-aliasing or low-pass filters (a generic sketch of this idea appears after this list).
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z) - Internal representation dynamics and geometry in recurrent neural
networks [10.016265742591674]
We show how a vanilla RNN implements a simple classification task by analysing the dynamics of the network.
We find that early internal representations are evocative of the real labels of the data, but this information is not directly accessible to the output layer.
arXiv Detail & Related papers (2020-01-09T23:19:21Z)
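The "Curriculum By Smoothing" entry above describes smoothing a CNN's feature embedding with anti-aliasing or low-pass filters whose effect fades as training progresses. A generic, hedged sketch of that idea is shown below; the block structure, kernel size, initial sigma, and linear annealing schedule are illustrative assumptions rather than that paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gaussian_kernel(sigma, size=5):
    # 2D Gaussian low-pass kernel, normalized to sum to 1
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    return torch.outer(g, g)

class BlurredConvBlock(nn.Module):
    """Conv -> ReLU -> low-pass filter on the feature maps; sigma is annealed from outside."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.sigma = 1.0                                   # strong smoothing at the start
    def forward(self, x):
        x = F.relu(self.conv(x))
        if self.sigma > 1e-3:
            k = gaussian_kernel(self.sigma).to(x.device)
            k = k.repeat(x.shape[1], 1, 1, 1)              # one depthwise kernel per channel
            x = F.conv2d(x, k, padding=2, groups=x.shape[1])
        return x

# tiny CNN built from the blurred blocks (illustrative architecture)
net = nn.Sequential(BlurredConvBlock(1, 16), nn.MaxPool2d(2),
                    BlurredConvBlock(16, 32), nn.AdaptiveAvgPool2d(1),
                    nn.Flatten(), nn.Linear(32, 10))

def anneal(net, epoch, total_epochs):
    # linearly decay sigma toward 0 so the low-pass filter fades out over training
    for m in net.modules():
        if isinstance(m, BlurredConvBlock):
            m.sigma = 1.0 * (1 - epoch / total_epochs)
```

Calling anneal(net, epoch, total_epochs) once per epoch reduces sigma from 1.0 toward 0, so the low-pass filtering is strongest early in training and disappears by the end.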