Is Each Layer Non-trivial in CNN?
- URL: http://arxiv.org/abs/2009.09938v2
- Date: Thu, 3 Dec 2020 02:23:09 GMT
- Title: Is Each Layer Non-trivial in CNN?
- Authors: Wei Wang, Yanjie Zhu, Zhuoxu Cui, Dong Liang
- Abstract summary: Convolutional neural network (CNN) models have achieved great success in many fields.
With the advent of ResNet, networks used in practice are getting deeper and wider.
We train a network on the training set, then replace some of its convolution kernels with zeros and test the resulting models on the test set.
- Score: 11.854634156817642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural network (CNN) models have achieved great success in many
fields. With the advent of ResNet, networks used in practice are getting deeper
and wider. However, is each layer non-trivial in these networks? To answer this
question, we train a network on the training set, then replace some of its
convolution kernels with zeros and test the resulting models on the test set.
We compare the experimental results with the baseline and show that similar or
even identical performance can be reached. Although convolution kernels are the
core of a network, we demonstrate that some of them are trivial and regular in
ResNet.
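A minimal sketch of the zero-kernel test described in the abstract, assuming PyTorch and a torchvision ResNet; which layers are zeroed here is an illustrative choice, not the authors' exact protocol:

```python
import torch
import torch.nn as nn
from torchvision import models

# A trained ResNet (a torchvision checkpoint stands in for the network
# trained on the training set).
model = models.resnet18(pretrained=True)
model.eval()

# Replace a subset of convolution kernels with zeros; zeroing the last
# residual stage is a hypothetical choice for illustration.
with torch.no_grad():
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d) and name.startswith("layer4"):
            module.weight.zero_()

# Re-evaluate the modified model on the test set and compare against the
# unmodified baseline (data loader construction omitted).
def top1_accuracy(model, loader):
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total
```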
Related papers
- Stitching for Neuroevolution: Recombining Deep Neural Networks without Breaking Them [0.0]
Traditional approaches to neuroevolution often start from scratch.
Recombining trained networks is non-trivial because architectures and feature representations typically differ.
We employ stitching, which merges the networks by introducing new layers at crossover points.
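A rough sketch of the stitching idea, assuming PyTorch; the class and dimension names are hypothetical, and only the new stitching layer is meant to be trained:

```python
import torch.nn as nn

class StitchedNetwork(nn.Module):
    """Front half of one trained network joined to the back half of another."""
    def __init__(self, front, back, front_dim, back_dim):
        super().__init__()
        self.front = front                            # layers taken from network A
        self.stitch = nn.Linear(front_dim, back_dim)  # new layer at the crossover point
        self.back = back                              # layers taken from network B

    def forward(self, x):
        h = self.front(x)
        h = self.stitch(h)  # translates A's feature representation into B's
        return self.back(h)
```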
arXiv Detail & Related papers (2024-03-21T08:30:44Z)
- You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets [105.24703398193843]
Untrained subnetworks in graph neural networks (GNNs) still remain mysterious.
We show that the found untrained subnetworks can substantially mitigate the GNN over-smoothing problem.
We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness to input perturbations.
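A generic sketch of the untrained-subnetwork idea; a plain linear layer stands in for a GNN layer so the example is self-contained, and a random mask stands in for the mask the paper searches for:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Module):
    """Weights stay at random initialization; a binary mask picks a subnetwork."""
    def __init__(self, in_dim, out_dim, sparsity=0.9):
        super().__init__()
        weight = torch.randn(out_dim, in_dim)
        self.register_buffer("weight", weight)  # never trained
        self.register_buffer("mask", (torch.rand_like(weight) > sparsity).float())

    def forward(self, x):
        # The paper searches for a good mask; a random one stands in here.
        return F.linear(x, self.weight * self.mask)
```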
arXiv Detail & Related papers (2022-11-28T14:17:36Z)
- Logits are predictive of network type [47.64219291655723]
It is possible to predict which deep network has generated a given logit vector with accuracy well above chance.
We evaluate a number of networks on a dataset, with random or pretrained weights, as well as fine-tuned networks.
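A schematic version of that setup, assuming scikit-learn and that each model returns an array of logits; the function and variable names are illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def collect_logits(models, inputs):
    """Stack logit vectors from several networks, labeling each by its source network."""
    X, y = [], []
    for label, model in enumerate(models):
        logits = np.asarray(model(inputs))      # assumed shape: (num_samples, num_classes)
        X.append(logits)
        y.append(np.full(len(logits), label))
    return np.concatenate(X), np.concatenate(y)

# X, y = collect_logits([net_a, net_b, net_c], test_inputs)
# X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
# clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# print(clf.score(X_te, y_te))  # well above chance, per the paper
```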
arXiv Detail & Related papers (2022-11-04T05:53:27Z)
- Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers [83.74380713308605]
We develop a new type of transformation that is fully compatible with a variant of ReLUs -- Leaky ReLUs.
We show in experiments that our method, which introduces negligible extra computational cost, achieves validation accuracies with deep vanilla networks that are competitive with ResNets.
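For orientation only, a shortcut-free stack built from Leaky ReLUs; the fixed negative slope below is a placeholder for the per-layer tailoring that is the paper's actual contribution:

```python
import torch.nn as nn

def vanilla_block(width, negative_slope=0.2):  # slope value is illustrative
    return nn.Sequential(nn.Linear(width, width), nn.LeakyReLU(negative_slope))

depth, width = 50, 256                          # hypothetical sizes
deep_vanilla_net = nn.Sequential(*[vanilla_block(width) for _ in range(depth)])
```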
arXiv Detail & Related papers (2022-03-15T17:49:08Z)
- Multipath Graph Convolutional Neural Networks [6.216778442751621]
We propose a novel Multipath Graph convolutional neural network that aggregates the output of multiple different shallow networks.
Results show that the proposed method not only attains increased accuracy but also requires fewer training epochs to converge.
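A bare-bones sketch of the aggregation pattern; plain linear layers stand in for graph convolutions so the example stays self-contained, and summation is just one possible aggregation:

```python
import torch
import torch.nn as nn

class MultipathNet(nn.Module):
    """Several shallow subnetworks whose outputs are aggregated."""
    def __init__(self, in_dim, hidden_dim, out_dim, num_paths=3):
        super().__init__()
        self.paths = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU(),
                          nn.Linear(hidden_dim, out_dim))
            for _ in range(num_paths)
        )

    def forward(self, x):
        # Aggregate the outputs of the shallow networks.
        return torch.stack([path(x) for path in self.paths]).sum(dim=0)
```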
arXiv Detail & Related papers (2021-05-04T14:11:20Z)
- Sifting out the features by pruning: Are convolutional networks the winning lottery ticket of fully connected ones? [16.5745082442791]
We study the inductive bias that pruning imprints in such "winning lottery tickets".
We show that the surviving node connectivity is local in input space, and organized in patterns reminiscent of the ones found in convolutional networks (CNNs).
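A minimal sketch of how such tickets can be obtained from a fully connected network via magnitude pruning (the locality analysis of the surviving connections is not reproduced); sizes and the pruning fraction are illustrative:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small fully connected network over flattened 32x32 images (hypothetical sizes).
fc_net = nn.Sequential(nn.Linear(32 * 32, 512), nn.ReLU(), nn.Linear(512, 10))
# ... train fc_net, then prune the smallest-magnitude input connections ...
prune.l1_unstructured(fc_net[0], name="weight", amount=0.95)  # keep 5% of the weights
surviving_mask = fc_net[0].weight_mask  # which input pixels each hidden unit still sees
```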
arXiv Detail & Related papers (2021-04-27T17:25:54Z)
- Computational Separation Between Convolutional and Fully-Connected Networks [35.39956227364153]
We show how convolutional networks can leverage locality in the data, and thus achieve a computational advantage over fully-connected networks.
Specifically, we show a class of problems that can be efficiently solved using convolutional networks trained with gradient-descent.
arXiv Detail & Related papers (2020-10-03T14:24:59Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the accuracy decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding floating-point networks, but have only 1/4 the memory cost and run 2x faster on modern GPUs.
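A small sketch of a Bounded ReLU as described, assuming PyTorch; the bound value is illustrative:

```python
import torch
import torch.nn as nn

class BoundedReLU(nn.Module):
    """ReLU clipped at an upper bound so activations stay in a quantization-friendly range."""
    def __init__(self, bound=6.0):
        super().__init__()
        self.bound = bound

    def forward(self, x):
        return torch.clamp(x, min=0.0, max=self.bound)
```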
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
- Analyzing Neural Networks Based on Random Graphs [77.34726150561087]
We perform a massive evaluation of neural networks with architectures corresponding to random graphs of various types.
We find that none of the classical numerical graph invariants by itself allows one to single out the best networks.
We also find that networks with primarily short-range connections perform better than networks which allow for many long-range connections.
arXiv Detail & Related papers (2020-02-19T11:04:49Z)
- Disentangling Trainability and Generalization in Deep Neural Networks [45.15453323967438]
We analyze the spectrum of the Neural Tangent Kernel (NTK) for trainability and generalization across a range of networks.
We find that CNNs without global average pooling behave almost identically to FCNs, but that CNNs with pooling have markedly different and often better generalization performance.
arXiv Detail & Related papers (2019-12-30T18:53:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.