Computational Separation Between Convolutional and Fully-Connected
Networks
- URL: http://arxiv.org/abs/2010.01369v1
- Date: Sat, 3 Oct 2020 14:24:59 GMT
- Title: Computational Separation Between Convolutional and Fully-Connected
Networks
- Authors: Eran Malach, Shai Shalev-Shwartz
- Abstract summary: We show how convolutional networks can leverage locality in the data, and thus achieve a computational advantage over fully-connected networks.
Specifically, we show a class of problems that can be efficiently solved using convolutional networks trained with gradient-descent.
- Score: 35.39956227364153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural networks (CNN) exhibit unmatched performance in a
multitude of computer vision tasks. However, the advantage of using
convolutional networks over fully-connected networks is not understood from a
theoretical perspective. In this work, we show how convolutional networks can
leverage locality in the data, and thus achieve a computational advantage over
fully-connected networks. Specifically, we show a class of problems that can be
efficiently solved using convolutional networks trained with gradient-descent,
but at the same time is hard to learn using a polynomial-size fully-connected
network.
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z) - A Generalization of Continuous Relaxation in Structured Pruning [0.3277163122167434]
Trends indicate that deeper and larger neural networks with an increasing number of parameters achieve higher accuracy than smaller neural networks.
We generalize structured pruning with algorithms for network augmentation, pruning, sub-network collapse and removal.
The resulting CNN executes efficiently on GPU hardware without computationally expensive sparse matrix operations.
arXiv Detail & Related papers (2023-08-28T14:19:13Z) - Local Kernel Renormalization as a mechanism for feature learning in
overparametrized Convolutional Neural Networks [0.0]
Empirical evidence shows that fully-connected neural networks in the infinite-width limit eventually outperform their finite-width counterparts.
State-of-the-art architectures with convolutional layers achieve optimal performances in the finite-width regime.
We show that the generalization performance of a finite-width FC network can be obtained by an infinite-width network, with a suitable choice of the Gaussian priors.
arXiv Detail & Related papers (2023-07-21T17:22:04Z) - Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z) - Multirate Training of Neural Networks [0.0]
We show that for various transfer learning applications in vision and NLP we can fine-tune deep neural networks in almost half the time.
We propose an additional multirate technique which can learn different features present in the data by training the full network on different time scales simultaneously.
arXiv Detail & Related papers (2021-06-20T22:44:55Z) - The Principles of Deep Learning Theory [19.33681537640272]
This book develops an effective theory approach to understanding deep neural networks of practical relevance.
We explain how these effectively-deep networks learn nontrivial representations from training.
We show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks.
arXiv Detail & Related papers (2021-06-18T15:00:00Z) - Learning Autonomy in Management of Wireless Random Networks [102.02142856863563]
This paper presents a machine learning strategy that tackles a distributed optimization task in a wireless network with an arbitrary number of randomly interconnected nodes.
We develop a flexible deep neural network formalism termed distributed message-passing neural network (DMPNN) with forward and backward computations independent of the network topology.
arXiv Detail & Related papers (2021-06-15T09:03:28Z) - Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of using the same path of the network, DG-Net aggregates features dynamically in each node, which allows the network to have more representation ability.
arXiv Detail & Related papers (2020-10-02T16:50:26Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z) - Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z) - Disentangling Trainability and Generalization in Deep Neural Networks [45.15453323967438]
We analyze the spectrum of the Neural Tangent Kernel (NTK) for trainability and generalization across a range of networks.
We find that CNNs without global average pooling behave almost identically to FCNs, but that CNNs with pooling have markedly different and often better generalization performance.
arXiv Detail & Related papers (2019-12-30T18:53:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.