On the Number of Linear Regions of Convolutional Neural Networks
- URL: http://arxiv.org/abs/2006.00978v2
- Date: Sat, 27 Jun 2020 10:24:11 GMT
- Title: On the Number of Linear Regions of Convolutional Neural Networks
- Authors: H. Xiong, L. Huang, M. Yu, L. Liu, F. Zhu, and L. Shao
- Abstract summary: Our results suggest that deeper CNNs have more powerful expressivity than their shallow counterparts, while CNNs have more expressivity than fully-connected NNs per parameter.
- Score: 0.6206641883102021
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One fundamental problem in deep learning is understanding the outstanding
performance of deep Neural Networks (NNs) in practice. One explanation for the
superiority of NNs is that they can realize a large class of complicated
functions, i.e., they have powerful expressivity. The expressivity of a ReLU NN
can be quantified by the maximal number of linear regions into which it can
separate its input space. In this paper, we provide several mathematical results needed
for studying the linear regions of CNNs, and use them to derive the maximal and
average numbers of linear regions for one-layer ReLU CNNs. Furthermore, we
obtain upper and lower bounds for the number of linear regions of multi-layer
ReLU CNNs. Our results suggest that deeper CNNs have more powerful expressivity
than their shallow counterparts, while CNNs have more expressivity than
fully-connected NNs per parameter.
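As a concrete illustration of the quantity the abstract studies, the following Python sketch counts the linear regions of a tiny one-layer ReLU network by enumerating the activation patterns reached on a dense 2-D input grid and compares the count with the classical hyperplane-arrangement upper bound sum_{i=0}^{d} C(n, i). The weights, grid, and dimensions are illustrative assumptions; this is not the paper's derivation for CNNs.

```python
# Minimal sketch (illustrative, not the paper's construction): for a
# one-layer ReLU net f(x) = ReLU(W x + b), each linear region corresponds to
# a distinct activation (sign) pattern of W x + b. We count the patterns hit
# on a dense grid over a 2-D input and compare with the classical
# hyperplane-arrangement upper bound sum_{i=0}^{d} C(n, i) for n units in d dims.
import itertools
from math import comb

import numpy as np

rng = np.random.default_rng(0)
d, n = 2, 6                      # input dimension, number of ReLU units (assumed)
W = rng.standard_normal((n, d))  # random weights, purely illustrative
b = rng.standard_normal(n)

# Enumerate the activation patterns reached on a dense grid over [-3, 3]^2.
grid = np.linspace(-3.0, 3.0, 400)
points = np.array(list(itertools.product(grid, grid)))        # shape (400*400, 2)
patterns = {tuple(row) for row in (points @ W.T + b > 0).astype(np.int8)}

# The grid count is a lower estimate: regions missed by the grid are not counted.
upper_bound = sum(comb(n, i) for i in range(d + 1))
print(f"regions hit by the grid: {len(patterns)}  (theoretical maximum: {upper_bound})")
```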
Related papers
- Model Parallel Training and Transfer Learning for Convolutional Neural Networks by Domain Decomposition [0.0]
Deep convolutional neural networks (CNNs) have been shown to be very successful in a wide range of image processing applications.
Due to their growing number of model parameters and the increasing availability of large amounts of training data, parallelization strategies for efficiently training complex CNNs are necessary.
arXiv Detail & Related papers (2024-08-26T17:35:01Z)
- CNN2GNN: How to Bridge CNN with GNN [59.42117676779735]
We propose a novel CNN2GNN framework that unifies CNNs and GNNs via distillation.
The performance of the distilled "boosted" two-layer GNN on Mini-ImageNet is much higher than that of CNNs with dozens of layers, such as ResNet152.
arXiv Detail & Related papers (2024-04-23T08:19:08Z)
- Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
The study of the neural tangent kernel (NTK) has been devoted to typical neural network architectures, but it is incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z)
- What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z)
- Lower and Upper Bounds for Numbers of Linear Regions of Graph Convolutional Networks [11.338307976409707]
The number of linear regions has been considered a good measure for the expressivity of neural networks with piecewise linear activation.
We present estimates of the number of linear regions of classic graph convolutional networks (GCNs) in both one-layer and multi-layer scenarios.
arXiv Detail & Related papers (2022-06-01T04:32:23Z)
- Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z)
- An Alternative Practice of Tropical Convolution to Traditional Convolutional Neural Networks [0.5837881923712392]
We propose a new type of CNN called Tropical Convolutional Neural Networks (TCNNs).
TCNNs are built on tropical convolutions, in which the multiplications and additions in conventional convolutional layers are replaced by additions and min/max operations, respectively; a minimal sketch of this operation is given below.
We show that TCNNs can achieve higher expressive power than ordinary convolutional layers on the MNIST and CIFAR-10 image datasets.
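The tropical (min-plus / max-plus) operation described in this summary can be illustrated with a short Python sketch. This is an assumed toy implementation for intuition, not the TCNN authors' code; the array shapes and values are hypothetical.

```python
# Toy 1-D comparison of an ordinary convolution with a tropical (min-plus)
# one: products become sums and the summation becomes a minimum
# (a max-plus layer would use np.max instead of np.min).
import numpy as np

def conv1d(x, w):
    """Ordinary 'valid' 1-D correlation: sum of products over each window."""
    k = len(w)
    return np.array([np.sum(x[i:i + k] * w) for i in range(len(x) - k + 1)])

def tropical_min_conv1d(x, w):
    """Min-plus 1-D 'convolution': minimum of sums over each window."""
    k = len(w)
    return np.array([np.min(x[i:i + k] + w) for i in range(len(x) - k + 1)])

x = np.array([1.0, 4.0, 2.0, 0.5, 3.0])   # hypothetical input signal
w = np.array([0.5, -1.0, 2.0])            # hypothetical kernel
print("standard :", conv1d(x, w))
print("min-plus :", tropical_min_conv1d(x, w))
```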
arXiv Detail & Related papers (2021-03-03T00:13:30Z)
- Shape-Tailored Deep Neural Networks [87.55487474723994]
We present Shape-Tailored Deep Neural Networks (ST-DNNs).
ST-DNNs extend convolutional networks (CNNs), which aggregate data from fixed-shape (square) neighborhoods, to compute descriptors defined on arbitrarily shaped regions.
We show that ST-DNNs are 3-4 orders of magnitude smaller than the CNNs used for segmentation.
arXiv Detail & Related papers (2021-02-16T23:32:14Z)
- A General Computational Framework to Measure the Expressiveness of Complex Networks Using a Tighter Upper Bound of Linear Regions [13.030269373463168]
The upper bound on the number of regions partitioned by a rectifier network, rather than the number itself, is a more practical measure of the expressiveness of a DNN.
We propose a new and tighter upper bound on the number of regions for arbitrary network structures.
Our experiments show that our upper bound is tighter than existing ones, and they explain why skip connections and residual structures can improve network performance.
arXiv Detail & Related papers (2020-12-08T14:01:20Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.