On the Number of Regions of Piecewise Linear Neural Networks
- URL: http://arxiv.org/abs/2206.08615v2
- Date: Wed, 20 Dec 2023 16:47:57 GMT
- Title: On the Number of Regions of Piecewise Linear Neural Networks
- Authors: Alexis Goujon, Arian Etemadi and Michael Unser
- Abstract summary: Many feedforward neural networks (NNs) generate continuous and piecewise-linear (CPWL) mappings.
The number of these so-called linear regions offers a natural metric to characterize the expressiveness of CPWL NNs.
We introduce a complementary framework to estimate the average number of linear regions produced by a CPWL NN.
- Score: 16.78532039510369
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many feedforward neural networks (NNs) generate continuous and
piecewise-linear (CPWL) mappings. Specifically, they partition the input domain
into regions on which the mapping is affine. The number of these so-called
linear regions offers a natural metric to characterize the expressiveness of
CPWL NNs. The precise determination of this quantity is often out of reach in
practice, and bounds have been proposed for specific architectures, including
for ReLU and Maxout NNs. In this work, we generalize these bounds to NNs with
arbitrary and possibly multivariate CPWL activation functions. We first provide
upper and lower bounds on the maximal number of linear regions of a CPWL NN
given its depth, width, and the number of linear regions of its activation
functions. Our results rely on the combinatorial structure of convex partitions
and confirm the distinctive role of depth which, on its own, is able to
exponentially increase the number of regions. We then introduce a complementary
stochastic framework to estimate the average number of linear regions produced
by a CPWL NN. Under reasonable assumptions, the expected density of linear
regions along any 1D path is bounded by the product of depth, width, and a
measure of activation complexity (up to a scaling factor). In this bound, the three
sources of expressiveness thus play an identical role: no exponential growth with
depth is observed anymore.
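To make the last quantity concrete, here is a minimal numerical sketch (not taken from the paper; the setup and function names are illustrative): it counts how many linear regions a small, randomly initialized ReLU MLP crosses along a 1D segment by tracking changes of the ReLU activation pattern at finely sampled points. Coarse sampling can miss narrow regions, so the result is a lower estimate of the true count.

```python
# Sketch: empirically count the linear regions a random ReLU MLP crosses along a
# 1D segment by detecting changes in the activation pattern at sampled points.
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(widths):
    """Random weights for an MLP with the given layer widths, e.g. [2, 16, 16, 1]."""
    return [(rng.standard_normal((m, n)) / np.sqrt(n), rng.standard_normal(m))
            for n, m in zip(widths[:-1], widths[1:])]

def activation_pattern(params, x):
    """Binary on/off pattern of every ReLU unit at input x."""
    pattern, h = [], x
    for W, b in params[:-1]:          # hidden layers only
        z = W @ h + b
        pattern.append(z > 0)
        h = np.maximum(z, 0.0)
    return np.concatenate(pattern)

def regions_on_segment(params, x0, x1, n_samples=10_000):
    """Count distinct consecutive activation patterns along the segment [x0, x1]."""
    prev, count = None, 0
    for t in np.linspace(0.0, 1.0, n_samples):
        pat = activation_pattern(params, (1 - t) * x0 + t * x1)
        if prev is None or not np.array_equal(pat, prev):
            count += 1
            prev = pat
    return count

params = init_mlp([2, 16, 16, 1])
x0, x1 = rng.standard_normal(2), rng.standard_normal(2)
print("linear regions crossed along the segment:", regions_on_segment(params, x0, x1))
```

Increasing n_samples refines the estimate, and averaging the count over many random segments gives an empirical analogue of the density of linear regions along 1D paths discussed in the abstract.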
Related papers
- Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH).
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH.
arXiv Detail & Related papers (2024-08-20T15:04:37Z)
- Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
arXiv Detail & Related papers (2023-05-25T14:17:15Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z)
- A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks [49.870593940818715]
We study the infinite-width limit of a type of three-layer NN model whose first layer is random and fixed.
Our theory accommodates different scaling choices of the model, resulting in two regimes of the MF limit that demonstrate distinctive behaviors.
arXiv Detail & Related papers (2022-10-28T17:26:27Z)
- Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
The study of the NTK has been devoted to typical neural network architectures, but it is incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z)
- Lower and Upper Bounds for Numbers of Linear Regions of Graph Convolutional Networks [11.338307976409707]
The number of linear regions has been considered a good measure for the expressivity of neural networks with piecewise linear activation.
We present estimates for the number of linear regions of classic graph convolutional networks (GCNs) in both the one-layer and multiple-layer scenarios.
arXiv Detail & Related papers (2022-06-01T04:32:23Z)
- A General Computational Framework to Measure the Expressiveness of Complex Networks Using a Tighter Upper Bound of Linear Regions [13.030269373463168]
The upper bound on the number of regions partitioned by a rectifier network, rather than the number itself, is a more practical measure of the expressiveness of a DNN.
We propose a new and tighter upper bound on the number of regions for any network structure.
Our experiments show that our upper bound is tighter than existing ones and explain why skip connections and residual structures can improve network performance.
arXiv Detail & Related papers (2020-12-08T14:01:20Z)
- On the Number of Linear Functions Composing Deep Neural Network: Towards a Refined Definition of Neural Networks Complexity [6.252236971703546]
We introduce an equivalence relation among the linear functions composing a piecewise linear function and then count those linear functions relative to that equivalence relation.
Our new complexity measure can clearly distinguish between the two models, is consistent with the classical measure, and increases exponentially with depth.
arXiv Detail & Related papers (2020-10-23T01:46:12Z)
- Bounding The Number of Linear Regions in Local Area for Neural Networks with ReLU Activations [6.4817648240626005]
We present the first method to estimate the upper bound of the number of linear regions in any sphere in the input space of a given ReLU neural network.
Our experiments showed that, while training a neural network, the boundaries of the linear regions tend to move away from the training data points.
arXiv Detail & Related papers (2020-07-14T04:06:00Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation functions.
We experimentally explore the training process of neural networks and detect overfitting.
We find that $L_1$ and $L_2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
- Empirical Studies on the Properties of Linear Regions in Deep Neural Networks [34.08593191989188]
A deep neural network (DNN) with piecewise linear activations can partition the input space into numerous small linear regions.
It is believed that the number of these regions represents the expressivity of the DNN.
We study their local properties, such as the inspheres, the directions of the corresponding hyperplanes, the decision boundaries, and the relevance of the surrounding regions.
arXiv Detail & Related papers (2020-01-04T12:47:58Z)
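As a complement to the entry above, the following illustrative sketch (assumptions: a plain ReLU MLP with random weights; this is not code from any of the listed papers) recovers the local affine map A x + c that a ReLU network realizes on the linear region containing a given input, by freezing the activation pattern at that input.

```python
# Sketch: within one linear region a ReLU network coincides with an affine map
# f(x) = A @ x + c. Recover (A, c) by replacing each ReLU with its 0/1 mask at x.
import numpy as np

rng = np.random.default_rng(1)

def init_mlp(widths):
    """Random weights for an MLP with the given layer widths."""
    return [(rng.standard_normal((m, n)) / np.sqrt(n), rng.standard_normal(m))
            for n, m in zip(widths[:-1], widths[1:])]

def local_affine_map(params, x):
    """Return (A, c) such that the network equals A @ x + c on x's linear region."""
    A, c, h = np.eye(len(x)), np.zeros(len(x)), x
    for i, (W, b) in enumerate(params):
        z = W @ h + b
        A, c = W @ A, W @ c + b
        if i < len(params) - 1:              # hidden layer: apply the frozen ReLU mask
            mask = (z > 0).astype(float)
            A, c = mask[:, None] * A, mask * c
            h = np.maximum(z, 0.0)
    return A, c

def forward(params, x):
    """Ordinary forward pass, used only as a sanity check."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(W @ h + b, 0.0)
    W, b = params[-1]
    return W @ h + b

params = init_mlp([3, 8, 8, 2])
x = rng.standard_normal(3)
A, c = local_affine_map(params, x)
assert np.allclose(A @ x + c, forward(params, x))   # affine map matches the network at x
print("local affine map:", A.shape, c.shape)
```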