On the Number of Linear Functions Composing Deep Neural Network: Towards
a Refined Definition of Neural Networks Complexity
- URL: http://arxiv.org/abs/2010.12125v2
- Date: Thu, 25 Feb 2021 06:22:35 GMT
- Title: On the Number of Linear Functions Composing Deep Neural Network: Towards
a Refined Definition of Neural Networks Complexity
- Authors: Yuuki Takai, Akiyoshi Sannai, Matthieu Cordonnier
- Abstract summary: We introduce an equivalence relation among the linear functions composing a piecewise linear function and then count those linear functions relative to that equivalence relation.
Our new complexity measure can clearly distinguish between permutation-invariant networks and their standard counterparts, is consistent with the classical measure, and increases exponentially with depth.
- Score: 6.252236971703546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The classical approach to measure the expressive power of deep neural
networks with piecewise linear activations is based on counting their maximum
number of linear regions. This complexity measure is quite relevant to
understand general properties of the expressivity of neural networks such as
the benefit of depth over width. Nevertheless, it appears limited when it comes
to comparing the expressivity of different network architectures. This
limitation becomes particularly prominent when considering permutation-invariant
networks, due to the symmetric redundancy among their linear regions. To tackle this, we
propose a refined definition of piecewise linear function complexity: instead
of counting the number of linear regions directly, we first introduce an
equivalence relation among the linear functions composing a piecewise linear
function and then count those linear functions relative to that equivalence
relation. Our new complexity measure can clearly distinguish between
permutation-invariant networks and their standard counterparts, is consistent
with the classical measure, and increases exponentially with depth.
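To make the contrast concrete, the following minimal sketch (Python/NumPy) counts, for the toy permutation-invariant map f(x1, x2) = max(x1, x2) = x2 + ReLU(x1 - x2), (i) linear regions identified by ReLU activation patterns, (ii) distinct affine pieces, and (iii) affine pieces up to permutation of the input coordinates. The last quotient is only a crude stand-in for the equivalence relation proposed in the paper, whose precise definition is given there; the network, grid, and grouping rule are all illustrative choices.

```python
# Illustrative only: f(x1, x2) = max(x1, x2), written as x2 + relu(x1 - x2).
# We count activation patterns (linear regions), distinct affine pieces,
# and affine pieces quotiented by permutations of the input coordinates.
# The quotient is a stand-in for the paper's equivalence relation, not its
# exact definition.
import itertools
import numpy as np

def piece(x):
    """Return (activation pattern, gradient, bias) of the affine piece at x."""
    active = (x[0] - x[1]) > 0          # the single ReLU: relu(x1 - x2)
    grad = (1.0, 0.0) if active else (0.0, 1.0)
    bias = 0.0
    return (active,), grad, bias

grid = [(a, b) for a in np.linspace(-2, 2, 41) for b in np.linspace(-2, 2, 41)]
patterns, affine_pieces = set(), set()
for x in grid:
    pat, grad, bias = piece(x)
    patterns.add(pat)
    affine_pieces.add((grad, bias))

# Quotient the affine pieces by the action of S_2 on the input coordinates.
perms = list(itertools.permutations(range(2)))
classes = {frozenset((tuple(grad[i] for i in p), bias) for p in perms)
           for grad, bias in affine_pieces}

print("linear regions (activation patterns):", len(patterns))      # 2
print("distinct affine pieces:", len(affine_pieces))                # 2
print("affine pieces up to input permutation:", len(classes))       # 1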
Related papers
- Spectral complexity of deep neural networks [2.099922236065961]
We use the angular power spectrum of the limiting field to characterize the complexity of the network architecture.
On this basis, we classify neural networks as low-disorder, sparse, or high-disorder.
We show how this classification highlights a number of distinct features for standard activation functions, and in particular, sparsity properties of ReLU networks.
arXiv Detail & Related papers (2024-05-15T17:55:05Z)
- An Adaptive Tangent Feature Perspective of Neural Networks [4.900298402690262]
We consider linear transformations of features, resulting in a joint optimization over parameters and transformations with a bilinear constraint.
Specializing to neural network structure, we gain insights into how the features and thus the kernel function change.
We verify our theoretical observations in the kernel alignment of real neural networks.
arXiv Detail & Related papers (2023-08-29T17:57:20Z)
- On the relationship between multivariate splines and infinitely-wide neural networks [0.0]
We show that the associated function space is a Sobolev space on a Euclidean ball, with an explicit bound on the norms of derivatives.
This random feature expansion is numerically better behaved than usual random Fourier features, both in theory and practice.
In particular, in one dimension, we use the associated leverage scores to compare the two random expansions and show a better scaling for the neural network expansion (a toy comparison sketch appears after this list).
arXiv Detail & Related papers (2023-02-07T13:29:06Z)
- On the Number of Regions of Piecewise Linear Neural Networks [16.78532039510369]
Many feedforward neural networks (NNs) generate continuous and piecewise-linear (CPWL) mappings.
The number of these so-called linear regions offers a natural metric to characterize the expressiveness of CPWL NNs.
We introduce a complementary framework to estimate the average number of linear regions produced by a CPWL NN (a sampling-based toy sketch appears after this list).
arXiv Detail & Related papers (2022-06-17T08:17:28Z)
- Exploring Linear Feature Disentanglement For Neural Networks [63.20827189693117]
Non-linear activation functions, e.g., Sigmoid, ReLU, and Tanh, have achieved great success in neural networks (NNs).
Because samples have complex non-linear characteristics, these activation functions aim to project samples from their original feature space into a linearly separable feature space.
This observation motivates us to explore whether all features need to be transformed by all non-linear functions in current typical NNs.
arXiv Detail & Related papers (2022-03-22T13:09:17Z)
- Deep neural networks for smooth approximation of physics with higher order and continuity B-spline base functions [0.4588028371034407]
Traditionally, neural networks employ non-linear activation functions to approximate a given physical phenomenon.
We present an alternative approach, where the physical quantity is approximated as a linear combination of smooth B-spline basis functions (a minimal fitting sketch appears after this list).
We show that our approach is cheaper and more accurate when approximating physical fields.
arXiv Detail & Related papers (2022-01-03T23:02:39Z)
- Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks.
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges that reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
We provide the first tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation functions.
We experimentally explore the training process of neural networks and detect overfitting.
We find that $L_1$ and $L_2$ regularization suppresses the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
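For the multivariate-splines entry above, a toy one-dimensional comparison between a random ReLU feature expansion and random Fourier features is sketched below. The target function, feature counts, sampling distributions, and ridge penalty are arbitrary choices made here for illustration; the leverage-score analysis in that paper is not reproduced.

```python
# Toy comparison, not that paper's analysis: fit a simple 1-D target with
# (a) random ReLU features relu(cos(theta)*x + sin(theta)) and
# (b) random Fourier features for a Gaussian kernel, both via ridge regression.
import numpy as np

rng = np.random.default_rng(1)
n, m, lam = 300, 200, 1e-3
x = np.linspace(-1, 1, n)[:, None]
y = np.abs(x[:, 0])                          # a simple non-smooth target

# Random ReLU features: directions (cos(theta), sin(theta)) on the unit circle.
theta = rng.uniform(0, 2 * np.pi, m)
relu_feats = np.maximum(x * np.cos(theta) + np.sin(theta), 0.0)

# Random Fourier features for a Gaussian kernel (bandwidth chosen arbitrarily).
w = rng.normal(0, 5.0, m)
b = rng.uniform(0, 2 * np.pi, m)
fourier_feats = np.sqrt(2.0 / m) * np.cos(x * w + b)

def ridge_fit_error(F, y, lam):
    """Ridge-regress y on features F and return the max absolute residual."""
    coef = np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ y)
    return np.max(np.abs(F @ coef - y))

print("ReLU features, max error:   ", ridge_fit_error(relu_feats, y, lam))
print("Fourier features, max error:", ridge_fit_error(fourier_feats, y, lam))
```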
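For the entry on counting regions of piecewise linear networks, the sketch below lower-bounds the number of linear regions of a small random ReLU network by counting distinct activation patterns over random inputs, then averages over several weight draws. The architecture, initialization, sampling box, and sample size are arbitrary; this is a generic Monte Carlo probe, not the estimator derived in that paper.

```python
# Empirical lower bound on the number of linear regions of a random ReLU net,
# obtained by counting distinct activation patterns over sampled inputs.
import numpy as np

rng = np.random.default_rng(0)

def count_regions(widths, n_samples=20000, box=3.0):
    """Count distinct activation patterns of one random net inside [-box, box]^d."""
    Ws = [rng.normal(0, np.sqrt(2.0 / widths[i]), size=(widths[i + 1], widths[i]))
          for i in range(len(widths) - 1)]
    bs = [rng.normal(0, 0.1, size=widths[i + 1]) for i in range(len(widths) - 1)]
    x = rng.uniform(-box, box, size=(n_samples, widths[0]))
    h, codes = x, []
    for W, b in zip(Ws, bs):
        pre = h @ W.T + b
        codes.append(pre > 0)                # activation pattern of this layer
        h = np.maximum(pre, 0.0)
    full_code = np.concatenate(codes, axis=1)
    return len({row.tobytes() for row in full_code})

# Average the empirical count over a few random networks of the same architecture.
counts = [count_regions([2, 8, 8]) for _ in range(5)]
print("mean empirical region count:", np.mean(counts))
```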
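For the B-spline entry, the sketch below fits a smooth one-dimensional field by least squares on a cubic B-spline basis using SciPy's make_lsq_spline; the target function and knot placement are arbitrary, and the coefficients come from a direct least-squares solve rather than the network training procedure described in that paper.

```python
# Approximate a smooth 1-D field as a linear combination of cubic B-splines.
import numpy as np
from scipy.interpolate import make_lsq_spline

x = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * x)                 # stand-in for a smooth physical field
k = 3                                      # cubic B-splines (C^2 continuity)
interior = np.linspace(0, 1, 9)[1:-1]      # 7 interior knots, evenly spaced
t = np.r_[[0.0] * (k + 1), interior, [1.0] * (k + 1)]

spline = make_lsq_spline(x, y, t, k=k)     # least-squares B-spline coefficients
print("max abs error:", np.max(np.abs(spline(x) - y)))
```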