Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice
Polytopes
- URL: http://arxiv.org/abs/2302.12553v1
- Date: Fri, 24 Feb 2023 10:14:53 GMT
- Title: Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice
Polytopes
- Authors: Christian Haase, Christoph Hertrich, Georg Loho
- Abstract summary: We show that $\lceil\log_2(n)\rceil$ hidden layers are indeed necessary to compute the maximum of $n$ numbers.
Our results are based on the known duality between neural networks and Newton polytopes via tropical geometry.
- Score: 3.0079490585515343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We prove that the set of functions representable by ReLU neural networks with
integer weights strictly increases with the network depth while allowing
arbitrary width. More precisely, we show that $\lceil\log_2(n)\rceil$ hidden
layers are indeed necessary to compute the maximum of $n$ numbers, matching
known upper bounds. Our results are based on the known duality between neural
networks and Newton polytopes via tropical geometry. The integrality assumption
implies that these Newton polytopes are lattice polytopes. Then, our depth
lower bounds follow from a parity argument on the normalized volume of faces of
such polytopes.
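The matching upper bound referenced in the abstract comes from the folklore construction max(a,b) = ReLU(a-b) + ReLU(b) - ReLU(-b), applied in a binary tournament over the n inputs. The snippet below is a minimal NumPy sketch of that construction (our own illustration, not code from the paper): every weight lies in {-1, 0, 1}, and the tournament stacks exactly ceil(log2(n)) layers of ReLUs.

```python
# Minimal sketch of the known upper bound: max of n numbers with
# integer-weight ReLU layers arranged as a binary tournament.
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def pairwise_max(a, b):
    # max(a, b) = ReLU(a - b) + ReLU(b) - ReLU(-b): one hidden layer,
    # all weights in {-1, 0, 1}.
    return relu(a - b) + relu(b) - relu(-b)


def relu_max(values):
    # One tournament round per hidden layer; ceil(log2(n)) rounds overall.
    layer = list(values)
    while len(layer) > 1:
        nxt = [pairwise_max(layer[i], layer[i + 1])
               for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2 == 1:
            # Odd element is passed through; in a strict ReLU layer this is
            # ReLU(x) - ReLU(-x), still with integer weights.
            nxt.append(layer[-1])
        layer = nxt
    return layer[0]


print(relu_max([3.0, -1.5, 7.0, 2.0]))  # 7.0, using ceil(log2(4)) = 2 rounds
```

The paper's contribution is the converse: no integer-weight ReLU network with fewer than ceil(log2(n)) hidden layers can compute this function exactly, however wide it is.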
Related papers
- Sample Complexity of Neural Policy Mirror Descent for Policy
Optimization on Low-Dimensional Manifolds [75.51968172401394]
We study the sample complexity of the neural policy mirror descent (NPMD) algorithm with deep convolutional neural networks (CNNs).
In each iteration of NPMD, both the value function and the policy can be well approximated by CNNs.
We show that NPMD can leverage the low-dimensional structure of state space to escape from the curse of dimensionality.
arXiv Detail & Related papers (2023-09-25T07:31:22Z) - How Many Neurons Does it Take to Approximate the Maximum? [10.995895410470279]
We study the size of a neural network needed to approximate the maximum function over $d$ inputs.
We provide new lower and upper bounds on the width required for approximation across various depths.
arXiv Detail & Related papers (2023-07-18T12:47:35Z) - Neural Polytopes [0.0]
We find that simple neural networks with ReLU activation generate polytopes as an approximation of a unit sphere in various dimensions.
For a variety of activation functions, generalization of polytopes is obtained, which we call neural polytopes.
arXiv Detail & Related papers (2023-07-03T03:00:22Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - Deep ReLU Networks Have Surprisingly Simple Polytopes [22.969361799006734]
A ReLU network is a piecewise linear function over polytopes.
We study the shapes of polytopes via the number of simplices obtained by triangulating the polytope.
By characterizing the shape of polytopes, the number of simplices can be leveraged for other problems; a toy example of counting simplices in a triangulation is sketched after this list.
arXiv Detail & Related papers (2023-05-16T03:51:34Z) - On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Towards Lower Bounds on the Depth of ReLU Neural Networks [7.355977594790584]
We investigate whether the class of exactly representable functions strictly increases by adding more layers.
We settle an old conjecture about piecewise linear functions by Wang and Sun (2005) in the affirmative.
We present upper bounds on the sizes of neural networks required to represent functions with logarithmic depth.
arXiv Detail & Related papers (2021-05-31T09:49:14Z) - Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for
Deep ReLU Networks [21.13299067136635]
We provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU networks.
In the finite-width setting, the network architectures we consider are quite general.
arXiv Detail & Related papers (2020-12-21T19:32:17Z) - Bayesian Deep Ensembles via the Neural Tangent Kernel [49.569912265882124]
We explore the link between deep ensembles and Gaussian processes (GPs) through the lens of the Neural Tangent Kernel (NTK).
We introduce a simple modification to standard deep ensembles training, through addition of a computationally-tractable, randomised and untrainable function to each ensemble member.
We prove that our Bayesian deep ensembles make more conservative predictions than standard deep ensembles in the infinite width limit.
arXiv Detail & Related papers (2020-07-11T22:10:52Z) - Revealing the Structure of Deep Neural Networks via Convex Duality [70.15611146583068]
We study regularized deep neural networks (DNNs) and introduce a convex analytic framework to characterize the structure of hidden layers.
We show that a set of optimal hidden layer weights for a norm regularized training problem can be explicitly found as the extreme points of a convex set.
We apply the same characterization to deep ReLU networks with whitened data and prove the same weight alignment holds.
arXiv Detail & Related papers (2020-02-22T21:13:44Z)
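As noted above, here is a toy illustration (our own, making no assumptions about the setup of "Deep ReLU Networks Have Surprisingly Simple Polytopes") of the quantity that paper studies: the number of simplices in a triangulation of a polytope, computed here for the convex hull of a random 3-D point cloud via SciPy's Delaunay triangulation.

```python
# Count the simplices (tetrahedra) in a triangulation of a 3-D polytope.
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)
points = rng.standard_normal((30, 3))  # random 3-D point cloud
tri = Delaunay(points)                  # triangulates the convex hull of the points
print(f"{len(tri.simplices)} tetrahedra in this triangulation")
```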
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.