Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice
Polytopes
- URL: http://arxiv.org/abs/2302.12553v1
- Date: Fri, 24 Feb 2023 10:14:53 GMT
- Title: Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice
Polytopes
- Authors: Christian Haase, Christoph Hertrich, Georg Loho
- Abstract summary: We show that $\lceil\log_2(n)\rceil$ hidden layers are indeed necessary to compute the maximum of $n$ numbers.
Our results are based on the known duality between neural networks and Newton polytopes via tropical geometry.
- Score: 3.0079490585515343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We prove that the set of functions representable by ReLU neural networks with
integer weights strictly increases with the network depth while allowing
arbitrary width. More precisely, we show that $\lceil\log_2(n)\rceil$ hidden
layers are indeed necessary to compute the maximum of $n$ numbers, matching
known upper bounds. Our results are based on the known duality between neural
networks and Newton polytopes via tropical geometry. The integrality assumption
implies that these Newton polytopes are lattice polytopes. Then, our depth
lower bounds follow from a parity argument on the normalized volume of faces of
such polytopes.
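The matching upper bound referenced in the abstract comes from the folklore construction max(a,b) = ReLU(a-b) + ReLU(b) - ReLU(-b), applied in a binary tournament over the n inputs. The snippet below is a minimal NumPy sketch of that construction (our own illustration, not code from the paper): every weight lies in {-1, 0, 1}, and the tournament stacks exactly ceil(log2(n)) layers of ReLUs.

```python
# Minimal sketch of the known upper bound: max of n numbers with
# integer-weight ReLU layers arranged as a binary tournament.
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def pairwise_max(a, b):
    # max(a, b) = ReLU(a - b) + ReLU(b) - ReLU(-b): one hidden layer,
    # all weights in {-1, 0, 1}.
    return relu(a - b) + relu(b) - relu(-b)


def relu_max(values):
    # One tournament round per hidden layer; ceil(log2(n)) rounds overall.
    layer = list(values)
    while len(layer) > 1:
        nxt = [pairwise_max(layer[i], layer[i + 1])
               for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2 == 1:
            # Odd element is passed through; in a strict ReLU layer this is
            # ReLU(x) - ReLU(-x), still with integer weights.
            nxt.append(layer[-1])
        layer = nxt
    return layer[0]


print(relu_max([3.0, -1.5, 7.0, 2.0]))  # 7.0, using ceil(log2(4)) = 2 rounds
```

The paper's contribution is the converse: no integer-weight ReLU network with fewer than ceil(log2(n)) hidden layers can compute this function exactly, however wide it is.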
Related papers
- Sample Complexity of Neural Policy Mirror Descent for Policy
Optimization on Low-Dimensional Manifolds [75.51968172401394]
We study the sample complexity of the neural policy mirror descent (NPMD) algorithm with deep convolutional neural networks (CNNs).
In each iteration of NPMD, both the value function and the policy can be well approximated by CNNs.
We show that NPMD can leverage the low-dimensional structure of state space to escape from the curse of dimensionality.
arXiv Detail & Related papers (2023-09-25T07:31:22Z) - How Many Neurons Does it Take to Approximate the Maximum? [10.995895410470279]
We study the size of a neural network needed to approximate the maximum function over $d$ inputs.
We provide new lower and upper bounds on the width required for approximation across various depths.
arXiv Detail & Related papers (2023-07-18T12:47:35Z) - Neural Polytopes [0.0]
We find that simple neural networks with ReLU activation generate polytopes as an approximation of a unit sphere in various dimensions.
For a variety of activation functions, generalization of polytopes is obtained, which we call neural polytopes.
arXiv Detail & Related papers (2023-07-03T03:00:22Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - Deep ReLU Networks Have Surprisingly Simple Polytopes [22.969361799006734]
A ReLU network is a piecewise linear function over polytopes.
We study the shapes of polytopes via the number of simplices obtained by triangulating the polytope.
By characterizing the shape of polytopes, the number of simplices can be leveraged for other problems; a toy example of counting simplices in a triangulation is sketched after this list.
arXiv Detail & Related papers (2023-05-16T03:51:34Z) - On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Towards Lower Bounds on the Depth of ReLU Neural Networks [7.355977594790584]
We investigate whether the class of exactly representable functions strictly increases by adding more layers.
We settle an old conjecture about piecewise linear functions by Wang and Sun (2005) in the affirmative.
We present upper bounds on the sizes of neural networks required to represent functions with logarithmic depth.
arXiv Detail & Related papers (2021-05-31T09:49:14Z) - Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for
Deep ReLU Networks [21.13299067136635]
We provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU networks.
In the finite-width setting, the network architectures we consider are quite general.
arXiv Detail & Related papers (2020-12-21T19:32:17Z) - Bayesian Deep Ensembles via the Neural Tangent Kernel [49.569912265882124]
We explore the link between deep ensembles and Gaussian processes (GPs) through the lens of the Neural Tangent Kernel (NTK).
We introduce a simple modification to standard deep ensembles training, through addition of a computationally-tractable, randomised and untrainable function to each ensemble member.
We prove that our Bayesian deep ensembles make more conservative predictions than standard deep ensembles in the infinite width limit.
arXiv Detail & Related papers (2020-07-11T22:10:52Z) - Revealing the Structure of Deep Neural Networks via Convex Duality [70.15611146583068]
We study regularized deep neural networks (DNNs) and introduce a convex analytic framework to characterize the structure of hidden layers.
We show that a set of optimal hidden layer weights for a norm regularized training problem can be explicitly found as the extreme points of a convex set.
We apply the same characterization to deep ReLU networks with whitened data and prove the same weight alignment holds.
arXiv Detail & Related papers (2020-02-22T21:13:44Z)
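As noted above, here is a toy illustration (our own, making no assumptions about the setup of "Deep ReLU Networks Have Surprisingly Simple Polytopes") of the quantity that paper studies: the number of simplices in a triangulation of a polytope, computed here for the convex hull of a random 3-D point cloud via SciPy's Delaunay triangulation.

```python
# Count the simplices (tetrahedra) in a triangulation of a 3-D polytope.
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)
points = rng.standard_normal((30, 3))  # random 3-D point cloud
tri = Delaunay(points)                  # triangulates the convex hull of the points
print(f"{len(tri.simplices)} tetrahedra in this triangulation")
```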
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.