A simple geometric proof for the benefit of depth in ReLU networks
- URL: http://arxiv.org/abs/2101.07126v1
- Date: Mon, 18 Jan 2021 15:40:27 GMT
- Title: A simple geometric proof for the benefit of depth in ReLU networks
- Authors: Asaf Amrami and Yoav Goldberg
- Abstract summary: We present a simple proof for the benefit of depth in multi-layer feedforward networks with rectified activation ("depth separation").
We present a concrete neural network with linear depth (in $m$) and small constant width ($\leq 4$) that classifies the problem with zero error.
- Score: 57.815699322370826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a simple proof for the benefit of depth in multi-layer feedforward
networks with rectified activation ("depth separation"). Specifically, we present
a sequence of classification problems indexed by $m$ such that (a) for any
fixed-depth rectified network there exists an $m$ above which classifying
problem $m$ correctly requires an exponential number of parameters (in $m$); and
(b) for any problem in the sequence, we present a concrete neural network with
linear depth (in $m$) and small constant width ($\leq 4$) that classifies the
problem with zero error.
The constructive proof is based on geometric arguments and a space folding
construction.
While stronger bounds and results exist, our proof uses substantially simpler
tools and techniques, and should be accessible to undergraduate students in
computer science and people with similar backgrounds.
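To make the folding idea concrete, here is a minimal numpy sketch of the standard one-dimensional folding trick, assuming the usual tent map and a threshold at $1/2$; it illustrates the flavour of the construction rather than the paper's exact network.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def fold(x):
    # One "fold" of [0, 1] onto itself via the tent map, written with two
    # ReLU units: T(x) = 2*relu(x) - 4*relu(x - 1/2).
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def deep_folding_net(x, depth):
    # Compose the fold `depth` times: a width-2, depth-`depth` ReLU network.
    for _ in range(depth):
        x = fold(x)
    return x

# After m folds the output has 2^m monotone pieces on [0, 1], so thresholding
# it at 1/2 labels exponentially many alternating intervals with only O(m)
# parameters, which is the kind of depth-vs-size gap the paper's proof exhibits.
m = 6
xs = np.linspace(0.0, 1.0, 10_001)
ys = deep_folding_net(xs, m)
crossings = int(np.sum(np.diff(ys > 0.5) != 0))
print(f"depth {m}: {crossings} crossings of the 1/2 level (expect {2 ** m})")
```

Intuitively, the matching lower bound comes from the fact that a fixed-depth ReLU network can only produce polynomially many linear pieces in its parameter count, so reproducing the $2^m$ alternations forces an exponential blow-up in size.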
Related papers
- Deep Neural Networks: Multi-Classification and Universal Approximation [0.0]
We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements.
We also provide depth estimates for approximating $W^{1,p}$ functions and width estimates for approximating $L^p(\Omega;\mathbb{R}^m)$ for $m\geq 1$.
arXiv Detail & Related papers (2024-09-10T14:31:21Z) - Implicit Hypersurface Approximation Capacity in Deep ReLU Networks [0.0]
We develop a geometric approximation theory for deep feed-forward neural networks with ReLU activations.
We show that a deep fully-connected ReLU network of width $d+1$ can implicitly construct an approximation as its zero contour.
arXiv Detail & Related papers (2024-07-04T11:34:42Z) - Bayesian Inference with Deep Weakly Nonlinear Networks [57.95116787699412]
We show at a physics level of rigor that Bayesian inference with a fully connected neural network is solvable.
We provide techniques to compute the model evidence and posterior to arbitrary order in $1/N$ and at arbitrary temperature.
arXiv Detail & Related papers (2024-05-26T17:08:04Z) - Fundamental computational limits of weak learnability in high-dimensional multi-index models [30.501140910531017]
This paper focuses on the minimum sample complexity required for weakly recovering their low-dimensional structure with first-order iterative algorithms.
Our findings unfold in three parts: (i) we identify under which conditions a trivial subspace can be learned with a single step of a first-order algorithm for any $\alpha > 0$; (ii) if the trivial subspace is empty, we provide necessary and sufficient conditions for the existence of an easy subspace.
In a limited but interesting set of really hard directions -- akin to the parity problem -- $\alpha_c$ is found
arXiv Detail & Related papers (2024-05-24T11:59:02Z) - Rosenblatt's first theorem and frugality of deep learning [0.0]
Rosenblatt's theorem about omnipotence of shallow networks states that elementary perceptrons can solve any classification problem if there are no discrepancies in the training set.
Minsky and Papert considered elementary perceptrons with restrictions on the neural inputs: a bounded number of connections or a relatively small diameter of the receptive field for each neuron in the hidden layer.
In this note, we demonstrate Rosenblatt's first theorem at work, show how an elementary perceptron can solve a version of the travel maze problem, and analyse the complexity of that solution.
arXiv Detail & Related papers (2022-08-29T09:44:27Z) - A singular Riemannian geometry approach to Deep Neural Networks II.
Reconstruction of 1-D equivalence classes [78.120734120667]
We build the preimage of a point in the output manifold in the input space.
We focus for simplicity on the case of neural network maps from $n$-dimensional real space to $(n-1)$-dimensional real space.
arXiv Detail & Related papers (2021-12-17T11:47:45Z) - Path Regularization: A Convexity and Sparsity Inducing Regularization
for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape.
We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
arXiv Detail & Related papers (2021-10-18T18:00:36Z) - Exploring the Common Principal Subspace of Deep Features in Neural
Networks [50.37178960258464]
We find that different Deep Neural Networks (DNNs) trained with the same dataset share a common principal subspace in latent spaces.
Specifically, we design a new metric $\mathcal{P}$-vector to represent the principal subspace of deep features learned in a DNN.
Small angles (with cosine close to $1.0$) have been found in the comparisons between any two DNNs trained with different algorithms/architectures.
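As a rough, hypothetical illustration of comparing principal subspaces: the synthetic features below stand in for real DNN activations, and the plain cosine between leading PCA directions stands in for the paper's $\mathcal{P}$-vector metric.

```python
import numpy as np

rng = np.random.default_rng(1)

def top_principal_direction(features):
    # Leading right-singular vector of the centred feature matrix,
    # i.e. the first PCA direction of the deep features.
    centred = features - features.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[0]

# Hypothetical stand-ins for penultimate-layer features of two networks
# trained on the same data: a shared signal plus network-specific noise.
n, d = 1000, 64
shared = rng.normal(size=(n, 1)) @ rng.normal(size=(1, d))
feats_a = shared + 0.1 * rng.normal(size=(n, d))
feats_b = shared + 0.1 * rng.normal(size=(n, d))

va, vb = top_principal_direction(feats_a), top_principal_direction(feats_b)
cosine = abs(va @ vb)  # cosine of the angle between the two principal directions
print(f"cosine between principal directions: {cosine:.3f}")
```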
arXiv Detail & Related papers (2021-10-06T15:48:32Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
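A small numpy sketch in the spirit of that statement; the ring-shaped data, the width, and the bias range are illustrative assumptions rather than the paper's setting. An untrained random ReLU layer lifts the two classes into a space where a linear readout separates them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes that are not linearly separable in 2-D: concentric rings,
# a hypothetical stand-in for the "mutually separated" data the paper considers.
def make_rings(n):
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    inner = np.c_[1.0 * np.cos(theta), 1.0 * np.sin(theta)]
    outer = np.c_[3.0 * np.cos(theta), 3.0 * np.sin(theta)]
    X = np.vstack([inner, outer]) + 0.1 * rng.normal(size=(2 * n, 2))
    y = np.r_[-np.ones(n), np.ones(n)]
    return X, y

X, y = make_rings(200)

# Random two-layer ReLU features: standard Gaussian weights, uniform biases.
width = 500
W = rng.normal(size=(2, width))
b = rng.uniform(-3.0, 3.0, size=width)
Z = np.maximum(X @ W + b, 0.0)

# A least-squares linear readout on the random features; a (near) zero-residual
# fit of the +/-1 labels already witnesses a separating hyperplane in feature space.
A = np.c_[Z, np.ones(len(Z))]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = np.sign(A @ coef)
print("training accuracy with a linear readout:", np.mean(pred == y))
```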
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Topology of deep neural networks [8.946655323517092]
We study how the topology of a data set $M = M_a \cup M_b \subseteq \mathbb{R}^d$ changes as it passes through the layers of a well-trained neural network.
arXiv Detail & Related papers (2020-04-13T17:53:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.