Related papers: Memory capacity of three-layer neural networks with non-polynomial activations

URL: http://arxiv.org/abs/2405.13738v1
Date: Wed, 22 May 2024 15:29:45 GMT
Title: Memory capacity of three-layer neural networks with non-polynomial activations
Authors: Liam Madden
Abstract summary: We show that $\Theta(\sqrt{n})$ neurons are sufficient as long as the activation function is real analytic at a point and not a polynomial there. This means that activation functions can be freely chosen in a problem-dependent manner without loss of interpolation power.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The minimal number of neurons required for a feedforward neural network to interpolate $n$ generic input-output pairs from $\mathbb{R}^d\times \mathbb{R}$ is $\Theta(\sqrt{n})$. While previous results have shown that $\Theta(\sqrt{n})$ neurons are sufficient, they have been limited to logistic, Heaviside, and rectified linear unit (ReLU) as the activation function. Using a different approach, we prove that $\Theta(\sqrt{n})$ neurons are sufficient as long as the activation function is real analytic at a point and not a polynomial there. Thus, the only practical activation functions that our result does not apply to are piecewise polynomials. Importantly, this means that activation functions can be freely chosen in a problem-dependent manner without loss of interpolation power.
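A parameter count gives some intuition for the $\Theta(\sqrt{n})$ scale: fitting $n$ generic targets needs at least $n$ free parameters, and a network with two hidden layers of width $m$ has on the order of $m^2$ of them. The sketch below is illustrative only and does not reproduce the paper's proof. It assumes that "three-layer" means two hidden layers, here of equal width, and the function names are hypothetical.

```python
import math


def three_layer_param_count(d: int, m1: int, m2: int) -> int:
    """Count weights and biases of a d -> m1 -> m2 -> 1 feedforward network."""
    first = d * m1 + m1      # input-to-hidden weights plus biases
    second = m1 * m2 + m2    # hidden-to-hidden weights plus biases
    output = m2 + 1          # hidden-to-output weights plus bias
    return first + second + output


def smallest_square_width(n: int, d: int) -> int:
    """Smallest m such that a d -> m -> m -> 1 network has at least n parameters."""
    m = 1
    while three_layer_param_count(d, m, m) < n:
        m += 1
    return m


# Assumed input dimension d = 10; the last column is (total hidden neurons) / sqrt(n).
for n in (100, 10_000, 1_000_000):
    m = smallest_square_width(n, d=10)
    print(n, m, 2 * m, round(2 * m / math.sqrt(n), 2))
```

As $n$ grows with $d$ fixed, the ratio in the last column approaches 2, matching the square-root scaling of the neuron count in this simplified setting.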

Related papers

Linear Independence of Generalized Neurons and Related Functions [0.0]
Linear independence of neurons plays a significant role in theoretical analysis of neural networks. We study the problem for neurons with arbitrary layers and widths, giving a simple but complete characterization for generic analytic activation functions.
arXiv Detail & Related papers (2024-09-22T21:09:15Z)
Optimal Neural Network Approximation for High-Dimensional Continuous Functions [5.748690310135373]
We present a family of continuous functions that requires at least width $d$, and therefore at least $d$ intrinsic neurons, to achieve arbitrary accuracy in its approximation. This shows that the requirement of $\mathcal{O}(d)$ intrinsic neurons is optimal in the sense that it grows linearly with the input dimension $d$.
arXiv Detail & Related papers (2024-09-04T01:18:55Z)
Optimal approximation using complex-valued neural networks [0.0]
Complex-valued neural networks (CVNNs) have recently shown promising empirical success. We analyze the expressivity of CVNNs by studying their approximation properties.
arXiv Detail & Related papers (2023-03-29T15:56:43Z)
Shallow neural network representation of polynomials [91.3755431537592]
We show that $d$-variate polynomials of degree $R$ can be represented on $[0,1]^d$ as shallow neural networks of width $d+1+\sum_{r=2}^{R}\binom{r+d-1}{d-1}[\binom{r+d-1}{d-1}\ldots$
arXiv Detail & Related papers (2022-08-17T08:14:52Z)
Learning a Single Neuron for Non-monotonic Activation Functions [3.890410443467757]
Non-monotonic activation functions outperform the traditional monotonic ones in many applications. We show that mild conditions on $\sigma$ are sufficient to guarantee the learnability in samples time. We also discuss how our positive results are related to existing negative results on training two-layer neural networks.
arXiv Detail & Related papers (2022-02-16T13:44:25Z)
Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions. We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z)
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning. In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the function. Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z)
Interval Universal Approximation for Neural Networks [47.767793120249095]
We introduce the interval universal approximation (IUA) theorem. IUA shows that neural networks can approximate any continuous function $f$ as we have known for decades. We study the computational complexity of constructing neural networks that are amenable to precise interval analysis.
arXiv Detail & Related papers (2020-07-12T20:43:56Z)
Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK [58.5766737343951]
We consider the dynamics of gradient descent for learning a two-layer neural network. We show that an over-parametrized two-layer neural network can provably learn with gradient loss at most ground with Tangent samples.
arXiv Detail & Related papers (2020-07-09T07:09:28Z)
Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$-Nets are a new class of function approximators based on polynomial expansions. $\Pi$-Nets produce state-of-the-art results in three challenging tasks, i.e. image generation, face verification and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z)
Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy. We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.