Activation function dependence of the storage capacity of treelike
neural networks
- URL: http://arxiv.org/abs/2007.11136v3
- Date: Thu, 4 Feb 2021 19:06:18 GMT
- Title: Activation function dependence of the storage capacity of treelike
neural networks
- Authors: Jacob A. Zavatone-Veth and Cengiz Pehlevan
- Abstract summary: A wide variety of nonlinear activation functions have been proposed for use in artificial neural networks.
We study how activation functions affect the storage capacity of treelike two-layer networks.
- Score: 16.244541005112747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The expressive power of artificial neural networks crucially depends on the
nonlinearity of their activation functions. Though a wide variety of nonlinear
activation functions have been proposed for use in artificial neural networks,
a detailed understanding of their role in determining the expressive power of a
network has not emerged. Here, we study how activation functions affect the
storage capacity of treelike two-layer networks. We relate the boundedness or
divergence of the capacity in the infinite-width limit to the smoothness of the
activation function, elucidating the relationship between previously studied
special cases. Our results show that nonlinearity can both increase capacity
and decrease the robustness of classification, and provide simple estimates for
the capacity of networks with several commonly used activation functions.
Furthermore, they generate a hypothesis for the functional benefit of dendritic
spikes in branched neurons.
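To make the setting concrete, here is a minimal sketch of a treelike two-layer network of the kind the abstract describes: each hidden unit sees its own disjoint block of the input, and only the first-layer weights are free. The sign output, the 1/sqrt(n) scaling, and the unit second-layer weights are conventional committee-machine assumptions, not details taken from the paper.

```python
import numpy as np

def tree_committee_output(x, W, g):
    """Treelike two-layer network: K hidden units, each wired to its own
    disjoint block of the input (non-overlapping receptive fields).
    x, W : arrays of shape (K, n) -- K branches of n inputs each
    g    : elementwise activation applied to each branch's preactivation
    The output is the sign of the summed branch activations."""
    n = x.shape[1]
    preact = np.einsum("ki,ki->k", W, x) / np.sqrt(n)  # one preactivation per branch
    return np.sign(np.sum(g(preact)))

rng = np.random.default_rng(0)
K, n = 8, 100                      # 8 branches, 100 inputs per branch
W = rng.standard_normal((K, n))    # first-layer weights (the trainable part)
x = rng.standard_normal((K, n))    # one random input pattern
print(tree_committee_output(x, W, np.tanh))                     # smooth, bounded g
print(tree_committee_output(x, W, lambda h: np.maximum(h, 0)))  # ReLU g
```

Swapping the function g is exactly the knob the paper studies: the storage capacity of this architecture depends on the smoothness and boundedness of g.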
Related papers
- A More Accurate Approximation of Activation Function with Few Spikes Neurons [6.306126887439676]
Spiking neural networks (SNNs) have attracted considerable attention as energy-efficient neural networks.
Conventional spiking neurons, such as leaky integrate-and-fire neurons, cannot accurately represent complex non-linear activation functions.
arXiv Detail & Related papers (2024-08-19T02:08:56Z) - Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z) - STL: A Signed and Truncated Logarithm Activation Function for Neural
Networks [5.9622541907827875]
Activation functions play an essential role in neural networks.
We present a novel signed and truncated logarithm function as an activation function.
The suggested activation function can be applied in a wide range of neural networks.
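The summary does not give the functional form; one plausible reading of "signed and truncated logarithm" is an odd-symmetric logarithm clipped at a cap, sketched below. This is a guess at the general shape implied by the name, not the paper's definition.

```python
import numpy as np

def stl(x, cap=10.0):
    """Hypothetical 'signed and truncated logarithm' activation:
    odd-symmetric, slowly growing, and clipped at +/- cap.
    A guess at the shape implied by the name; the paper's exact
    definition may differ."""
    return np.clip(np.sign(x) * np.log1p(np.abs(x)), -cap, cap)

print(stl(np.array([-100.0, -1.0, 0.0, 1.0, 100.0])))
```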
arXiv Detail & Related papers (2023-07-31T03:41:14Z) - Approximating nonlinear functions with latent boundaries in low-rank
excitatory-inhibitory spiking networks [5.955727366271805]
We put forth a new framework for excitatory-inhibitory spiking networks.
Our work proposes a new perspective on spiking networks that may serve as a starting point for a mechanistic understanding of biological spike-based computation.
arXiv Detail & Related papers (2023-07-18T15:17:00Z) - Points of non-linearity of functions generated by random neural networks [0.0]
We consider functions from the real numbers to the real numbers, output by a neural network with one hidden layer, arbitrary width, and ReLU activation function.
We compute the expected distribution of the points of non-linearity.
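For intuition: with one hidden ReLU layer and scalar input, the network is piecewise linear and can fail to be differentiable only where a unit's preactivation crosses zero. The sketch below lists those candidate points for a random network; the paper goes further and derives their expected distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
width = 16
w = rng.standard_normal(width)   # hidden-layer weights (scalar input)
b = rng.standard_normal(width)   # hidden-layer biases

# f(x) = sum_i v_i * relu(w_i * x + b_i) + c is piecewise linear in x;
# kinks can occur only where some preactivation w_i * x + b_i hits zero.
mask = w != 0
kinks = -b[mask] / w[mask]       # candidate points of non-linearity
print(np.sort(kinks))
```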
arXiv Detail & Related papers (2023-04-19T17:40:19Z) - Exploring Linear Feature Disentanglement For Neural Networks [63.20827189693117]
Non-linear activation functions, e.g., Sigmoid, ReLU, and Tanh, have achieved great success in neural networks (NNs).
Due to the complex non-linear characteristics of samples, the objective of these activation functions is to project samples from their original feature space to a linearly separable feature space.
This motivates us to explore whether all features need to be transformed by all non-linear functions in current typical NNs.
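As a toy illustration of that "project to a linearly separable space" idea (not an example from the paper): XOR labels are not linearly separable in the raw 2-D inputs, but one hand-crafted nonlinear feature separates them with a threshold.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])       # XOR: not linearly separable in X

# A single nonlinear feature of the inputs...
feature = X[:, 0] * X[:, 1] - 0.5 * (X[:, 0] + X[:, 1])
print(feature)                   # [ 0.  -0.5 -0.5  0. ]

# ...makes the classes separable by a simple threshold.
print(np.array_equal((feature < -0.25).astype(int), y))  # True
```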
arXiv Detail & Related papers (2022-03-22T13:09:17Z) - Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z) - Adaptive Rational Activations to Boost Deep Reinforcement Learning [68.10769262901003]
We motivate why rational functions are suitable as adaptable activation functions and why their inclusion in neural networks is crucial.
We demonstrate that equipping popular algorithms with (recurrent-)rational activations leads to consistent improvements on Atari games.
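For reference, a learnable rational activation has the form R(x) = P(x)/Q(x) with trainable polynomial coefficients. The sketch below uses one common pole-free parameterization of the denominator; the paper's exact parameterization and the example coefficients here are assumptions.

```python
import numpy as np

def rational_activation(x, a, b):
    """Rational activation R(x) = P(x) / Q(x) with learnable coefficients.
    P(x) = a[0] + a[1] x + ... + a[m] x^m
    Q(x) = 1 + |b[0] x + ... + b[n-1] x^n|
    The absolute value keeps Q positive, so R has no poles -- one
    common safe parameterization, possibly not the paper's."""
    P = np.polyval(a[::-1], x)                                # a: low-to-high order
    Q = 1.0 + np.abs(np.polyval(np.append(b[::-1], 0.0), x))  # no constant term in b
    return P / Q

x = np.linspace(-3.0, 3.0, 7)
a = np.array([0.0, 1.0, 0.0, 0.1])   # illustrative numerator coefficients
b = np.array([0.5, 0.2])             # illustrative denominator coefficients
print(rational_activation(x, a, b))
```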
arXiv Detail & Related papers (2021-02-18T14:53:12Z) - And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
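A toy illustration of the OR-like/AND-like distinction (the paper's actual definitions and measures are more involved): with equal weights on binary inputs, the bias alone decides whether a threshold unit fires for any active input (OR-like) or only when all inputs are active (AND-like).

```python
import numpy as np

def fires(x, w, b):
    """Simple threshold unit: 1 if w . x + b > 0, else 0."""
    return int(np.dot(w, x) + b > 0)

inputs = [np.array(p, dtype=float) for p in [(0, 0), (0, 1), (1, 0), (1, 1)]]
w = np.ones(2)
print([fires(x, w, b=-0.5) for x in inputs])  # OR-like:  [0, 1, 1, 1]
print([fires(x, w, b=-1.5) for x in inputs])  # AND-like: [0, 0, 0, 1]
```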
arXiv Detail & Related papers (2021-02-15T08:19:05Z) - The Connection Between Approximation, Depth Separation and Learnability
in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z) - Under the Hood of Neural Networks: Characterizing Learned
Representations by Functional Neuron Populations and Network Ablations [0.3441021278275805]
We shed light on the roles of single neurons and groups of neurons within the network fulfilling a learned task.
We find that neither a neuron's magnitude of activation, nor its selectivity, nor its impact on network performance is a sufficient stand-alone indicator.
arXiv Detail & Related papers (2020-04-02T20:45:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.