Rational neural networks
- URL: http://arxiv.org/abs/2004.01902v2
- Date: Wed, 30 Sep 2020 09:16:55 GMT
- Title: Rational neural networks
- Authors: Nicolas Boullé, Yuji Nakatsukasa, Alex Townsend
- Abstract summary: We consider neural networks with rational activation functions.
We prove that rational neural networks approximate smooth functions more efficiently than ReLU networks with exponentially smaller depth.
- Score: 3.4376560669160394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider neural networks with rational activation functions. The choice of
the nonlinear activation function in deep learning architectures is crucial and
heavily impacts the performance of a neural network. We establish optimal
bounds in terms of network complexity and prove that rational neural networks
approximate smooth functions more efficiently than ReLU networks with
exponentially smaller depth. The flexibility and smoothness of rational
activation functions make them an attractive alternative to ReLU, as we
demonstrate with numerical experiments.
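As an illustration of the idea (not the paper's exact implementation), a rational activation is the ratio of two polynomials, e.g. type (3, 2): a cubic numerator over a quadratic denominator. The coefficient values below are hypothetical placeholders; in the paper's setting they are trainable parameters, typically initialized so the function starts out close to ReLU.

```python
import numpy as np

# Hypothetical coefficients for a type-(3, 2) rational activation
# r(x) = P(x) / Q(x). In practice these would be trainable parameters.
P = np.array([0.02, 0.5, 1.0, 0.0])  # cubic numerator, highest degree first
Q = np.array([0.1, 0.0, 1.0])        # quadratic denominator, highest degree first

def rational(x, p=P, q=Q):
    """Elementwise rational activation r(x) = P(x) / Q(x)."""
    return np.polyval(p, x) / np.polyval(q, x)

x = np.linspace(-2.0, 2.0, 5)
y = rational(x)  # smooth and well-defined here, since Q(x) = 0.1*x**2 + 1 > 0
```

Because a rational function is smooth away from the poles of its denominator (and the denominator above is strictly positive), the activation is differentiable everywhere, unlike ReLU at the origin.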
Related papers
- Activations Through Extensions: A Framework To Boost Performance Of Neural Networks [6.302159507265204]
Activation functions are non-linearities in neural networks that allow them to learn complex mapping between inputs and outputs.
We propose a framework that unifies several works on activation functions and theoretically explains the performance benefits of these works.
arXiv Detail & Related papers (2024-08-07T07:36:49Z)
- Fractional Concepts in Neural Networks: Enhancing Activation and Loss Functions [0.7614628596146602]
The paper presents a method for using fractional concepts in a neural network to modify the activation and loss functions.
This enables neurons in the network to adjust their activation functions to better fit the input data and reduce output errors.
arXiv Detail & Related papers (2023-10-18T10:49:29Z)
- Expressivity of Spiking Neural Networks [15.181458163440634]
We study the capabilities of spiking neural networks where information is encoded in the firing time of neurons.
In contrast to ReLU networks, we prove that spiking neural networks can realize both continuous and discontinuous functions.
arXiv Detail & Related papers (2023-08-16T08:45:53Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z)
- Optimal Learning Rates of Deep Convolutional Neural Networks: Additive Ridge Functions [19.762318115851617]
We consider the mean squared error analysis for deep convolutional neural networks.
We show that, for additive ridge functions, convolutional neural networks followed by one fully connected layer with ReLU activation functions can reach optimal mini-max rates.
arXiv Detail & Related papers (2022-02-24T14:22:32Z)
- Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions [4.932130498861987]
We propose a new type of neural networks, Kronecker neural networks (KNNs), that form a general framework for neural networks with adaptive activation functions.
Under suitable conditions, KNNs induce a faster decay of the loss than standard feed-forward networks.
arXiv Detail & Related papers (2021-05-20T04:54:57Z)
- The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
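The adaptive-activation idea running through several of the entries above (e.g. the Kronecker neural network paper) can be sketched, under the simplifying assumption that the activation is a learnable weighted combination of fixed base nonlinearities; the weights and scalings below are illustrative placeholders, not the exact KNN construction.

```python
import numpy as np

def adaptive_activation(x, w=(0.7, 0.3), a=(1.0, 2.0)):
    """A toy adaptive activation: a weighted mix of base nonlinearities.

    The mixture weights w and scalings a would be trained alongside the
    network; here they are fixed to assumed values for illustration.
    """
    bases = (np.maximum(0.0, a[0] * x),  # scaled ReLU component
             np.tanh(a[1] * x))          # scaled tanh component
    return w[0] * bases[0] + w[1] * bases[1]

x = np.linspace(-1.0, 1.0, 5)
y = adaptive_activation(x)
```

Because the combination weights are themselves parameters, the network can adjust the shape of its nonlinearity during training rather than committing to a single fixed activation.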
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.