Review and Comparison of Commonly Used Activation Functions for Deep
Neural Networks
- URL: http://arxiv.org/abs/2010.09458v1
- Date: Thu, 15 Oct 2020 11:09:34 GMT
- Title: Review and Comparison of Commonly Used Activation Functions for Deep
Neural Networks
- Authors: Tomasz Szandała
- Abstract summary: It is critical to choose the most appropriate activation function for a neural network's computation.
This paper evaluates the commonly used activation functions, such as swish, ReLU, sigmoid, and so forth.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Activation functions are the primary decision-making units of neural
networks: they evaluate the output of each neural node and are therefore
essential to the performance of the whole network. Hence, it is critical to
choose the most appropriate activation function for a network's computation.
Acharya et al. (2018) note that numerous formulations have been proposed over
the years, though some of them are now considered deprecated because they fail
to operate properly under certain conditions. These functions have a variety of
characteristics deemed essential for successful learning, among them their
monotonicity, their derivatives, and the boundedness of their range (Bach 2017).
This paper evaluates the commonly used activation functions, such as swish,
ReLU, and sigmoid, presenting their properties, pros and cons, and
recommendations on when to apply each particular formula.
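To make the properties named above concrete, here is a minimal sketch (Python with NumPy, not code from the paper) of the three activation functions the abstract mentions, using their standard definitions, with comments noting monotonicity, differentiability, and range.

```python
import numpy as np

def sigmoid(x):
    # Sigmoid: monotone, smooth, output bounded in (0, 1);
    # derivative sigmoid(x) * (1 - sigmoid(x)) vanishes for large |x|.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # ReLU: monotone, unbounded above, zero for negative inputs;
    # not differentiable at x = 0 (a subgradient is used in practice).
    return np.maximum(0.0, x)

def swish(x, beta=1.0):
    # Swish: x * sigmoid(beta * x); smooth and unbounded above,
    # but non-monotone for negative inputs.
    return x * sigmoid(beta * x)

if __name__ == "__main__":
    xs = np.linspace(-5.0, 5.0, 11)
    for name, fn in [("sigmoid", sigmoid), ("relu", relu), ("swish", swish)]:
        print(name, np.round(fn(xs), 3))
```

Of the three, only sigmoid has a bounded range, and only swish is non-monotone, which illustrates why the abstract treats these characteristics as distinguishing features.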
Related papers
- Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features [115.33889811527533]
Diffusion models are initially designed for image generation.
Recent research shows that the internal signals within their backbones, named activations, can also serve as dense features for various discriminative tasks.
arXiv Detail & Related papers (2024-10-04T16:05:14Z)
- Activations Through Extensions: A Framework To Boost Performance Of Neural Networks [6.302159507265204]
Activation functions are non-linearities in neural networks that allow them to learn complex mappings between inputs and outputs.
We propose a framework that unifies several works on activation functions and theoretically explains the performance benefits of these works.
arXiv Detail & Related papers (2024-08-07T07:36:49Z)
- TSSR: A Truncated and Signed Square Root Activation Function for Neural Networks [5.9622541907827875]
We introduce a new activation function called the Truncated and Signed Square Root (TSSR) function.
This function is distinctive because it is odd, nonlinear, monotone and differentiable (a hedged sketch of an activation with these properties appears after this list).
It has the potential to improve the numerical stability of neural networks.
arXiv Detail & Related papers (2023-08-09T09:40:34Z)
- STL: A Signed and Truncated Logarithm Activation Function for Neural Networks [5.9622541907827875]
Activation functions play an essential role in neural networks.
We present a novel signed and truncated logarithm function as an activation function (see the sketch after this list).
The suggested activation function can be applied in a large range of neural networks.
arXiv Detail & Related papers (2023-07-31T03:41:14Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z)
- Activation Functions in Artificial Neural Networks: A Systematic Overview [0.3553493344868413]
Activation functions shape the outputs of artificial neurons.
This paper provides an analytic yet up-to-date overview of popular activation functions and their properties.
arXiv Detail & Related papers (2021-01-25T08:55:26Z)
- A Use of Even Activation Functions in Neural Networks [0.35172332086962865]
We propose an alternative approach to integrate existing knowledge or hypotheses of data structure by constructing custom activation functions.
We show that using an even activation function in one of the fully connected layers improves neural network performance.
arXiv Detail & Related papers (2020-11-23T20:33:13Z)
- Deep Polynomial Neural Networks [77.70761658507507]
Π-Nets are a new class of function approximators based on polynomial expansions.
Π-Nets produce state-of-the-art results in three challenging tasks, i.e. image generation, face verification and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z)
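The TSSR and STL entries above describe activations that are odd, nonlinear, monotone, and differentiable, but this listing does not reproduce their exact formulas. The sketch below (Python with NumPy) uses generic signed square-root and signed logarithm forms, sign(x)·(sqrt(|x|+1) − 1) and sign(x)·log(|x|+1), chosen only because they satisfy the stated properties; they are illustrative stand-ins, not the formulas from the TSSR or STL papers.

```python
import numpy as np

# Illustrative stand-ins only: odd, monotone, differentiable signed
# activations in the spirit of TSSR/STL; NOT the exact formulas from
# the cited papers.

def signed_sqrt(x):
    # sign(x) * (sqrt(|x| + 1) - 1): odd, monotone, zero at the origin,
    # and growing sub-linearly, which keeps large activations numerically tame.
    return np.sign(x) * (np.sqrt(np.abs(x) + 1.0) - 1.0)

def signed_log(x):
    # sign(x) * log(|x| + 1): also odd, monotone, and zero at the origin,
    # with even slower (logarithmic) growth for large |x|.
    return np.sign(x) * np.log(np.abs(x) + 1.0)

if __name__ == "__main__":
    xs = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
    print("signed_sqrt:", np.round(signed_sqrt(xs), 3))
    print("signed_log: ", np.round(signed_log(xs), 3))
```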