A Comprehensive Survey and Performance Analysis of Activation Functions
in Deep Learning
- URL: http://arxiv.org/abs/2109.14545v1
- Date: Wed, 29 Sep 2021 16:41:19 GMT
- Title: A Comprehensive Survey and Performance Analysis of Activation Functions
in Deep Learning
- Authors: Shiv Ram Dubey, Satish Kumar Singh, Bidyut Baran Chaudhuri
- Abstract summary: Various types of neural networks have been introduced to deal with different types of problems.
The main goal of any neural network is to transform the non-linearly separable input data into more linearly separable abstract features.
The most popular and common non-linearity layers are activation functions (AFs), such as Logistic Sigmoid, Tanh, ReLU, ELU, Swish and Mish.
- Score: 23.83339228535986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks have shown tremendous growth in recent years to solve
numerous problems. Various types of neural networks have been introduced to
deal with different types of problems. However, the main goal of any neural
network is to transform the non-linearly separable input data into more
linearly separable abstract features using a hierarchy of layers. These layers
are combinations of linear and nonlinear functions. The most popular and common
non-linearity layers are activation functions (AFs), such as Logistic Sigmoid,
Tanh, ReLU, ELU, Swish and Mish. In this paper, a comprehensive overview and
survey is presented for AFs in neural networks for deep learning. Different
classes of AFs such as Logistic Sigmoid and Tanh based, ReLU based, ELU based,
and Learning based are covered. Several characteristics of AFs such as output
range, monotonicity, and smoothness are also pointed out. A performance
comparison among 18 state-of-the-art AFs is carried out with different
networks on different types of data. Insights into AFs are presented to help
researchers pursue further work and practitioners choose among the available
options. The code used for the experimental comparison is released
at: \url{https://github.com/shivram1987/ActivationFunctions}.
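As a quick reference, below is a minimal NumPy sketch of the six AFs named in the abstract, with their output range, monotonicity, and smoothness noted in comments. This is an illustrative summary only, not the authors' released code (which is linked above); the approximate output ranges are the standard textbook values.

```python
import numpy as np

def sigmoid(x):            # range (0, 1); monotonic; smooth
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):               # range (-1, 1); monotonic; smooth
    return np.tanh(x)

def relu(x):               # range [0, inf); monotonic; non-smooth at 0
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):     # range (-alpha, inf); monotonic; smooth for alpha = 1
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def swish(x, beta=1.0):    # range ~(-0.28, inf); non-monotonic; smooth
    return x * sigmoid(beta * x)

def mish(x):               # range ~(-0.31, inf); non-monotonic; smooth
    return x * np.tanh(np.log1p(np.exp(x)))   # softplus(x) = log(1 + e^x)

if __name__ == "__main__":
    z = np.linspace(-3, 3, 7)
    for fn in (sigmoid, tanh, relu, elu, swish, mish):
        print(f"{fn.__name__:7s}", np.round(fn(z), 3))
```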
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
Our findings show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z) - Optimizing cnn-Bigru performance: Mish activation and comparative analysis with Relu [0.0]
Activation functions (AF) are fundamental components within neural networks, enabling them to capture complex patterns and relationships in the data.
This study illuminates the effectiveness of AF in elevating the performance of intrusion detection systems.
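To illustrate the kind of comparison that study performs, here is a hedged PyTorch sketch of a toy CNN-BiGRU classifier in which the activation (ReLU or Mish) is a constructor argument. The architecture, layer sizes, and input shape are illustrative assumptions, not the paper's actual model.

```python
import torch
import torch.nn as nn

class TinyCnnBiGru(nn.Module):
    """Toy CNN-BiGRU classifier; `act` is swappable (nn.ReLU or nn.Mish)."""
    def __init__(self, n_features=64, n_classes=2, act=nn.ReLU):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1),
            act(),
            nn.MaxPool1d(2),
        )
        self.bigru = nn.GRU(16, 32, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * 32, n_classes)

    def forward(self, x):              # x: (batch, n_features)
        h = self.conv(x.unsqueeze(1))  # (batch, 16, n_features // 2)
        h, _ = self.bigru(h.transpose(1, 2))
        return self.head(h[:, -1])     # last time step -> class logits

x = torch.randn(8, 64)
for act in (nn.ReLU, nn.Mish):
    logits = TinyCnnBiGru(act=act)(x)
    print(act.__name__, tuple(logits.shape))
```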
arXiv Detail & Related papers (2024-05-30T21:48:56Z) - Hidden Classification Layers: Enhancing linear separability between
classes in neural networks layers [0.0]
We investigate the impact of a training approach on deep network performance.
We propose a neural network architecture that induces an error function involving the outputs of all the network layers.
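The summary does not give the exact form of that error function; the sketch below shows one deep-supervision-style way a loss can involve the outputs of all layers, with an auxiliary linear classifier (a hypothetical choice) attached to every hidden layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeeplySupervisedMLP(nn.Module):
    """MLP with an auxiliary linear classifier on every hidden layer."""
    def __init__(self, dims=(32, 64, 64, 10)):
        super().__init__()
        self.hidden = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(len(dims) - 2))
        self.aux = nn.ModuleList(
            nn.Linear(d, dims[-1]) for d in dims[1:-1])
        self.out = nn.Linear(dims[-2], dims[-1])

    def forward(self, x):
        logits_per_layer = []
        for layer, aux in zip(self.hidden, self.aux):
            x = torch.relu(layer(x))
            logits_per_layer.append(aux(x))   # per-layer class logits
        logits_per_layer.append(self.out(x))  # final classifier
        return logits_per_layer

def total_loss(logits_per_layer, y):
    # Error function that involves the outputs of all layers.
    return sum(F.cross_entropy(l, y) for l in logits_per_layer)

x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
print(total_loss(DeeplySupervisedMLP()(x), y).item())
```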
arXiv Detail & Related papers (2023-06-09T10:52:49Z) - ASU-CNN: An Efficient Deep Architecture for Image Classification and
Feature Visualizations [0.0]
Activation functions play a decisive role in determining the capacity of Deep Neural Networks.
In this paper, a Convolutional Neural Network model named ASU-CNN is proposed.
The network achieved promising results on both training and testing data for the classification of CIFAR-10.
arXiv Detail & Related papers (2023-05-28T16:52:25Z) - Supervised Feature Selection with Neuron Evolution in Sparse Neural
Networks [17.12834153477201]
We propose NeuroFS, a novel resource-efficient supervised feature selection method using sparse neural networks.
By gradually pruning the uninformative features from the input layer of a sparse neural network trained from scratch, NeuroFS derives an informative subset of features efficiently.
NeuroFS achieves the highest ranking-based score among the considered state-of-the-art supervised feature selection models.
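As a rough illustration of the idea (not the paper's algorithm), the sketch below gradually drops input features whose input-layer connections have the smallest total magnitude. The L1-norm importance score, the linear pruning schedule, and the random toy data are all assumptions.

```python
import numpy as np

def gradual_input_pruning(W_in, keep_k, rounds=5):
    """Schematic of NeuroFS-style feature selection: over several rounds,
    drop the input features whose outgoing input-layer weights have the
    smallest total magnitude (this L1 scoring rule is an assumption,
    not the paper's exact criterion)."""
    n_features = W_in.shape[0]
    active = np.arange(n_features)
    for r in range(1, rounds + 1):
        # target number of surviving features after this round (linear schedule)
        target = int(round(n_features - (n_features - keep_k) * r / rounds))
        importance = np.abs(W_in[active]).sum(axis=1)
        active = active[np.argsort(importance)[::-1][:target]]
        # in the real method the sparse network keeps training between rounds
    return np.sort(active)

rng = np.random.default_rng(0)
W = rng.normal(size=(100, 32))                   # input layer of a toy network
W[rng.choice(100, 80, replace=False)] *= 0.05    # 80 "uninformative" features
print(gradual_input_pruning(W, keep_k=20))
```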
arXiv Detail & Related papers (2023-03-10T17:09:55Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
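For reference, a threshold (unit-step) activation and a one-hidden-layer network built from it look as follows. This minimal NumPy sketch only illustrates the activation itself, not the convex reformulation derived in the paper.

```python
import numpy as np

def threshold(x):
    """Threshold (unit-step) activation: 1 if the pre-activation is positive,
    else 0. It is piecewise constant, so its gradient is zero almost
    everywhere, which is why such networks are typically trained by means
    other than plain backpropagation."""
    return (x > 0).astype(float)

# One hidden layer of threshold units on a toy input.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))            # 4 samples, 3 features
W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
w2 = rng.normal(size=5)
hidden = threshold(X @ W1 + b1)        # binary hidden codes
print(hidden)
print(hidden @ w2)                     # network outputs
```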
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Evolution of Activation Functions for Deep Learning-Based Image
Classification [0.0]
Activation functions (AFs) play a pivotal role in the performance of neural networks.
We propose a novel, three-population, coevolutionary algorithm to evolve AFs.
Tested on four datasets -- MNIST, FashionMNIST, KMNIST, and USPS -- coevolution proves to be a performant algorithm for finding good AFs and AF architectures.
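The sketch below is a much-simplified, single-population stand-in for that idea (the paper uses three coevolving populations): candidate AFs are compositions of simple primitives, and each candidate's fitness is the accuracy of a tiny MLP on a toy task. The primitive set, task, and network are illustrative assumptions.

```python
import random
import torch
import torch.nn.functional as F

# Candidate AFs are compositions  x -> outer(inner(x))  of simple primitives.
PRIMS = {"identity": lambda x: x, "tanh": torch.tanh, "relu": torch.relu,
         "sigmoid": torch.sigmoid, "softplus": F.softplus, "sin": torch.sin}

def make_af(inner, outer):
    return lambda x: PRIMS[outer](PRIMS[inner](x))

def fitness(af, steps=200):
    """Accuracy of a tiny one-hidden-layer MLP that uses `af`, on a toy
    inside-vs-outside-the-unit-circle task (both are stand-ins)."""
    torch.manual_seed(0)
    X = torch.randn(512, 2)
    y = ((X ** 2).sum(dim=1) > 1.0).long()
    w1 = torch.randn(2, 16, requires_grad=True)
    b1 = torch.zeros(16, requires_grad=True)
    w2 = torch.randn(16, 2, requires_grad=True)
    b2 = torch.zeros(2, requires_grad=True)
    opt = torch.optim.Adam([w1, b1, w2, b2], lr=0.05)
    for _ in range(steps):
        logits = af(X @ w1 + b1) @ w2 + b2
        loss = F.cross_entropy(logits, y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (logits.argmax(dim=1) == y).float().mean().item()

# One "generation": evaluate random candidate AFs and rank them by fitness.
random.seed(0)
population = [tuple(random.sample(list(PRIMS), 2)) for _ in range(6)]
ranked = sorted(((fitness(make_af(*p)), p) for p in population), reverse=True)
for acc, (inner, outer) in ranked:
    print(f"{outer}({inner}(x)): accuracy = {acc:.2f}")
```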
arXiv Detail & Related papers (2022-06-24T05:58:23Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Comparisons among different stochastic selection of activation layers
for convolutional neural networks for healthcare [77.99636165307996]
We classify biomedical images using ensembles of neural networks.
We select our activations among the following ones: ReLU, leaky ReLU, Parametric ReLU, ELU, Adaptive Piecewise Linear Unit, S-Shaped ReLU, Swish, Mish, Mexican Linear Unit, Parametric Deformable Linear Unit, and Soft Root Sign.
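A hedged sketch of the general recipe is shown below: each ensemble member draws its activation layers at random from a pool of candidates, and predictions are averaged. Only activations available in PyTorch are included; the architecture and pool are illustrative assumptions, not the paper's setup.

```python
import random
import torch
import torch.nn as nn

# Pool of candidate activations (a subset of those listed above; the more
# exotic ones are not shipped with torch and are omitted here).
ACT_POOL = [nn.ReLU, nn.LeakyReLU, nn.ELU, nn.SiLU, nn.Mish]  # SiLU == Swish

def random_activation_cnn(n_classes=2, seed=None):
    """Small CNN whose two activation layers are chosen at random."""
    rng = random.Random(seed)
    a1, a2 = rng.choice(ACT_POOL), rng.choice(ACT_POOL)
    return nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1), a1(), nn.MaxPool2d(2),
        nn.Conv2d(8, 16, 3, padding=1), a2(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(), nn.Linear(16, n_classes))

# Ensemble: members differ only in their randomly selected activations;
# the ensemble output is the average of the member softmax scores.
ensemble = [random_activation_cnn(seed=s) for s in range(5)]
x = torch.randn(4, 3, 32, 32)                       # stand-in image batch
probs = torch.stack([m(x).softmax(dim=1) for m in ensemble]).mean(dim=0)
print(probs.shape)                                   # (4, n_classes)
```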
arXiv Detail & Related papers (2020-11-24T01:53:39Z) - Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
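The exact form of ADA is not given in this summary; the sketch below instead uses a generic non-monotonic "bump" activation to show why a single neuron with such an activation can separate XOR, which no single neuron with a monotonic activation can do.

```python
import numpy as np

def bump(z):
    """A generic non-monotonic activation (a Gaussian bump); it stands in for
    the apical dendrite activation, whose exact form is not in the summary."""
    return np.exp(-z ** 2)

# A single neuron y = bump(w . x + b) with hand-picked w = (1, 1), b = -1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])
w, b = np.array([1.0, 1.0]), -1.0

pre = X @ w + b                  # pre-activations: -1, 0, 0, 1
out = bump(pre)                  # outputs: ~0.37, 1.0, 1.0, ~0.37
pred = (out > 0.5).astype(int)   # threshold at 0.5
print(pred, "matches XOR:", np.array_equal(pred, y_xor))
# A monotonic activation cannot do this with one neuron: XOR is not linearly
# separable, and a monotonic function of w.x + b preserves the linear ordering.
```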
arXiv Detail & Related papers (2020-02-02T21:09:39Z) - MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient descent combined with non-convexity renders learning susceptible to the choice of initialization.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
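As background for the idea of layer fusion, the sketch below fuses two neighboring randomly initialized linear layers into a single exactly equivalent layer; the paper's actual contribution, an MSE-optimal fusion in the presence of nonlinearities, is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two neighboring randomly initialized linear layers: x -> W2 (W1 x + b1) + b2
W1, b1 = rng.normal(size=(64, 32)), rng.normal(size=64)
W2, b2 = rng.normal(size=(16, 64)), rng.normal(size=16)

# Exact fusion in the purely linear case: a single layer with
#   W = W2 W1  and  b = W2 b1 + b2  reproduces the two-layer map exactly.
W_fused, b_fused = W2 @ W1, W2 @ b1 + b2

x = rng.normal(size=32)
deep = W2 @ (W1 @ x + b1) + b2
shallow = W_fused @ x + b_fused
print(np.max(np.abs(deep - shallow)))   # ~1e-13: the two maps coincide
```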
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.