How important are activation functions in regression and classification?
A survey, performance comparison, and future directions
- URL: http://arxiv.org/abs/2209.02681v2
- Date: Wed, 7 Sep 2022 15:57:52 GMT
- Title: How important are activation functions in regression and classification?
A survey, performance comparison, and future directions
- Authors: Ameya D. Jagtap and George Em Karniadakis
- Abstract summary: We survey the activation functions that have been employed in the past as well as the current state-of-the-art.
In recent years, a physics-informed machine learning framework has emerged for solving problems related to scientific computations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inspired by biological neurons, activation functions play an essential
part in the learning process of any artificial neural network commonly used in
many real-world problems. Various activation functions have been proposed in
the literature for classification as well as regression tasks. In this work, we
survey the activation functions that have been employed in the past as well as
the current state-of-the-art. In particular, we present various developments in
activation functions over the years and the advantages as well as disadvantages
or limitations of these activation functions. We also discuss classical (fixed)
activation functions, including rectifier units, and adaptive activation
functions. In addition to presenting the taxonomy of activation functions based
on characterization, a taxonomy of activation functions based on applications
is also presented. To this end, a systematic comparison of various fixed and
adaptive activation functions is performed on classification data sets such as
MNIST, CIFAR-10, and CIFAR-100. In recent years, a physics-informed machine
learning framework has emerged for solving problems related to scientific
computations. In this context, we also discuss various requirements on
activation functions that have been used in the physics-informed machine
learning framework. Furthermore, various comparisons are made among different
fixed and adaptive activation functions using machine learning
libraries such as TensorFlow, PyTorch, and JAX.
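As a concrete illustration of the fixed-versus-adaptive distinction discussed above, the following is a minimal PyTorch sketch (not code from the paper) of an adaptive activation: a tanh whose slope is controlled by a trainable parameter `a` together with a fixed scaling factor `n`, so the shape of the non-linearity is learned jointly with the network weights. The class name and initial values are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AdaptiveTanh(nn.Module):
    """Illustrative adaptive activation: tanh with a trainable slope.

    Fixed activation:    y = tanh(x)
    Adaptive activation: y = tanh(n * a * x), where `a` is learned jointly
    with the network weights and `n` is a fixed scaling factor.
    """
    def __init__(self, n: float = 10.0):
        super().__init__()
        self.n = n                                 # fixed scaling factor
        self.a = nn.Parameter(torch.tensor(0.1))   # trainable slope (n * a = 1 at init)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.n * self.a * x)

# Drop-in replacement for a fixed activation in a small regression network
model = nn.Sequential(
    nn.Linear(1, 32), AdaptiveTanh(),
    nn.Linear(32, 32), AdaptiveTanh(),
    nn.Linear(32, 1),
)
```

Swapping `AdaptiveTanh()` for `nn.Tanh()` or `nn.ReLU()` in the same architecture is essentially the kind of controlled fixed-versus-adaptive comparison the survey carries out on MNIST, CIFAR-10, and CIFAR-100.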
Related papers
- Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features [115.33889811527533]
Diffusion models are initially designed for image generation.
Recent research shows that the internal signals within their backbones, named activations, can also serve as dense features for various discriminative tasks.
arXiv Detail & Related papers (2024-10-04T16:05:14Z)
- Active Learning for Derivative-Based Global Sensitivity Analysis with Gaussian Processes [70.66864668709677]
We consider the problem of active learning for global sensitivity analysis of expensive black-box functions.
Since function evaluations are expensive, we use active learning to prioritize experimental resources where they yield the most value.
We propose novel active learning acquisition functions that directly target key quantities of derivative-based global sensitivity measures.
arXiv Detail & Related papers (2024-07-13T01:41:12Z)
- Evaluating CNN with Oscillatory Activation Function [0.0]
A key source of a CNN's capability to learn high-dimensional, complex features from images is the non-linearity introduced by the activation function.
This paper explores the performance of the AlexNet CNN architecture on the MNIST and CIFAR-10 datasets using an oscillatory activation function (GCU) and other commonly used activation functions such as ReLU, PReLU, and Mish.
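The GCU referenced in this entry is the Growing Cosine Unit, an oscillatory, non-monotonic activation commonly written as f(z) = z·cos(z). A minimal PyTorch sketch (an illustration, not the cited paper's code) alongside the built-in activations it is compared against:

```python
import torch
import torch.nn as nn

class GCU(nn.Module):
    """Growing Cosine Unit: f(z) = z * cos(z), oscillatory and non-monotonic."""
    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return z * torch.cos(z)

# The comparison activations from the entry are available as PyTorch modules
relu, prelu, mish = nn.ReLU(), nn.PReLU(), nn.Mish()

z = torch.linspace(-6.0, 6.0, steps=7)
print(GCU()(z))   # oscillates across the input range, unlike the alternatives
print(relu(z), prelu(z), mish(z), sep="\n")
```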
arXiv Detail & Related papers (2022-11-13T11:17:13Z)
- Stochastic Adaptive Activation Function [1.9199289015460212]
This study proposes a simple yet effective activation function that facilitates different thresholds and adaptive activations according to the positions of units and the contexts of inputs.
Experimental analysis demonstrates that our activation function can provide the benefits of more accurate prediction and earlier convergence in many deep learning applications.
arXiv Detail & Related papers (2022-10-21T01:57:25Z)
- Transformers with Learnable Activation Functions [63.98696070245065]
We use a Rational Activation Function (RAF) to learn optimal activation functions during training from the input data.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
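Rational activation functions are generally parameterized as a trainable ratio of polynomials, f(x) = P(x)/Q(x), whose coefficients are optimized with the rest of the model. Below is a hedged PyTorch sketch of that general idea; the pole-free parameterization (keeping the denominator at least 1) is one common choice and not necessarily the exact form used in the cited paper.

```python
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    """Illustrative learnable rational activation f(x) = P(x) / Q(x).

    P is a degree-m polynomial with trainable coefficients; Q is kept
    strictly positive (1 + |degree-n polynomial without constant term|)
    so the ratio has no poles on the real line.
    """
    def __init__(self, m: int = 3, n: int = 2):
        super().__init__()
        self.p = nn.Parameter(torch.randn(m + 1) * 0.1)  # numerator coefficients
        self.q = nn.Parameter(torch.randn(n) * 0.1)      # denominator coefficients

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        num = sum(c * x**i for i, c in enumerate(self.p))
        den = 1.0 + torch.abs(sum(c * x**(i + 1) for i, c in enumerate(self.q)))
        return num / den

act = RationalActivation()
print(act(torch.linspace(-3.0, 3.0, steps=5)))
```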
arXiv Detail & Related papers (2022-08-30T09:47:31Z)
- Evolution of Activation Functions: An Empirical Investigation [0.30458514384586394]
This work presents an evolutionary algorithm to automate the search for completely new activation functions.
We compare these new evolved activation functions to other existing and commonly used activation functions.
arXiv Detail & Related papers (2021-05-30T20:08:20Z)
- Comparisons among different stochastic selection of activation layers for convolutional neural networks for healthcare [77.99636165307996]
We classify biomedical images using ensembles of neural networks.
We select our activations from the following: ReLU, leaky ReLU, Parametric ReLU, ELU, Adaptive Piecewise Linear Unit, S-Shaped ReLU, Swish, Mish, Mexican Linear Unit, Parametric Deformable Linear Unit, Soft Root Sign.
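A rough sketch of the stochastic-selection idea (an assumption-laden illustration, not the authors' code): each ensemble member is the same CNN, but its activation layers are drawn at random from a pool, and the members' predictions are averaged at inference. Only pool members with built-in PyTorch modules are shown; the remaining activations listed above (e.g., Adaptive Piecewise Linear Unit, S-Shaped ReLU, Mexican Linear Unit) would need custom implementations.

```python
import random
import torch.nn as nn

# Candidate activations available as built-in PyTorch modules (Swish == SiLU)
ACTIVATION_POOL = [nn.ReLU, nn.LeakyReLU, nn.PReLU, nn.ELU, nn.SiLU, nn.Mish]

def make_cnn(num_classes: int = 2) -> nn.Sequential:
    """Small CNN whose activation layers are sampled at random from the pool."""
    act = lambda: random.choice(ACTIVATION_POOL)()
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), act(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), act(), nn.MaxPool2d(2),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes),
    )

# Ensemble: each member gets its own random mix of activation layers;
# at inference time the members' (softmax) outputs would be averaged.
ensemble = [make_cnn() for _ in range(5)]
```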
arXiv Detail & Related papers (2020-11-24T01:53:39Z)
- Discovering Parametric Activation Functions [17.369163074697475]
This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance.
Experiments with four different neural network architectures on the CIFAR-10 and CIFAR-100 image classification datasets show that this approach is effective.
arXiv Detail & Related papers (2020-06-05T00:25:33Z)
- A survey on modern trainable activation functions [0.0]
We propose a taxonomy of trainable activation functions and highlight common and distinctive properties of recent and past models.
We show that many of the proposed approaches are equivalent to adding neuron layers which use fixed (non-trainable) activation functions.
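The equivalence claim in this entry can be made concrete with a small, hypothetical example (not a model from the survey): a "trainable" activation built as a learned weighted combination of fixed activations computes exactly the same function as a width-3 layer of fixed activations followed by a 3-to-1 linear map, i.e., an extra layer that uses only non-trainable activations.

```python
import torch
import torch.nn as nn

class MixedActivation(nn.Module):
    """Trainable activation as a learned combination of fixed activations.

    f(x) = w1*relu(x) + w2*tanh(x) + w3*sigmoid(x), with w trainable.
    This is the same computation as a fixed-activation layer of width 3
    followed by a linear 3 -> 1 map, illustrating the equivalence noted above.
    """
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.ones(3) / 3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        basis = torch.stack([torch.relu(x), torch.tanh(x), torch.sigmoid(x)], dim=-1)
        return (basis * self.w).sum(dim=-1)

print(MixedActivation()(torch.linspace(-2.0, 2.0, steps=5)))
```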
arXiv Detail & Related papers (2020-05-02T12:38:43Z)
- Towards Efficient Processing and Learning with Spikes: New Approaches for Multi-Spike Learning [59.249322621035056]
We propose two new multi-spike learning rules that demonstrate better performance than other baselines on various tasks.
In the feature detection task, we re-examine the ability of unsupervised STDP and present its limitations.
Our proposed learning rules can reliably solve the task over a wide range of conditions without specific constraints being applied.
arXiv Detail & Related papers (2020-05-02T06:41:20Z)
- Learning Class Regularized Features for Action Recognition [68.90994813947405]
We introduce a novel method named Class Regularization that performs class-based regularization of layer activations.
We show that using Class Regularization blocks in state-of-the-art CNN architectures for action recognition leads to systematic improvement gains of 1.8%, 1.2% and 1.4% on the Kinetics, UCF-101 and HMDB-51 datasets, respectively.
arXiv Detail & Related papers (2020-02-07T07:27:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.