Evolution of Activation Functions: An Empirical Investigation
- URL: http://arxiv.org/abs/2105.14614v1
- Date: Sun, 30 May 2021 20:08:20 GMT
- Title: Evolution of Activation Functions: An Empirical Investigation
- Authors: Andrew Nader and Danielle Azar
- Abstract summary: This work presents an evolutionary algorithm to automate the search for completely new activation functions.
We compare these new evolved activation functions to other existing and commonly used activation functions.
- Score: 0.30458514384586394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The hyper-parameters of a neural network are traditionally designed through a
time-consuming process of trial and error that requires substantial expert
knowledge. Neural Architecture Search (NAS) algorithms aim to take the human
out of the loop by automatically finding a good set of hyper-parameters for the
problem at hand. These algorithms have mostly focused on hyper-parameters such
as the architectural configurations of the hidden layers and the connectivity
of the hidden neurons, but there has been relatively little work on automating
the search for completely new activation functions, which are among the most
crucial hyper-parameters to choose. Several widely used activation functions
are simple and work well, but there is nonetheless interest in finding better
ones. Work in the literature has mostly focused on designing new activation
functions by hand or on choosing from a set of predefined functions, whereas
this work presents an evolutionary algorithm to automate the search for
completely new activation functions. We compare these newly evolved activation
functions to existing and commonly used activation functions. The results are
favorable and are obtained by averaging the performance of the activation
functions found over 30 runs, with experiments conducted on 10 different
datasets and architectures to ensure the statistical robustness of the study.
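To make the search procedure concrete, here is a minimal sketch of an evolutionary search over activation functions: each candidate is a small expression built from basic unary and binary operations, and a population of candidates is mutated and selected by fitness over generations. The primitive set, the fixed three-gene representation, and the toy proxy fitness are illustrative assumptions; in the paper, the fitness of a candidate activation would instead be the performance of networks trained with it, averaged over runs.

```python
import random
import numpy as np

# Candidate building blocks for evolved activation functions.
# These specific primitives are an assumption for illustration,
# not necessarily the operator set used in the paper.
UNARY = {
    "identity": lambda x: x,
    "tanh":     np.tanh,
    "relu":     lambda x: np.maximum(x, 0.0),
    "sin":      np.sin,
}
BINARY = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "max": np.maximum,
}

def random_candidate():
    """An activation of the form binary(unary1(x), unary2(x))."""
    return (random.choice(list(BINARY)),
            random.choice(list(UNARY)),
            random.choice(list(UNARY)))

def apply(candidate, x):
    b, u1, u2 = candidate
    return BINARY[b](UNARY[u1](x), UNARY[u2](x))

def fitness(candidate):
    """Toy proxy fitness. In the paper, the fitness of an activation is the
    performance of networks trained with it; here we only score how closely
    the candidate matches an arbitrary smooth reference curve, purely so the
    sketch runs end to end."""
    x = np.linspace(-3.0, 3.0, 200)
    target = x * np.tanh(x)            # arbitrary reference shape (assumption)
    return -np.mean((apply(candidate, x) - target) ** 2)

def evolve(pop_size=20, generations=30, mutation_rate=0.3):
    population = [random_candidate() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]   # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            child = list(random.choice(survivors))
            if random.random() < mutation_rate:   # mutate one gene
                i = random.randrange(3)
                child[i] = (random.choice(list(BINARY)) if i == 0
                            else random.choice(list(UNARY)))
            children.append(tuple(child))
        population = survivors + children
    return max(population, key=fitness)

if __name__ == "__main__":
    print("best evolved activation:", evolve())
```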
Related papers
- Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features (arXiv, 2024-10-04)
  Diffusion models are initially designed for image generation. Recent research shows that the internal signals within their backbones, named activations, can also serve as dense features for various discriminative tasks.
- Efficient Search for Customized Activation Functions with Gradient Descent (arXiv, 2024-08-13)
  Different activation functions work best for different deep learning models. We propose a fine-grained search cell that combines basic mathematical operations to model activation functions (a minimal sketch of such a cell appears after this list). Our approach enables the identification of specialized activations, leading to improved performance in every model we tried.
- Active Learning for Derivative-Based Global Sensitivity Analysis with Gaussian Processes (arXiv, 2024-07-13)
  We consider the problem of active learning for global sensitivity analysis of expensive black-box functions. Since function evaluations are expensive, we use active learning to prioritize experimental resources where they yield the most value. We propose novel active learning acquisition functions that directly target key quantities of derivative-based global sensitivity measures.
- Evaluating CNN with Oscillatory Activation Function (arXiv, 2022-11-13)
  What gives CNNs the capability to learn high-dimensional complex features from images is the non-linearity introduced by the activation function. This paper explores the performance of the CNN architecture AlexNet on the MNIST and CIFAR-10 datasets using the oscillatory activation function GCU and other commonly used activation functions such as ReLU, PReLU, and Mish (these activation shapes are sketched after this list).
- Transformers with Learnable Activation Functions (arXiv, 2022-08-30)
  We use the Rational Activation Function (RAF) to learn optimal activation functions during training according to the input data. RAF opens a new research direction for analyzing and interpreting pre-trained models according to their learned activation functions.
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning (arXiv, 2022-02-21)
  Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered. Existing CL approaches often keep a buffer of previously seen samples, perform knowledge distillation, or use regularization techniques toward this goal. We propose to activate and select only sparse neurons for learning current and past tasks at any stage.
- Discovering Parametric Activation Functions (arXiv, 2020-06-05)
  This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance. Experiments with four different neural network architectures on the CIFAR-10 and CIFAR-100 image classification datasets show that this approach is effective.
- A survey on modern trainable activation functions (arXiv, 2020-05-02)
  We propose a taxonomy of trainable activation functions and highlight common and distinctive properties of recent and past models. We show that many of the proposed approaches are equivalent to adding neuron layers which use fixed (non-trainable) activation functions.
- Evolving Normalization-Activation Layers (arXiv, 2020-04-06)
  We develop efficient rejection protocols to quickly filter out candidate layers that do not work well. Our method leads to the discovery of EvoNorms, a set of new normalization-activation layers with novel, and sometimes surprising, structures. Our experiments show that EvoNorms work well on image classification models including ResNets, MobileNets, and EfficientNets.
- Evolutionary Optimization of Deep Learning Activation Functions (arXiv, 2020-02-17)
  We show that evolutionary algorithms can discover novel activation functions that outperform the Rectified Linear Unit (ReLU); replacing ReLU with evolved activation functions results in statistically significant increases in network accuracy. These novel activation functions are shown to generalize, achieving high performance across tasks.
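For the gradient-descent search entry above, one plausible reading of a "fine-grained search cell" is a small differentiable mixture of basic operations whose mixing weights are trained jointly with the network, in the spirit of a DARTS-style continuous relaxation. The cell below is a sketch under that assumption and may not match the exact design in the cited paper.

```python
import numpy as np

# Candidate unary operations mixed by the cell; this operation set is an
# illustrative assumption, not the one used in the cited paper.
OPS = [lambda x: x, np.tanh, lambda x: np.maximum(x, 0.0), np.sin]

def search_cell(x, alpha):
    """Mix the basic operations with softmax weights derived from alpha.

    In a gradient-based search, alpha would be trained by backpropagation
    alongside the network weights; here it is just a fixed parameter vector.
    """
    w = np.exp(alpha - alpha.max())
    w = w / w.sum()                       # softmax over the operation set
    return sum(wi * op(x) for wi, op in zip(w, OPS))

if __name__ == "__main__":
    alpha = np.zeros(len(OPS))            # start from a uniform mixture
    x = np.linspace(-3.0, 3.0, 7)
    print(search_cell(x, alpha))
```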
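For reference, the activation shapes named in the entries above (ReLU, PReLU, Mish, the oscillatory GCU, and a rational activation of the RAF family) can be written down in a few lines. The PReLU slope and the rational degrees and coefficients below are illustrative placeholders; in the corresponding papers those quantities are learned during training.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def prelu(x, a=0.25):
    # The slope a is a learnable parameter in practice; 0.25 is used here
    # purely for illustration.
    return np.where(x > 0, x, a * x)

def mish(x):
    # Mish: x * tanh(softplus(x)); logaddexp gives a stable softplus.
    return x * np.tanh(np.logaddexp(0.0, x))

def gcu(x):
    # Growing Cosine Unit, the oscillatory activation compared against
    # ReLU, PReLU, and Mish in the AlexNet study above.
    return x * np.cos(x)

def rational_activation(x, p=(0.03, 0.5, 1.0, 0.0), q=(0.1, 1.0)):
    # RAF-style rational activation P(x) / Q(x) with learnable coefficients.
    # The degrees, coefficients, and the positive "safe" denominator used
    # here are assumptions for illustration.
    return np.polyval(p, x) / (1.0 + np.abs(np.polyval(q, x)))

if __name__ == "__main__":
    x = np.linspace(-2.0, 2.0, 5)
    for f in (relu, prelu, mish, gcu, rational_activation):
        print(f.__name__, f(x))
```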
This list is automatically generated from the titles and abstracts of the papers in this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.