Evolutionary Optimization of Deep Learning Activation Functions
- URL: http://arxiv.org/abs/2002.07224v2
- Date: Sat, 11 Apr 2020 15:53:12 GMT
- Title: Evolutionary Optimization of Deep Learning Activation Functions
- Authors: Garrett Bingham, William Macke, and Risto Miikkulainen
- Abstract summary: We show that evolutionary algorithms can discover novel activation functions that outperform the Rectified Linear Unit (ReLU).
Replacing ReLU with evolved activation functions results in statistically significant increases in network accuracy.
These novel activation functions are shown to generalize, achieving high performance across tasks.
- Score: 15.628118691027328
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The choice of activation function can have a large effect on the performance
of a neural network. While there have been some attempts to hand-engineer novel
activation functions, the Rectified Linear Unit (ReLU) remains the most
commonly-used in practice. This paper shows that evolutionary algorithms can
discover novel activation functions that outperform ReLU. A tree-based search
space of candidate activation functions is defined and explored with mutation,
crossover, and exhaustive search. Experiments on training wide residual
networks on the CIFAR-10 and CIFAR-100 image datasets show that this approach
is effective. Replacing ReLU with evolved activation functions results in
statistically significant increases in network accuracy. Optimal performance is
achieved when evolution is allowed to customize activation functions to a
particular task; however, these novel activation functions are shown to
generalize, achieving high performance across tasks. Evolutionary optimization
of activation functions is therefore a promising new dimension of metalearning
in neural networks.
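To make the search space concrete, the following is a minimal Python sketch of a tree-based representation of candidate activation functions with mutation and crossover operators, in the spirit of the approach described above. The operator set, tree depths, and sampling probabilities are illustrative assumptions rather than the paper's exact configuration; a full run would additionally train a wide residual network with each candidate on CIFAR-10/100 and keep the highest-accuracy trees.

```python
import math
import random

# Candidate activation functions are expression trees: leaves are the
# pre-activation input x, internal nodes are unary or binary operators.
UNARY = {
    "relu": lambda a: max(a, 0.0),
    "tanh": math.tanh,
    "sigmoid": lambda a: 1.0 / (1.0 + math.exp(-a)),
    "neg": lambda a: -a,
}
BINARY = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "max": max,
}

def random_tree(depth=2):
    """Sample a random candidate tree up to the given depth."""
    if depth == 0 or random.random() < 0.3:
        return "x"
    if random.random() < 0.5:
        return (random.choice(list(UNARY)), random_tree(depth - 1))
    return (random.choice(list(BINARY)), random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    """Compute the activation value of a tree for a scalar input x."""
    if tree == "x":
        return x
    op, *children = tree
    args = [evaluate(child, x) for child in children]
    return UNARY[op](*args) if op in UNARY else BINARY[op](*args)

def mutate(tree, depth=2):
    """Replace a randomly chosen subtree with a freshly sampled one."""
    if tree == "x" or random.random() < 0.3:
        return random_tree(depth)
    op, *children = tree
    i = random.randrange(len(children))
    children[i] = mutate(children[i], depth)
    return (op, *children)

def crossover(parent_a, parent_b):
    """Graft a random subtree of parent_b into a random position of parent_a."""
    def random_subtree(tree):
        if tree == "x" or random.random() < 0.5:
            return tree
        _, *children = tree
        return random_subtree(random.choice(children))
    if parent_a == "x" or random.random() < 0.3:
        return random_subtree(parent_b)
    op, *children = parent_a
    i = random.randrange(len(children))
    children[i] = crossover(children[i], parent_b)
    return (op, *children)

# Example: the tree ("mul", "x", ("sigmoid", "x")) encodes x * sigmoid(x).
swish_like = ("mul", "x", ("sigmoid", "x"))
print(evaluate(swish_like, 1.5))          # approximately 1.23
print(evaluate(mutate(swish_like), 1.5))  # a perturbed candidate
```

In a full evolutionary loop, each tree would be scored by training a fixed architecture with that activation and using validation accuracy as fitness; mutation and crossover then produce the next generation from the fittest trees.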
Related papers
- Activation Function Optimization Scheme for Image Classification [20.531966402727708]
The activation function has a significant impact on the dynamics, convergence, and performance of deep neural networks.
We propose an evolutionary approach for optimizing activation functions specifically for image classification tasks.
We obtain a series of high-performing activation functions denoted as Exponential Error Linear Unit (EELU).
arXiv Detail & Related papers (2024-09-07T21:40:15Z)
- A Method on Searching Better Activation Functions [15.180864683908878]
We propose Entropy-based Activation Function Optimization (EAFO) methodology for designing static activation functions in deep neural networks.
We derive a novel activation function from ReLU, known as Correction Regularized ReLU (CRReLU).
arXiv Detail & Related papers (2024-05-19T03:48:05Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Efficient Activation Function Optimization through Surrogate Modeling [15.219959721479835]
This paper aims to improve the state of the art through three steps.
First, the benchmark Act-Bench-CNN, Act-Bench-ResNet, and Act-Bench-ViT were created by training convolutional, residual, and vision transformer architectures.
Second, a characterization of the benchmark space was developed, leading to a new surrogate-based method for optimization.
arXiv Detail & Related papers (2023-01-13T23:11:14Z)
- Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims at optimizing decision-making strategies with historical data, has been extensively applied in real-life applications.
We take a step in this direction by considering offline reinforcement learning with differentiable function class approximation (DFA).
Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are more suitable for active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Transformers with Learnable Activation Functions [63.98696070245065]
We use a Rational Activation Function (RAF) to learn optimal activation functions during training according to the input data; a sketch of this general idea appears after this list.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
arXiv Detail & Related papers (2022-08-30T09:47:31Z)
- Efficient Neural Network Analysis with Sum-of-Infeasibilities [64.31536828511021]
Inspired by sum-of-infeasibilities methods in convex optimization, we propose a novel procedure for analyzing verification queries on networks with extensive branching functions.
An extension to a canonical case-analysis-based complete search procedure can be achieved by replacing the convex procedure executed at each search state with DeepSoI.
arXiv Detail & Related papers (2022-03-19T15:05:09Z)
- Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z)
- Evolution of Activation Functions: An Empirical Investigation [0.30458514384586394]
This work presents an evolutionary algorithm to automate the search for completely new activation functions.
We compare these new evolved activation functions to other existing and commonly used activation functions.
arXiv Detail & Related papers (2021-05-30T20:08:20Z)
- Discovering Parametric Activation Functions [17.369163074697475]
This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance.
Experiments with four different neural network architectures on the CIFAR-10 and CIFAR-100 image classification datasets show that this approach is effective.
arXiv Detail & Related papers (2020-06-05T00:25:33Z)
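The Rational Activation Function mentioned in the "Transformers with Learnable Activation Functions" entry above can be understood as a ratio of polynomials whose coefficients are trained alongside the network weights. Below is a minimal PyTorch sketch of that general idea; the polynomial degrees, initialization scale, and the absolute-value safeguard in the denominator are illustrative assumptions, not the exact parameterization used in that paper.

```python
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    """Learnable rational activation R(x) = P(x) / Q(x).

    Sketch only: the coefficients of both polynomials are ordinary
    parameters, so the activation's shape is optimized jointly with
    the rest of the network.
    """
    def __init__(self, num_degree: int = 5, den_degree: int = 4):
        super().__init__()
        # p holds p_0..p_n for the numerator; q holds q_1..q_m for the denominator.
        self.p = nn.Parameter(0.1 * torch.randn(num_degree + 1))
        self.q = nn.Parameter(0.1 * torch.randn(den_degree))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Numerator P(x) = p_0 + p_1 x + ... + p_n x^n via Horner's rule.
        num = torch.zeros_like(x)
        for coeff in torch.flip(self.p, dims=[0]):
            num = num * x + coeff
        # Denominator 1 + |q_1 x + ... + q_m x^m| stays bounded away from zero.
        den = torch.zeros_like(x)
        for coeff in torch.flip(self.q, dims=[0]):
            den = (den + coeff) * x
        return num / (1.0 + den.abs())

# Drop-in usage: replace a fixed nonlinearity in a feed-forward block.
layer = nn.Sequential(nn.Linear(64, 64), RationalActivation())
out = layer(torch.randn(8, 64))
```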
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.