Stochastic Adaptive Activation Function
- URL: http://arxiv.org/abs/2210.11672v1
- Date: Fri, 21 Oct 2022 01:57:25 GMT
- Title: Stochastic Adaptive Activation Function
- Authors: Kyungsu Lee, Jaeseung Yang, Haeyun Lee, and Jae Youn Hwang
- Abstract summary: This study proposes a simple yet effective activation function that facilitates different thresholds and adaptive activations according to the positions of units and the contexts of inputs.
Experimental analysis demonstrates that our activation function can provide the benefits of more accurate prediction and earlier convergence in many deep learning applications.
- Score: 1.9199289015460212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The simulation of human neurons and neurotransmission mechanisms has been
realized in deep neural networks based on the theoretical implementations of
activation functions. However, recent studies have reported that the threshold
potential of neurons exhibits different values according to the locations and
types of individual neurons, and that existing activation functions are limited
in representing this variability. Therefore, this study proposes a
simple yet effective activation function that facilitates different thresholds
and adaptive activations according to the positions of units and the contexts
of inputs. Furthermore, the proposed activation function is mathematically a
more generalized form of the Swish activation function, and thus we denote it
Adaptive SwisH (ASH). ASH highlights informative features whose values fall in
the top percentiles of an input, whereas it rectifies low values. Most
importantly, ASH is trainable, adaptive, and context-aware, unlike other
activation functions. Furthermore, ASH represents a general formula for
previously studied activation functions and provides a sound mathematical
grounding for its superior performance.
To validate the effectiveness and robustness of ASH, we implemented it in
many deep learning models for various tasks, including classification,
detection, segmentation, and image generation. Experimental analysis
demonstrates that our activation function can provide the benefits of more
accurate prediction and earlier convergence in many deep learning applications.
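The abstract describes ASH as a Swish-style gate applied around a context-dependent, top-percentile threshold. Below is a minimal PyTorch sketch of that idea; the module name `AdaptiveSwish`, the per-sample quantile estimate of the threshold, and the trainable sharpness parameter `beta` are illustrative assumptions on our part, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class AdaptiveSwish(nn.Module):
    """Percentile-gated, Swish-style activation (illustrative sketch only).

    Values in roughly the top `percentile` of each input sample pass through,
    while lower values are softly rectified via a sigmoid gate, keeping the
    operation differentiable. The threshold is estimated from the empirical
    distribution of each sample, which is an assumption on our part.
    """

    def __init__(self, percentile: float = 0.9, beta_init: float = 1.0):
        super().__init__()
        self.percentile = percentile
        # Trainable gate sharpness, mirroring the "trainable, adaptive"
        # property described in the abstract (hypothetical parameterization).
        self.beta = nn.Parameter(torch.tensor(beta_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-sample threshold at the requested percentile of the flattened input.
        flat = x.flatten(start_dim=1)
        thresh = torch.quantile(flat, self.percentile, dim=1)
        thresh = thresh.view(x.shape[0], *([1] * (x.dim() - 1)))
        # Swish-style gating around the context-dependent threshold:
        # x * sigmoid(beta * (x - t)) reduces to Swish when t = 0.
        return x * torch.sigmoid(self.beta * (x - thresh))


if __name__ == "__main__":
    act = AdaptiveSwish(percentile=0.9)
    out = act(torch.randn(4, 16, 8, 8))
    print(out.shape)  # torch.Size([4, 16, 8, 8])
```

Setting the threshold to zero recovers the standard Swish form x * sigmoid(beta * x), which is consistent with the abstract's claim that ASH generalizes Swish.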
Related papers
- Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features [115.33889811527533]
Diffusion models were initially designed for image generation.
Recent research shows that the internal signals within their backbones, named activations, can also serve as dense features for various discriminative tasks.
arXiv Detail & Related papers (2024-10-04T16:05:14Z)
- Active Learning for Derivative-Based Global Sensitivity Analysis with Gaussian Processes [70.66864668709677]
We consider the problem of active learning for global sensitivity analysis of expensive black-box functions.
Since function evaluations are expensive, we use active learning to prioritize experimental resources where they yield the most value.
We propose novel active learning acquisition functions that directly target key quantities of derivative-based global sensitivity measures.
arXiv Detail & Related papers (2024-07-13T01:41:12Z)
- TSSR: A Truncated and Signed Square Root Activation Function for Neural Networks [5.9622541907827875]
We introduce a new activation function called the Truncated and Signed Square Root (TSSR) function.
This function is distinctive because it is odd, nonlinear, monotone and differentiable.
It has the potential to improve the numerical stability of neural networks.
arXiv Detail & Related papers (2023-08-09T09:40:34Z)
- GELU Activation Function in Deep Learning: A Comprehensive Mathematical Analysis and Performance [2.458437232470188]
We investigate the differentiability, boundedness, stationarity, and smoothness properties of the GELU activation function.
Our results demonstrate the superior performance of GELU compared to other activation functions.
arXiv Detail & Related papers (2023-05-20T03:22:43Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- A Neural Active Inference Model of Perceptual-Motor Learning [62.39667564455059]
The active inference framework (AIF) is a promising new computational framework grounded in contemporary neuroscience.
In this study, we test the ability for the AIF to capture the role of anticipation in the visual guidance of action in humans.
We present a novel formulation of the prior function that maps a multi-dimensional world-state to a uni-dimensional distribution of free-energy.
arXiv Detail & Related papers (2022-11-16T20:00:38Z)
- How important are activation functions in regression and classification? A survey, performance comparison, and future directions [0.0]
We survey the activation functions that have been employed in the past as well as the current state-of-the-art.
In recent years, a physics-informed machine learning framework has emerged for solving problems related to scientific computations.
arXiv Detail & Related papers (2022-09-06T17:51:52Z)
- Transformers with Learnable Activation Functions [63.98696070245065]
We use the Rational Activation Function (RAF) to learn optimal activation functions during training according to the input data.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
arXiv Detail & Related papers (2022-08-30T09:47:31Z)
- A survey on modern trainable activation functions [0.0]
We propose a taxonomy of trainable activation functions and highlight common and distinctive properties of recent and past models.
We show that many of the proposed approaches are equivalent to adding neuron layers which use fixed (non-trainable) activation functions.
arXiv Detail & Related papers (2020-05-02T12:38:43Z)
- Towards Efficient Processing and Learning with Spikes: New Approaches for Multi-Spike Learning [59.249322621035056]
We propose two new multi-spike learning rules which demonstrate better performance over other baselines on various tasks.
In the feature detection task, we re-examine the ability of unsupervised STDP and present its limitations.
Our proposed learning rules can reliably solve the task over a wide range of conditions without specific constraints being applied.
arXiv Detail & Related papers (2020-05-02T06:41:20Z)
- Evolutionary Optimization of Deep Learning Activation Functions [15.628118691027328]
We show that evolutionary algorithms can discover novel activation functions that outperform the Rectified Linear Unit (ReLU).
Replacing ReLU with evolved activation functions results in statistically significant increases in network accuracy.
These novel activation functions are shown to generalize, achieving high performance across tasks.
arXiv Detail & Related papers (2020-02-17T19:54:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.