TSSR: A Truncated and Signed Square Root Activation Function for Neural Networks
- URL: http://arxiv.org/abs/2308.04832v1
- Date: Wed, 9 Aug 2023 09:40:34 GMT
- Title: TSSR: A Truncated and Signed Square Root Activation Function for Neural Networks
- Authors: Yuanhao Gong
- Abstract summary: We introduce a new activation function called the Truncated and Signed Square Root (TSSR) function.
This function is distinctive because it is odd, nonlinear, monotone and differentiable.
It has the potential to improve the numerical stability of neural networks.
- Score: 5.9622541907827875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activation functions are essential components of neural networks. In this
paper, we introduce a new activation function called the Truncated and Signed
Square Root (TSSR) function. This function is distinctive because it is odd,
nonlinear, monotone and differentiable. Its gradient is continuous and always
positive. Thanks to these properties, it has the potential to improve the
numerical stability of neural networks. Several experiments confirm that the
proposed TSSR performs better than other state-of-the-art activation
functions. The proposed function has significant implications for the
development of neural network models and can be applied to a wide range of
applications in fields such as computer vision, natural language processing,
and speech recognition.
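The abstract states the properties of TSSR (odd, monotone, differentiable, with a continuous and always positive gradient) but not its closed form. Purely as an illustration, the sketch below uses a piecewise rule that is the identity on [-1, 1] and a signed square root beyond it, matched so that both the value and the slope are continuous at the boundary; this is an assumption consistent with the name and the stated properties, not necessarily the paper's exact definition.

```python
import torch

def tssr(x: torch.Tensor) -> torch.Tensor:
    """Sketch of a truncated, signed square-root activation (assumed form).

    Identity on [-1, 1]; sign(x) * (2*sqrt(|x|) - 1) outside, so the value
    and the gradient (1/sqrt(|x|), which equals 1 at |x| = 1) stay continuous
    and positive. The paper's exact definition may differ.
    """
    a = torch.abs(x)
    # Clamp inside the sqrt so the unselected branch cannot produce NaN gradients.
    outer = torch.sign(x) * (2.0 * torch.sqrt(torch.clamp(a, min=1.0)) - 1.0)
    return torch.where(a <= 1.0, x, outer)

# Example: gradients stay bounded even for large inputs.
x = torch.tensor([-100.0, -1.0, 0.0, 1.0, 100.0], requires_grad=True)
tssr(x).sum().backward()
print(x.grad)  # slopes shrink like 1/sqrt(|x|) for |x| > 1
```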
Related papers
- Linear Oscillation: A Novel Activation Function for Vision Transformer [0.0]
We present the Linear Oscillation (LoC) activation function, defined as $f(x) = x \times \sin(\alpha x + \beta)$.
Distinct from conventional activation functions which primarily introduce non-linearity, LoC seamlessly blends linear trajectories with oscillatory deviations.
Our empirical studies reveal that, when integrated into diverse neural architectures, the LoC activation function consistently outperforms established counterparts like ReLU and Sigmoid.
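A minimal sketch of the stated form $f(x) = x \times \sin(\alpha x + \beta)$; the defaults for alpha and beta below are illustrative, not values from the paper.

```python
import torch

def linear_oscillation(x: torch.Tensor, alpha: float = 1.0, beta: float = 0.0) -> torch.Tensor:
    """Linear Oscillation (LoC) activation: x * sin(alpha * x + beta).

    alpha and beta are illustrative defaults, not values taken from the paper.
    """
    return x * torch.sin(alpha * x + beta)
```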
arXiv Detail & Related papers (2023-08-25T20:59:51Z)
- STL: A Signed and Truncated Logarithm Activation Function for Neural Networks [5.9622541907827875]
Activation functions play an essential role in neural networks.
We present a novel signed and truncated logarithm function as an activation function.
The suggested activation function can be applied in a large range of neural networks.
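The abstract does not give the formula. Purely as an illustration, a signed and truncated logarithm in the same spirit as TSSR could keep the identity near zero and switch to a signed logarithm beyond |x| = 1 so that the value and slope stay continuous; the form below is an assumption, not the paper's definition.

```python
import torch

def stl(x: torch.Tensor) -> torch.Tensor:
    """Illustrative signed, truncated logarithm (assumed form, not from the paper).

    Identity on [-1, 1]; sign(x) * (log(|x|) + 1) outside, so the value and
    the gradient (1/|x|, which equals 1 at |x| = 1) remain continuous.
    """
    a = torch.abs(x)
    # Clamp inside the log so the unselected branch cannot produce NaN gradients.
    outer = torch.sign(x) * (torch.log(torch.clamp(a, min=1.0)) + 1.0)
    return torch.where(a <= 1.0, x, outer)
```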
arXiv Detail & Related papers (2023-07-31T03:41:14Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Versatile Neural Processes for Learning Implicit Neural Representations [57.090658265140384]
We propose Versatile Neural Processes (VNP), which largely increases the capability of approximating functions.
Specifically, we introduce a bottleneck encoder that produces fewer, more informative context tokens, relieving the high computational cost.
We demonstrate the effectiveness of the proposed VNP on a variety of tasks involving 1D, 2D and 3D signals.
arXiv Detail & Related papers (2023-01-21T04:08:46Z)
- Evaluating CNN with Oscillatory Activation Function [0.0]
The key to CNNs' capability to learn high-dimensional complex features from images is the non-linearity introduced by the activation function.
This paper explores the performance of the AlexNet CNN architecture on the MNIST and CIFAR10 datasets using the oscillatory activation function (GCU) and other commonly used activation functions such as ReLU, PReLU, and Mish.
arXiv Detail & Related papers (2022-11-13T11:17:13Z)
- Nish: A Novel Negative Stimulated Hybrid Activation Function [5.482532589225552]
We propose a novel non-monotonic activation function called the Negative Stimulated Hybrid Activation Function (Nish).
It behaves like a Rectified Linear Unit (ReLU) function for values greater than zero, and a sinus-sigmoidal function for values less than zero.
The proposed function incorporates the sigmoid and sine wave, allowing new dynamics over traditional ReLU activations.
arXiv Detail & Related papers (2022-10-17T13:32:52Z)
- Transformers with Learnable Activation Functions [63.98696070245065]
We use the Rational Activation Function (RAF) to learn optimal activation functions during training according to the input data.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
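A minimal sketch of a learnable rational activation: the degree choices and the 1 + |Q(x)| safeguard in the denominator below are common conventions for rational activations and are assumptions here, not necessarily the paper's exact RAF parameterization.

```python
import torch
from torch import nn

class RationalActivation(nn.Module):
    """Learnable rational activation P(x) / (1 + |Q(x)|) (illustrative form).

    The coefficient counts, initialization, and denominator safeguard are
    assumptions; the paper's RAF may use different degrees and settings.
    """

    def __init__(self) -> None:
        super().__init__()
        self.p = nn.Parameter(torch.tensor([0.0, 1.0, 0.0, 0.0]))  # numerator coeffs, init near identity
        self.q = nn.Parameter(torch.tensor([0.0, 0.0]))            # denominator coeffs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        num = self.p[0] + self.p[1] * x + self.p[2] * x**2 + self.p[3] * x**3
        den = 1.0 + torch.abs(self.q[0] * x + self.q[1] * x**2)
        return num / den
```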
arXiv Detail & Related papers (2022-08-30T09:47:31Z)
- Growing Cosine Unit: A Novel Oscillatory Activation Function That Can Speedup Training and Reduce Parameters in Convolutional Neural Networks [0.1529342790344802]
Convolutional neural networks have been successful in solving many socially important and economically significant problems.
A key discovery that made training deep networks feasible was the adoption of the Rectified Linear Unit (ReLU) activation function.
The new activation function $C(z) = z \cos z$ outperforms Sigmoid, Swish, Mish and ReLU on a variety of architectures.
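A one-line sketch of the stated form $C(z) = z \cos z$:

```python
import torch

def gcu(z: torch.Tensor) -> torch.Tensor:
    """Growing Cosine Unit: z * cos(z), the oscillatory activation from the abstract."""
    return z * torch.cos(z)
```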
arXiv Detail & Related papers (2021-08-30T01:07:05Z)
- UNIPoint: Universally Approximating Point Processes Intensities [125.08205865536577]
We provide a proof that a class of learnable functions can universally approximate any valid intensity function.
We implement UNIPoint, a novel neural point process model, using recurrent neural networks to parameterise sums of basis functions upon each event.
arXiv Detail & Related papers (2020-07-28T09:31:56Z)
- Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$Nets are a new class of function approximators based on polynomial expansions.
$\Pi$Nets produce state-of-the-art results in three challenging tasks, i.e., image generation, face verification and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z)
- Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.