SMU: smooth activation function for deep networks using smoothing
maximum technique
- URL: http://arxiv.org/abs/2111.04682v1
- Date: Mon, 8 Nov 2021 17:54:08 GMT
- Title: SMU: smooth activation function for deep networks using smoothing
maximum technique
- Authors: Koushik Biswas, Sandeep Kumar, Shilpak Banerjee, Ashish Kumar Pandey
- Abstract summary: We propose a novel activation function based on a smooth approximation of known activation functions such as Leaky ReLU.
Replacing ReLU with SMU yields a 6.22% improvement on the CIFAR100 dataset with the ShuffleNet V2 model.
- Score: 1.5267236995686555
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning researchers have a keen interest in proposing new
activation functions that can boost network performance. A good choice of
activation function can significantly improve network performance. Handcrafted
activations are the most common choice in neural network models, and ReLU is
the most popular in the deep learning community due to its simplicity, though
it has some serious drawbacks. In this paper, we propose a novel activation
function based on a smooth approximation of known activation functions such as
Leaky ReLU, and we call this function the Smooth Maximum Unit (SMU). Replacing
ReLU with SMU, we obtain a 6.22% improvement on the CIFAR100 dataset with the
ShuffleNet V2 model.
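A brief illustration of the smoothing-maximum idea behind SMU, based on our reading of the abstract: using the identity max(x1, x2) = (x1 + x2 + |x1 - x2|)/2 together with the smooth approximation of |z| by z·erf(μz), the Leaky-ReLU-like function max(x, αx) becomes smooth at the origin. The minimal PyTorch sketch below implements this construction; the value of α, the initial μ, and the choice to make μ trainable are illustrative assumptions rather than the paper's exact settings.

```python
import torch
import torch.nn as nn

class SMUSketch(nn.Module):
    """Sketch of a Smooth Maximum Unit (SMU)-style activation.

    Uses max(x, a*x) = ((1 + a) * x + |(1 - a) * x|) / 2 and smooths the
    absolute value with |z| ~ z * erf(mu * z). Hyper-parameters here are
    assumptions for illustration, not the paper's reported settings.
    """

    def __init__(self, alpha: float = 0.25, mu_init: float = 1.0):
        super().__init__()
        self.alpha = alpha                             # Leaky ReLU negative slope
        self.mu = nn.Parameter(torch.tensor(mu_init))  # trainable smoothing parameter

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return 0.5 * ((1 + self.alpha) * x
                      + (1 - self.alpha) * x * torch.erf(self.mu * (1 - self.alpha) * x))

# Usage: a drop-in replacement for nn.ReLU, e.g. inside a ShuffleNet V2 block.
act = SMUSketch()
print(act(torch.linspace(-2.0, 2.0, 5)))
```

As μ grows, the function approaches Leaky ReLU; smaller μ gives a smoother transition around zero.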
Related papers
- ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse
LLMs [91.31204876440765]
We introduce a general method that defines neuron activation through neuron output magnitudes and a tailored magnitude threshold (a sketch of this idea appears after this list).
To find the most efficient activation function for sparse computation, we propose a systematic framework.
We conduct thorough experiments on LLMs utilizing different activation functions, including ReLU, SwiGLU, ReGLU, and ReLU$^2$.
arXiv Detail & Related papers (2024-02-06T08:45:51Z) - A Non-monotonic Smooth Activation Function [4.269446061678759]
Activation functions are crucial in deep learning models since they introduce non-linearity into the networks.
In this study, we propose a new activation function called Sqish, which is a non-monotonic and smooth function.
We show its superiority in classification, object detection, and segmentation tasks, as well as in adversarial robustness experiments.
arXiv Detail & Related papers (2023-10-16T07:09:47Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - APTx: better activation function than MISH, SWISH, and ReLU's variants
used in deep learning [0.0]
Activation functions introduce non-linearity in the deep neural networks.
In this paper, we propose an activation function named APTx, which behaves similarly to MISH but requires fewer mathematical operations to compute.
arXiv Detail & Related papers (2022-09-10T14:26:04Z) - SAU: Smooth activation function using convolution with approximate
identities [1.5267236995686555]
Well-known activation functions like ReLU or Leaky ReLU are non-differentiable at the origin.
We propose new smooth approximations of a non-differentiable activation function by convolving it with approximate identities (a numerical sketch of this idea appears after this list).
arXiv Detail & Related papers (2021-09-27T17:31:04Z) - MicroNet: Improving Image Recognition with Extremely Low FLOPs [82.54764264255505]
We find that two factors, sparse connectivity and dynamic activation functions, are effective in improving accuracy.
We present a new dynamic activation function, named Dynamic Shift Max, to improve the non-linearity.
We arrive at a family of networks, named MicroNet, that achieves significant performance gains over the state of the art in the low FLOP regime.
arXiv Detail & Related papers (2021-08-12T17:59:41Z) - CondenseNet V2: Sparse Feature Reactivation for Deep Networks [87.38447745642479]
Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency.
We propose an alternative approach named sparse feature reactivation (SFR), aiming to actively increase the utility of features for reuse.
Our experiments show that the proposed models achieve promising performance on image classification (ImageNet and CIFAR) and object detection (MS COCO) in terms of both theoretical efficiency and practical speed.
arXiv Detail & Related papers (2021-04-09T14:12:43Z) - Learning specialized activation functions with the Piecewise Linear Unit [7.820667552233989]
We propose a new activation function called Piecewise Linear Unit (PWLU), which incorporates a carefully designed formulation and learning method.
It can learn specialized activation functions and achieves SOTA performance on large-scale datasets like ImageNet and COCO.
PWLU is also easy to implement and efficient at inference, which can be widely applied in real-world applications.
arXiv Detail & Related papers (2021-04-08T11:29:11Z) - Comparisons among different stochastic selection of activation layers
for convolutional neural networks for healthcare [77.99636165307996]
We classify biomedical images using ensembles of neural networks.
We select our activations from among the following: ReLU, Leaky ReLU, Parametric ReLU, ELU, Adaptive Piecewise Linear Unit, S-Shaped ReLU, Swish, Mish, Mexican Linear Unit, Parametric Deformable Linear Unit, and Soft Root Sign.
arXiv Detail & Related papers (2020-11-24T01:53:39Z) - Dynamic ReLU [74.973224160508]
We propose dynamic ReLU (DY-ReLU), a dynamic rectifier whose parameters are generated by a hyper function over all input elements.
Compared to its static counterpart, DY-ReLU has negligible extra computational cost, but significantly more representation capability.
By simply using DY-ReLU for MobileNetV2, the top-1 accuracy on ImageNet classification is boosted from 72.0% to 76.2% with only 5% additional FLOPs.
arXiv Detail & Related papers (2020-03-22T23:45:35Z) - Evolutionary Optimization of Deep Learning Activation Functions [15.628118691027328]
We show that evolutionary algorithms can discover novel activation functions that outperform the Rectified Linear Unit (ReLU).
Replacing ReLU with evolved activation functions results in statistically significant increases in network accuracy.
These novel activation functions are shown to generalize, achieving high performance across tasks.
arXiv Detail & Related papers (2020-02-17T19:54:26Z)
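As referenced in the ReLU$^2$ entry above, the following is a minimal sketch of magnitude-threshold activation sparsity: a neuron whose output magnitude falls below a threshold is treated as inactive and its contribution is dropped. The threshold value and the squared-ReLU form used here are illustrative assumptions; the paper tailors the threshold so that sparse computation can skip the inactive neurons.

```python
import torch


def relu_squared(x: torch.Tensor) -> torch.Tensor:
    # Squared ReLU ("ReLU^2"): relu(x) ** 2.
    return torch.relu(x) ** 2


def magnitude_sparsify(hidden: torch.Tensor, threshold: float = 0.1):
    """Treat neurons with |output| below `threshold` as inactive.

    The fixed `threshold` is an illustrative assumption, not the paper's
    tailored per-model setting.
    """
    mask = hidden.abs() >= threshold              # active neurons
    sparsity = 1.0 - mask.float().mean().item()   # fraction of neurons skipped
    return hidden * mask, sparsity


h = relu_squared(torch.randn(4, 16))              # hypothetical FFN hidden states
sparse_h, sparsity = magnitude_sparsify(h)
print(f"activation sparsity: {sparsity:.2%}")
```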
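And as referenced in the SAU entry above, here is a numerical sketch of smoothing a non-differentiable activation by convolving it with an approximate identity, here a narrow Gaussian. The Leaky ReLU slope, the kernel width, and the discretization are illustrative assumptions; the paper derives closed-form approximations rather than computing the convolution numerically.

```python
import numpy as np


def leaky_relu(x: np.ndarray, alpha: float = 0.01) -> np.ndarray:
    return np.where(x >= 0, x, alpha * x)


def smooth_by_convolution(x: np.ndarray, alpha: float = 0.01,
                          sigma: float = 0.05, num: int = 801) -> np.ndarray:
    """Convolve Leaky ReLU with a narrow Gaussian (an approximate identity).

    As sigma -> 0 the result converges to Leaky ReLU itself, but for any
    sigma > 0 it is smooth at the origin. Parameters are illustrative.
    """
    t = np.linspace(-5.0 * sigma, 5.0 * sigma, num)    # integration grid
    kernel = np.exp(-t ** 2 / (2.0 * sigma ** 2))
    kernel /= kernel.sum()                             # normalize to unit mass
    # (f * phi)(x) = sum_i f(x - t_i) * phi(t_i)
    return np.array([np.sum(leaky_relu(xi - t, alpha) * kernel) for xi in x])


xs = np.linspace(-1.0, 1.0, 9)
print(np.round(smooth_by_convolution(xs), 4))
```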