A Non-monotonic Smooth Activation Function
- URL: http://arxiv.org/abs/2310.10126v1
- Date: Mon, 16 Oct 2023 07:09:47 GMT
- Title: A Non-monotonic Smooth Activation Function
- Authors: Koushik Biswas, Meghana Karri, Ulaş Bağcı
- Abstract summary: Activation functions are crucial in deep learning models since they introduce non-linearity into the networks.
In this study, we propose a new activation function called Sqish, which is a non-monotonic and smooth function.
We show its superiority in classification, object detection, and segmentation tasks, as well as in adversarial robustness experiments.
- Score: 4.269446061678759
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Activation functions are crucial in deep learning models since they introduce
non-linearity into the networks, enabling them to learn from errors and make
adjustments, which is essential for learning complex patterns. Their essential
purpose is to transform raw input signals into meaningful output activations,
promoting information transmission throughout the neural network. In this study,
we propose a new activation function called Sqish, which is non-monotonic and
smooth, as an alternative to existing ones. We demonstrate its superiority in
classification, object detection, segmentation, and adversarial robustness
experiments. With the ShuffleNet V2 model on the CIFAR100 dataset, Sqish yields
an 8.21% improvement over ReLU under the FGSM adversarial attack and a 5.87%
improvement over ReLU in image classification.
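The abstract does not spell out the closed form of Sqish, so the sketch below is only illustrative: it shows how a smooth, non-monotonic activation can be packaged as a PyTorch module and substituted for ReLU in a model such as ShuffleNet V2. The Swish-style x * sigmoid(beta * x) body, the trainable beta, and the replace_relu helper are stand-ins, not the paper's definition.

```python
# Illustrative sketch only: the exact Sqish formula is not given in the abstract,
# so a Swish/SiLU-style smooth, non-monotonic function is used as a stand-in.
import torch
import torch.nn as nn

class SmoothNonMonotonicActivation(nn.Module):
    """Smooth, non-monotonic activation with an optional trainable slope beta."""
    def __init__(self, beta: float = 1.0, trainable: bool = True):
        super().__init__()
        beta_t = torch.tensor(float(beta))
        self.beta = nn.Parameter(beta_t) if trainable else beta_t

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Non-monotonic: the output dips below zero for moderately negative x,
        # then approaches zero as x -> -inf; smooth everywhere.
        return x * torch.sigmoid(self.beta * x)

def replace_relu(module: nn.Module) -> None:
    """Recursively replace every nn.ReLU in a model with the activation above."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, SmoothNonMonotonicActivation())
        else:
            replace_relu(child)

# Example usage (hypothetical setup): swap activations before training.
# from torchvision.models import shufflenet_v2_x1_0
# model = shufflenet_v2_x1_0(num_classes=100)
# replace_relu(model)
```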
Related papers
- Activation function optimization method: Learnable series linear units (LSLUs) [12.089173508371246]
We propose a series-based learnable activation function called LSLU (Learnable Series Linear Units).
This method simplifies deep learning networks while improving accuracy.
We evaluate LSLU's performance on CIFAR10, CIFAR100, and specific task datasets (e.g., Silkworm).
arXiv Detail & Related papers (2024-08-28T11:12:27Z) - APALU: A Trainable, Adaptive Activation Function for Deep Learning Networks [0.0]
We introduce a novel trainable activation function, the adaptive piecewise approximated activation linear unit (APALU).
Experiments reveal significant improvements over widely used activation functions for different tasks.
APALU achieves 100% accuracy on a sign language recognition task with a limited dataset.
arXiv Detail & Related papers (2024-02-13T06:18:42Z) - Transformers with Learnable Activation Functions [63.98696070245065]
We use Rational Activation Function (RAF) to learn optimal activation functions during training according to input data.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
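The summary above describes learning the activation function itself during training; a common way to realize this is a rational (Padé-style) function, a ratio of polynomials with trainable coefficients. The degrees, initialization, and "safe" denominator below are assumptions for illustration rather than the configuration used in the paper.

```python
# Hedged sketch of a rational activation function: P(x) / Q(x) with learnable
# coefficients, where Q uses a 1 + |...| form so it can never reach zero.
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    def __init__(self, num_degree: int = 5, den_degree: int = 4):
        super().__init__()
        # Numerator coefficients a_0..a_m and denominator coefficients b_1..b_n.
        self.a = nn.Parameter(torch.randn(num_degree + 1) * 0.1)
        self.b = nn.Parameter(torch.randn(den_degree) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # P(x) = a_0 + a_1 x + ... + a_m x^m
        powers_p = torch.stack([x ** i for i in range(self.a.numel())], dim=-1)
        p = (powers_p * self.a).sum(dim=-1)
        # Q(x) = 1 + |b_1 x + ... + b_n x^n|  (always >= 1, so no poles)
        powers_q = torch.stack([x ** (i + 1) for i in range(self.b.numel())], dim=-1)
        q = 1.0 + (powers_q * self.b).sum(dim=-1).abs()
        return p / q
```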
arXiv Detail & Related papers (2022-08-30T09:47:31Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - Graph-adaptive Rectified Linear Unit for Graph Neural Networks [64.92221119723048]
Graph Neural Networks (GNNs) have achieved remarkable success by extending traditional convolution to learning on non-Euclidean data.
We propose the Graph-adaptive Rectified Linear Unit (GReLU), a new parametric activation function that incorporates neighborhood information in a novel and efficient way.
We conduct comprehensive experiments to show that our plug-and-play GReLU method is efficient and effective given different GNN backbones and various downstream tasks.
arXiv Detail & Related papers (2022-02-13T10:54:59Z) - SMU: smooth activation function for deep networks using smoothing
maximum technique [1.5267236995686555]
We propose a novel activation function based on a smooth approximation of known activation functions such as Leaky ReLU.
We obtain a 6.22% improvement on the CIFAR100 dataset with the ShuffleNet V2 model.
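The entry describes approximating Leaky ReLU via a smoothing-maximum technique. A minimal sketch of that idea follows, using the smooth approximation max(a, b) ~ ((a + b) + (a - b) * erf(mu * (a - b))) / 2 with a = x and b = alpha * x; whether alpha and mu are fixed or trainable, and their values, are assumptions here.

```python
# Sketch of a smoothing-maximum unit: a smooth surrogate for max(x, alpha*x),
# i.e. Leaky ReLU. Trainable alpha and mu are illustrative choices.
import torch
import torch.nn as nn

class SmoothMaximumUnit(nn.Module):
    def __init__(self, alpha: float = 0.25, mu: float = 1.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(float(alpha)))
        self.mu = nn.Parameter(torch.tensor(float(mu)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = x, self.alpha * x          # the two arms of Leaky ReLU
        diff = a - b
        # As mu -> infinity this approaches max(x, alpha*x); for finite mu
        # it is smooth everywhere.
        return 0.5 * ((a + b) + diff * torch.erf(self.mu * diff))
```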
arXiv Detail & Related papers (2021-11-08T17:54:08Z) - Data-Driven Learning of Feedforward Neural Networks with Different
Activation Functions [0.0]
This work contributes to the development of a new data-driven method (D-DM) for learning feedforward neural networks (FNNs).
arXiv Detail & Related papers (2021-07-04T18:20:27Z) - Learning specialized activation functions with the Piecewise Linear Unit [7.820667552233989]
We propose a new activation function called the Piecewise Linear Unit (PWLU), which incorporates a carefully designed formulation and learning method.
It can learn specialized activation functions and achieves SOTA performance on large-scale datasets like ImageNet and COCO.
PWLU is also easy to implement and efficient at inference, which can be widely applied in real-world applications.
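As a rough illustration of a learnable piecewise-linear activation of the kind described above, the sketch below splits a bounded input range into equal bins with trainable breakpoint heights and extends linearly outside the range. The paper's actual PWLU parameterization (per-channel pieces, learned boundaries, etc.) is not given in this summary, so all choices here are placeholders.

```python
# Generic learnable piecewise-linear activation: N equal bins on [left, right],
# trainable output heights at the N+1 breakpoints, linear extension outside.
import torch
import torch.nn as nn

class PiecewiseLinearActivation(nn.Module):
    def __init__(self, num_pieces: int = 16, left: float = -4.0, right: float = 4.0):
        super().__init__()
        self.left, self.right, self.n = left, right, num_pieces
        # Initialize the breakpoint heights to ReLU so training starts from a
        # familiar shape.
        xs = torch.linspace(left, right, num_pieces + 1)
        self.heights = nn.Parameter(torch.relu(xs))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        width = (self.right - self.left) / self.n
        # Index of the piece each input falls into, clamped to the valid range;
        # inputs outside [left, right] reuse the boundary pieces' slopes.
        idx = ((x - self.left) / width).floor().clamp(0, self.n - 1).long()
        x0 = self.left + idx * width                  # left breakpoint of the piece
        y0, y1 = self.heights[idx], self.heights[idx + 1]
        slope = (y1 - y0) / width
        return y0 + slope * (x - x0)
```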
arXiv Detail & Related papers (2021-04-08T11:29:11Z) - Learning Invariant Representations across Domains and Tasks [81.30046935430791]
We propose a novel Task Adaptation Network (TAN) to solve this unsupervised task transfer problem.
In addition to learning transferable features via domain-adversarial training, we propose a novel task semantic adaptor that uses the learning-to-learn strategy to adapt the task semantics.
TAN significantly increases the recall and F1 score by 5.0% and 7.8%, respectively, compared to recent strong baselines.
arXiv Detail & Related papers (2021-03-03T11:18:43Z) - Dynamic ReLU [74.973224160508]
We propose dynamic ReLU (DY-ReLU), a dynamic rectifier whose parameters are generated by a hyper function over all input elements.
Compared to its static counterpart, DY-ReLU has negligible extra computational cost, but significantly more representation capability.
By simply using DY-ReLU for MobileNetV2, the top-1 accuracy on ImageNet classification is boosted from 72.0% to 76.2% with only 5% additional FLOPs.
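A hedged sketch of the dynamic-activation idea described above: a small hyper function computes input-dependent slopes and intercepts for K linear pieces, and the activation takes their maximum. The K = 2 channel-shared form, the tanh squashing, and the coefficient ranges below are simplifications, not the exact DY-ReLU design.

```python
# Sketch of a dynamic ReLU-style activation: a hyper function (global pooling +
# small MLP) emits K slopes and intercepts per sample; output is the max piece.
import torch
import torch.nn as nn

class DynamicReLU(nn.Module):
    def __init__(self, channels: int, k: int = 2, reduction: int = 8):
        super().__init__()
        self.k = k
        hidden = max(channels // reduction, 4)
        self.hyper = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                  # squeeze to (N, C, 1, 1)
            nn.Flatten(),                             # -> (N, C)
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 2 * k),                 # k slopes + k intercepts
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        theta = self.hyper(x)                         # (N, 2k)
        a = 1.0 + theta[:, : self.k].tanh()           # slopes near 1 at init
        b = 0.5 * theta[:, self.k :].tanh()           # small intercepts
        a = a.view(-1, self.k, 1, 1, 1)
        b = b.view(-1, self.k, 1, 1, 1)
        pieces = a * x.unsqueeze(1) + b               # (N, k, C, H, W)
        return pieces.max(dim=1).values               # elementwise max over pieces
```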
arXiv Detail & Related papers (2020-03-22T23:45:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.