A Non-monotonic Smooth Activation Function
- URL: http://arxiv.org/abs/2310.10126v1
- Date: Mon, 16 Oct 2023 07:09:47 GMT
- Title: A Non-monotonic Smooth Activation Function
- Authors: Koushik Biswas, Meghana Karri, Ulaş Bağcı
- Abstract summary: Activation functions are crucial in deep learning models since they introduce non-linearity into the networks.
In this study, we propose a new activation function called Sqish, which is a non-monotonic and smooth function.
We show its superiority in classification, object detection, and segmentation tasks, as well as in adversarial robustness experiments.
- Score: 4.269446061678759
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Activation functions are crucial in deep learning models since they introduce
non-linearity into the networks, enabling them to learn from errors and make
adjustments, which is essential for learning complex patterns. Their essential
purpose is to transform raw input signals into meaningful output activations,
promoting information transmission throughout the neural network. In this study,
we propose a new activation function called Sqish, which is non-monotonic and
smooth, as an alternative to existing ones. We demonstrate its superiority in
classification, object detection, segmentation, and adversarial robustness
experiments. With the ShuffleNet V2 model on the CIFAR100 dataset, Sqish yields
an 8.21% improvement over ReLU under the FGSM adversarial attack and a 5.87%
improvement over ReLU in image classification.
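The abstract does not spell out the closed form of Sqish, so the sketch below is only illustrative: it shows how a smooth, non-monotonic activation can be packaged as a PyTorch module and substituted for ReLU in a model such as ShuffleNet V2. The Swish-style x * sigmoid(beta * x) body, the trainable beta, and the replace_relu helper are stand-ins, not the paper's definition.

```python
# Illustrative sketch only: the exact Sqish formula is not given in the abstract,
# so a Swish/SiLU-style smooth, non-monotonic function is used as a stand-in.
import torch
import torch.nn as nn

class SmoothNonMonotonicActivation(nn.Module):
    """Smooth, non-monotonic activation with an optional trainable slope beta."""
    def __init__(self, beta: float = 1.0, trainable: bool = True):
        super().__init__()
        beta_t = torch.tensor(float(beta))
        self.beta = nn.Parameter(beta_t) if trainable else beta_t

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Non-monotonic: the output dips below zero for moderately negative x,
        # then approaches zero as x -> -inf; smooth everywhere.
        return x * torch.sigmoid(self.beta * x)

def replace_relu(module: nn.Module) -> None:
    """Recursively replace every nn.ReLU in a model with the activation above."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, SmoothNonMonotonicActivation())
        else:
            replace_relu(child)

# Example usage (hypothetical setup): swap activations before training.
# from torchvision.models import shufflenet_v2_x1_0
# model = shufflenet_v2_x1_0(num_classes=100)
# replace_relu(model)
```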
Related papers
- Activation function optimization method: Learnable series linear units (LSLUs) [12.089173508371246]
We propose a series-based learnable activation function called LSLU (Learnable Series Linear Units).
This method simplifies deep learning networks while improving accuracy.
We evaluate LSLU's performance on CIFAR10, CIFAR100, and specific task datasets (e.g., Silkworm).
arXiv Detail & Related papers (2024-08-28T11:12:27Z) - APALU: A Trainable, Adaptive Activation Function for Deep Learning Networks [0.0]
We introduce a novel trainable activation function, the adaptive piecewise approximated activation linear unit (APALU).
Experiments reveal significant improvements over widely used activation functions for different tasks.
APALU achieves 100% accuracy on a sign language recognition task with a limited dataset.
arXiv Detail & Related papers (2024-02-13T06:18:42Z) - Transformers with Learnable Activation Functions [63.98696070245065]
We use Rational Activation Function (RAF) to learn optimal activation functions during training according to input data.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
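The summary above describes learning the activation function itself during training; a common way to realize this is a rational (Padé-style) function, a ratio of polynomials with trainable coefficients. The degrees, initialization, and "safe" denominator below are assumptions for illustration rather than the configuration used in the paper.

```python
# Hedged sketch of a rational activation function: P(x) / Q(x) with learnable
# coefficients, where Q uses a 1 + |...| form so it can never reach zero.
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    def __init__(self, num_degree: int = 5, den_degree: int = 4):
        super().__init__()
        # Numerator coefficients a_0..a_m and denominator coefficients b_1..b_n.
        self.a = nn.Parameter(torch.randn(num_degree + 1) * 0.1)
        self.b = nn.Parameter(torch.randn(den_degree) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # P(x) = a_0 + a_1 x + ... + a_m x^m
        powers_p = torch.stack([x ** i for i in range(self.a.numel())], dim=-1)
        p = (powers_p * self.a).sum(dim=-1)
        # Q(x) = 1 + |b_1 x + ... + b_n x^n|  (always >= 1, so no poles)
        powers_q = torch.stack([x ** (i + 1) for i in range(self.b.numel())], dim=-1)
        q = 1.0 + (powers_q * self.b).sum(dim=-1).abs()
        return p / q
```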
arXiv Detail & Related papers (2022-08-30T09:47:31Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - Graph-adaptive Rectified Linear Unit for Graph Neural Networks [64.92221119723048]
Graph Neural Networks (GNNs) have achieved remarkable success by extending traditional convolution to learning on non-Euclidean data.
We propose the Graph-adaptive Rectified Linear Unit (GReLU), a new parametric activation function that incorporates neighborhood information in a novel and efficient way.
We conduct comprehensive experiments to show that our plug-and-play GReLU method is efficient and effective given different GNN backbones and various downstream tasks.
arXiv Detail & Related papers (2022-02-13T10:54:59Z) - SMU: smooth activation function for deep networks using smoothing
maximum technique [1.5267236995686555]
We propose a novel activation function based on a smooth approximation of known activation functions such as Leaky ReLU.
We obtain a 6.22% improvement on the CIFAR100 dataset with the ShuffleNet V2 model.
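The entry describes approximating Leaky ReLU via a smoothing-maximum technique. A minimal sketch of that idea follows, using the smooth approximation max(a, b) ~ ((a + b) + (a - b) * erf(mu * (a - b))) / 2 with a = x and b = alpha * x; whether alpha and mu are fixed or trainable, and their values, are assumptions here.

```python
# Sketch of a smoothing-maximum unit: a smooth surrogate for max(x, alpha*x),
# i.e. Leaky ReLU. Trainable alpha and mu are illustrative choices.
import torch
import torch.nn as nn

class SmoothMaximumUnit(nn.Module):
    def __init__(self, alpha: float = 0.25, mu: float = 1.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(float(alpha)))
        self.mu = nn.Parameter(torch.tensor(float(mu)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = x, self.alpha * x          # the two arms of Leaky ReLU
        diff = a - b
        # As mu -> infinity this approaches max(x, alpha*x); for finite mu
        # it is smooth everywhere.
        return 0.5 * ((a + b) + diff * torch.erf(self.mu * diff))
```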
arXiv Detail & Related papers (2021-11-08T17:54:08Z) - Data-Driven Learning of Feedforward Neural Networks with Different
Activation Functions [0.0]
This work contributes to the development of a new data-driven method (D-DM) for learning feedforward neural networks (FNNs).
arXiv Detail & Related papers (2021-07-04T18:20:27Z) - Learning specialized activation functions with the Piecewise Linear Unit [7.820667552233989]
We propose a new activation function called the Piecewise Linear Unit (PWLU), which incorporates a carefully designed formulation and learning method.
It can learn specialized activation functions and achieves SOTA performance on large-scale datasets like ImageNet and COCO.
PWLU is also easy to implement and efficient at inference, which can be widely applied in real-world applications.
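As a rough illustration of a learnable piecewise-linear activation of the kind described above, the sketch below splits a bounded input range into equal bins with trainable breakpoint heights and extends linearly outside the range. The paper's actual PWLU parameterization (per-channel pieces, learned boundaries, etc.) is not given in this summary, so all choices here are placeholders.

```python
# Generic learnable piecewise-linear activation: N equal bins on [left, right],
# trainable output heights at the N+1 breakpoints, linear extension outside.
import torch
import torch.nn as nn

class PiecewiseLinearActivation(nn.Module):
    def __init__(self, num_pieces: int = 16, left: float = -4.0, right: float = 4.0):
        super().__init__()
        self.left, self.right, self.n = left, right, num_pieces
        # Initialize the breakpoint heights to ReLU so training starts from a
        # familiar shape.
        xs = torch.linspace(left, right, num_pieces + 1)
        self.heights = nn.Parameter(torch.relu(xs))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        width = (self.right - self.left) / self.n
        # Index of the piece each input falls into, clamped to the valid range;
        # inputs outside [left, right] reuse the boundary pieces' slopes.
        idx = ((x - self.left) / width).floor().clamp(0, self.n - 1).long()
        x0 = self.left + idx * width                  # left breakpoint of the piece
        y0, y1 = self.heights[idx], self.heights[idx + 1]
        slope = (y1 - y0) / width
        return y0 + slope * (x - x0)
```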
arXiv Detail & Related papers (2021-04-08T11:29:11Z) - Learning Invariant Representations across Domains and Tasks [81.30046935430791]
We propose a novel Task Adaptation Network (TAN) to solve this unsupervised task transfer problem.
In addition to learning transferable features via domain-adversarial training, we propose a novel task semantic adaptor that uses the learning-to-learn strategy to adapt the task semantics.
TAN significantly increases the recall and F1 score by 5.0% and 7.8%, respectively, compared to recent strong baselines.
arXiv Detail & Related papers (2021-03-03T11:18:43Z) - Dynamic ReLU [74.973224160508]
We propose dynamic ReLU (DY-ReLU), a dynamic rectifier whose parameters are generated by a hyper function over all input elements.
Compared to its static counterpart, DY-ReLU has negligible extra computational cost, but significantly more representation capability.
By simply using DY-ReLU for MobileNetV2, the top-1 accuracy on ImageNet classification is boosted from 72.0% to 76.2% with only 5% additional FLOPs.
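A hedged sketch of the dynamic-activation idea described above: a small hyper function computes input-dependent slopes and intercepts for K linear pieces, and the activation takes their maximum. The K = 2 channel-shared form, the tanh squashing, and the coefficient ranges below are simplifications, not the exact DY-ReLU design.

```python
# Sketch of a dynamic ReLU-style activation: a hyper function (global pooling +
# small MLP) emits K slopes and intercepts per sample; output is the max piece.
import torch
import torch.nn as nn

class DynamicReLU(nn.Module):
    def __init__(self, channels: int, k: int = 2, reduction: int = 8):
        super().__init__()
        self.k = k
        hidden = max(channels // reduction, 4)
        self.hyper = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                  # squeeze to (N, C, 1, 1)
            nn.Flatten(),                             # -> (N, C)
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 2 * k),                 # k slopes + k intercepts
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        theta = self.hyper(x)                         # (N, 2k)
        a = 1.0 + theta[:, : self.k].tanh()           # slopes near 1 at init
        b = 0.5 * theta[:, self.k :].tanh()           # small intercepts
        a = a.view(-1, self.k, 1, 1, 1)
        b = b.view(-1, self.k, 1, 1, 1)
        pieces = a * x.unsqueeze(1) + b               # (N, k, C, H, W)
        return pieces.max(dim=1).values               # elementwise max over pieces
```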
arXiv Detail & Related papers (2020-03-22T23:45:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.