APALU: A Trainable, Adaptive Activation Function for Deep Learning
Networks
- URL: http://arxiv.org/abs/2402.08244v1
- Date: Tue, 13 Feb 2024 06:18:42 GMT
- Title: APALU: A Trainable, Adaptive Activation Function for Deep Learning
Networks
- Authors: Barathi Subramanian, Rathinaraja Jeyaraj, Rakhmonov Akhrorjon
Akhmadjon Ugli, and Jeonghong Kim
- Abstract summary: We introduce a novel trainable activation function, adaptive piecewise approximated activation linear unit (APALU)
Experiments reveal significant improvements over widely used activation functions for different tasks.
APALU achieves 100% accuracy on a sign language recognition task with a limited dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The activation function is a pivotal component of deep learning, facilitating the
extraction of intricate data patterns. While classical activation functions
like ReLU and its variants are extensively utilized, their static nature and
simplicity, despite being advantageous, often limit their effectiveness in
specialized tasks. Trainable activation functions also sometimes struggle to
adapt to the unique characteristics of the data. Addressing these
limitations, we introduce a novel trainable activation function, adaptive
piecewise approximated activation linear unit (APALU), to enhance the learning
performance of deep learning across a broad range of tasks. It presents a
unique set of features that enable it to maintain stability and efficiency in
the learning process while adapting to complex data representations.
Experiments reveal significant improvements over widely used activation
functions for different tasks. In image classification, APALU increases
MobileNet and GoogLeNet accuracy by 0.37% and 0.04%, respectively, on the
CIFAR10 dataset. In anomaly detection, it improves the average area under the
curve of one-class Deep SVDD by 0.8% on the MNIST dataset, and yields 1.81% and
1.11% improvements with DifferNet and knowledge distillation, respectively, on
the MVTec dataset. Notably, APALU achieves 100% accuracy on a sign language
recognition task with a limited dataset. For regression tasks, APALU enhances
the performance of deep neural networks and recurrent neural networks on
different datasets. These improvements highlight the robustness and
adaptability of APALU across diverse deep-learning applications.
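The abstract does not reproduce APALU's closed-form expression, so the snippet below is only a minimal PyTorch sketch of what a trainable, adaptive activation of this kind can look like: a smooth, ReLU-like nonlinearity whose shape is controlled by learnable parameters. The parameter names `a` and `b` and the softplus-based form are illustrative assumptions, not the authors' formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TrainableAdaptiveActivation(nn.Module):
    """Illustrative trainable activation; NOT the APALU formula from the paper.

    A smooth approximation of ReLU (softplus) is scaled and mixed with a
    linear term through two learnable parameters, so the shape of the
    nonlinearity can adapt during training.
    """

    def __init__(self, init_a: float = 1.0, init_b: float = 0.0):
        super().__init__()
        # Learnable shape parameters, updated by the optimizer like any weight.
        self.a = nn.Parameter(torch.tensor(init_a))
        self.b = nn.Parameter(torch.tensor(init_b))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # a scales a smooth ReLU-like branch; b adds an adaptive linear leak.
        return self.a * F.softplus(x) + self.b * x


# Usage: drop it into a model in place of a fixed nn.ReLU().
if __name__ == "__main__":
    model = nn.Sequential(
        nn.Linear(8, 16),
        TrainableAdaptiveActivation(),
        nn.Linear(16, 1),
    )
    out = model(torch.randn(4, 8))
    print(out.shape)  # torch.Size([4, 1])
```

Because the shape parameters live in the module, they are trained jointly with the network weights, which is the general mechanism behind trainable activations such as APALU.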
Related papers
- Activation function optimization method: Learnable series linear units (LSLUs) [12.089173508371246]
We propose a series-based learnable activation function called LSLU (Learnable Series Linear Units).
This method simplifies deep learning networks while improving accuracy.
We evaluate LSLU's performance on CIFAR10, CIFAR100, and specific task datasets (e.g., Silkworm).
arXiv Detail & Related papers (2024-08-28T11:12:27Z) - BAL: Balancing Diversity and Novelty for Active Learning [53.289700543331925]
We introduce a novel framework, Balancing Active Learning (BAL), which constructs adaptive sub-pools to balance diverse and uncertain data.
Our approach outperforms all established active learning methods on widely recognized benchmarks by 1.20%.
arXiv Detail & Related papers (2023-12-26T08:14:46Z) - A Non-monotonic Smooth Activation Function [4.269446061678759]
Activation functions are crucial in deep learning models since they introduce non-linearity into the networks.
In this study, we propose a new activation function called Sqish, which is a non-monotonic and smooth function.
We show its superiority in classification, object detection, and segmentation tasks, as well as in adversarial robustness experiments.
arXiv Detail & Related papers (2023-10-16T07:09:47Z) - Learning Objective-Specific Active Learning Strategies with Attentive
Neural Processes [72.75421975804132]
Learning Active Learning (LAL) suggests learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z) - Robust Learning with Progressive Data Expansion Against Spurious
Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z) - GELU Activation Function in Deep Learning: A Comprehensive Mathematical
Analysis and Performance [2.458437232470188]
We investigate the differentiability, boundedness, stationarity, and smoothness properties of the GELU activation function.
Our results demonstrate the superior performance of GELU compared to other activation functions (the standard GELU definition is recalled below, after this list).
arXiv Detail & Related papers (2023-05-20T03:22:43Z) - Transformers with Learnable Activation Functions [63.98696070245065]
We use Rational Activation Function (RAF) to learn optimal activation functions during training according to input data.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
arXiv Detail & Related papers (2022-08-30T09:47:31Z) - Learning Bayesian Sparse Networks with Full Experience Replay for
Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - Learning specialized activation functions with the Piecewise Linear Unit [7.820667552233989]
We propose a new activation function called Piecewise Linear Unit (PWLU), which incorporates a carefully designed formulation and learning method.
It can learn specialized activation functions and achieves SOTA performance on large-scale datasets like ImageNet and COCO.
PWLU is also easy to implement and efficient at inference, which can be widely applied in real-world applications.
arXiv Detail & Related papers (2021-04-08T11:29:11Z) - Efficient Feature Transformations for Discriminative and Generative
Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning.
These provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
arXiv Detail & Related papers (2021-03-25T01:48:14Z) - Discovering Parametric Activation Functions [17.369163074697475]
This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance.
Experiments with four different neural network architectures on the CIFAR-10 and CIFAR-100 image classification datasets show that this approach is effective.
arXiv Detail & Related papers (2020-06-05T00:25:33Z)
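For reference, the GELU analyzed in the entry above has a standard, widely known closed form; it is stated here for convenience and is not drawn from that paper's abstract:

```latex
% Exact definition: GELU weights the input by the standard normal CDF \Phi.
\mathrm{GELU}(x) = x\,\Phi(x) = \frac{x}{2}\left(1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right)

% Widely used tanh-based approximation.
\mathrm{GELU}(x) \approx \frac{x}{2}\left(1 + \tanh\left(\sqrt{\frac{2}{\pi}}\left(x + 0.044715\,x^{3}\right)\right)\right)
```

The smoothness and differentiability properties analyzed in that entry follow directly from this form, since both erf and tanh are infinitely differentiable.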