APALU: A Trainable, Adaptive Activation Function for Deep Learning
Networks
- URL: http://arxiv.org/abs/2402.08244v1
- Date: Tue, 13 Feb 2024 06:18:42 GMT
- Title: APALU: A Trainable, Adaptive Activation Function for Deep Learning
Networks
- Authors: Barathi Subramanian, Rathinaraja Jeyaraj, Rakhmonov Akhrorjon
Akhmadjon Ugli, and Jeonghong Kim
- Abstract summary: We introduce a novel trainable activation function, adaptive piecewise approximated activation linear unit (APALU)
Experiments reveal significant improvements over widely used activation functions for different tasks.
APALU achieves 100% accuracy on a sign language recognition task with a limited dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The activation function is a pivotal component of deep learning, facilitating the
extraction of intricate data patterns. While classical activation functions
like ReLU and its variants are extensively utilized, their static nature and
simplicity, despite being advantageous, often limit their effectiveness in
specialized tasks. Trainable activation functions also sometimes struggle to
adapt to the unique characteristics of the data. Addressing these
limitations, we introduce a novel trainable activation function, adaptive
piecewise approximated activation linear unit (APALU), to enhance the learning
performance of deep learning across a broad range of tasks. It presents a
unique set of features that enable it to maintain stability and efficiency in
the learning process while adapting to complex data representations.
Experiments reveal significant improvements over widely used activation
functions for different tasks. In image classification, APALU increases
MobileNet and GoogLeNet accuracy by 0.37% and 0.04%, respectively, on the
CIFAR10 dataset. In anomaly detection, it improves the average area under the
curve of one-class Deep SVDD by 0.8% on the MNIST dataset, and yields 1.81% and
1.11% improvements with DifferNet and knowledge distillation, respectively, on
the MVTec dataset. Notably, APALU achieves 100% accuracy on a sign language
recognition task with a limited dataset. For regression tasks, APALU enhances
the performance of deep neural networks and recurrent neural networks on
different datasets. These improvements highlight the robustness and
adaptability of APALU across diverse deep-learning applications.
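The abstract does not reproduce APALU's closed-form expression, so the snippet below is only a minimal PyTorch sketch of what a trainable, adaptive activation of this kind can look like: a smooth, ReLU-like nonlinearity whose shape is controlled by learnable parameters. The parameter names `a` and `b` and the softplus-based form are illustrative assumptions, not the authors' formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TrainableAdaptiveActivation(nn.Module):
    """Illustrative trainable activation; NOT the APALU formula from the paper.

    A smooth approximation of ReLU (softplus) is scaled and mixed with a
    linear term through two learnable parameters, so the shape of the
    nonlinearity can adapt during training.
    """

    def __init__(self, init_a: float = 1.0, init_b: float = 0.0):
        super().__init__()
        # Learnable shape parameters, updated by the optimizer like any weight.
        self.a = nn.Parameter(torch.tensor(init_a))
        self.b = nn.Parameter(torch.tensor(init_b))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # a scales a smooth ReLU-like branch; b adds an adaptive linear leak.
        return self.a * F.softplus(x) + self.b * x


# Usage: drop it into a model in place of a fixed nn.ReLU().
if __name__ == "__main__":
    model = nn.Sequential(
        nn.Linear(8, 16),
        TrainableAdaptiveActivation(),
        nn.Linear(16, 1),
    )
    out = model(torch.randn(4, 8))
    print(out.shape)  # torch.Size([4, 1])
```

Because the shape parameters live in the module, they are trained jointly with the network weights, which is the general mechanism behind trainable activations such as APALU.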
Related papers
- Activation function optimization method: Learnable series linear units (LSLUs) [12.089173508371246]
We propose a series-based learnable activation function called LSLU (Learnable Series Linear Units).
This method simplifies deep learning networks while improving accuracy.
We evaluate LSLU's performance on CIFAR10, CIFAR100, and specific task datasets (e.g., Silkworm).
arXiv Detail & Related papers (2024-08-28T11:12:27Z) - BAL: Balancing Diversity and Novelty for Active Learning [53.289700543331925]
We introduce a novel framework, Balancing Active Learning (BAL), which constructs adaptive sub-pools to balance diverse and uncertain data.
Our approach outperforms all established active learning methods on widely recognized benchmarks by 1.20%.
arXiv Detail & Related papers (2023-12-26T08:14:46Z) - A Non-monotonic Smooth Activation Function [4.269446061678759]
Activation functions are crucial in deep learning models since they introduce non-linearity into the networks.
In this study, we propose a new activation function called Sqish, which is a non-monotonic and smooth function.
We show its superiority in classification, object detection, and segmentation tasks, as well as in adversarial robustness experiments.
arXiv Detail & Related papers (2023-10-16T07:09:47Z) - Learning Objective-Specific Active Learning Strategies with Attentive
Neural Processes [72.75421975804132]
Learning Active Learning (LAL) suggests learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z) - Robust Learning with Progressive Data Expansion Against Spurious
Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z) - GELU Activation Function in Deep Learning: A Comprehensive Mathematical
Analysis and Performance [2.458437232470188]
We investigate the differentiability, boundedness, stationarity, and smoothness properties of the GELU activation function.
Our results demonstrate the superior performance of GELU compared to other activation functions (the standard GELU definition is recalled below, after this list).
arXiv Detail & Related papers (2023-05-20T03:22:43Z) - Transformers with Learnable Activation Functions [63.98696070245065]
We use Rational Activation Function (RAF) to learn optimal activation functions during training according to input data.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
arXiv Detail & Related papers (2022-08-30T09:47:31Z) - Learning Bayesian Sparse Networks with Full Experience Replay for
Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - Learning specialized activation functions with the Piecewise Linear Unit [7.820667552233989]
We propose a new activation function called Piecewise Linear Unit (PWLU), which incorporates a carefully designed formulation and learning method.
It can learn specialized activation functions and achieves SOTA performance on large-scale datasets like ImageNet and COCO.
PWLU is also easy to implement and efficient at inference, which can be widely applied in real-world applications.
arXiv Detail & Related papers (2021-04-08T11:29:11Z) - Efficient Feature Transformations for Discriminative and Generative
Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning.
These provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
arXiv Detail & Related papers (2021-03-25T01:48:14Z) - Discovering Parametric Activation Functions [17.369163074697475]
This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance.
Experiments with four different neural network architectures on the CIFAR-10 and CIFAR-100 image classification datasets show that this approach is effective.
arXiv Detail & Related papers (2020-06-05T00:25:33Z)
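For reference, the GELU analyzed in the entry above has a standard, widely known closed form; it is stated here for convenience and is not drawn from that paper's abstract:

```latex
% Exact definition: GELU weights the input by the standard normal CDF \Phi.
\mathrm{GELU}(x) = x\,\Phi(x) = \frac{x}{2}\left(1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right)

% Widely used tanh-based approximation.
\mathrm{GELU}(x) \approx \frac{x}{2}\left(1 + \tanh\left(\sqrt{\frac{2}{\pi}}\left(x + 0.044715\,x^{3}\right)\right)\right)
```

The smoothness and differentiability properties analyzed in that entry follow directly from this form, since both erf and tanh are infinitely differentiable.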