ALReLU: A different approach on Leaky ReLU activation function to
improve Neural Networks Performance
- URL: http://arxiv.org/abs/2012.07564v2
- Date: Thu, 29 Apr 2021 07:33:06 GMT
- Title: ALReLU: A different approach on Leaky ReLU activation function to
improve Neural Networks Performance
- Authors: Stamatis Mastromichalakis
- Abstract summary: The classical ReLU activation function (AF) has been extensively applied in Deep Neural Networks (DNN).
The common gradient issues of ReLU pose challenges in academic and industry applications.
The Absolute Leaky ReLU (ALReLU) AF, a variation of LReLU, is proposed as an alternative method to resolve the common 'dying ReLU problem'.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Despite the unresolved 'dying ReLU problem', the classical ReLU activation
function (AF) has been extensively applied in Deep Neural Networks (DNN), in
particular Convolutional Neural Networks (CNN), for image classification. The
common gradient issues of ReLU pose challenges in academic and industry
applications. Recent approaches to improvement follow a similar direction,
merely proposing variations of the AF, such as Leaky ReLU (LReLU), while
remaining subject to the same unresolved gradient problems. In this paper, the
Absolute Leaky ReLU (ALReLU) AF, a variation of LReLU, is proposed as an
alternative method to resolve the common 'dying ReLU problem' in NN-based
algorithms for supervised learning. The experimental results demonstrate that
using the absolute value of LReLU's small negative gradient yields a
significant improvement over LReLU and ReLU on image classification of diseases
such as COVID-19, as well as on text and tabular data classification tasks
across five different datasets.
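A minimal NumPy sketch of the ALReLU idea described above, for illustration only. The slope value alpha = 0.01 is an assumption (the conventional LReLU slope); the abstract itself does not state the value used in the paper.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Standard Leaky ReLU: a small negative slope for x < 0.
    return np.where(x >= 0, x, alpha * x)

def alrelu(x, alpha=0.01):
    # Absolute Leaky ReLU as sketched in the abstract: the small negative
    # branch of LReLU is replaced by its absolute value, so negative inputs
    # yield small positive outputs instead of small negative ones.
    # alpha = 0.01 is an assumed value, not taken from the abstract.
    return np.where(x >= 0, x, np.abs(alpha * x))

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(leaky_relu(x))  # [-0.05 -0.01  0.    1.    5.  ]
print(alrelu(x))      # [ 0.05  0.01  0.    1.    5.  ]
```

Negative inputs thus still carry a small, non-zero signal and gradient, which is the mechanism the abstract points to for avoiding the 'dying ReLU problem'.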
Related papers
- Leaky ReLUs That Differ in Forward and Backward Pass Facilitate Activation Maximization in Deep Neural Networks [0.022344294014777957]
Activation Maximization (AM) strives to generate optimal input, revealing features that trigger high responses in trained deep neural networks.
We show that AM fails to produce optimal input for simple functions containing ReLUs or Leaky ReLUs.
We propose a solution based on using Leaky ReLUs with a high negative slope in the backward pass while keeping the original, usually zero, slope in the forward pass (a minimal illustrative sketch appears after this list).
arXiv Detail & Related papers (2024-10-22T12:38:39Z)
- SwishReLU: A Unified Approach to Activation Functions for Enhanced Deep Neural Networks Performance [1.2724528787590168]
ReLU, a commonly used activation function in deep neural networks, is prone to the issue of "Dying ReLU".
Several enhanced versions, such as ELU, SeLU, and Swish, have been introduced but are less commonly utilized.
This paper proposes SwishReLU, a novel activation function combining elements of ReLU and Swish.
arXiv Detail & Related papers (2024-07-11T07:14:34Z)
- Equidistribution-based training of Free Knot Splines and ReLU Neural Networks [0.0]
We show that the $L_2$ based approximation problem is ill-conditioned using shallow neural networks (NNs) with a rectified linear unit (ReLU) activation function.
We propose a two-level procedure for training the free knot spline (FKS) by first solving the nonlinear problem of finding the optimal knot locations.
We then determine the optimal weights and knots of the FKS by solving a nearly linear, well-conditioned problem.
arXiv Detail & Related papers (2024-07-02T10:51:36Z)
- ReLUs Are Sufficient for Learning Implicit Neural Representations [17.786058035763254]
We revisit the use of ReLU activation functions for learning implicit neural representations.
Inspired by second order B-spline wavelets, we incorporate a set of simple constraints into the ReLU neurons in each layer of a deep neural network (DNN).
We demonstrate that, contrary to popular belief, one can learn state-of-the-art INRs based on a DNN composed of only ReLU neurons.
arXiv Detail & Related papers (2024-06-04T17:51:08Z)
- Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
arXiv Detail & Related papers (2023-09-26T17:42:52Z)
- Optimal Sets and Solution Paths of ReLU Networks [56.40911684005949]
We develop an analytical framework to characterize the set of optimal ReLU networks.
We establish conditions for the solution paths of ReLU networks to be continuous, and develop sensitivity results for ReLU networks.
arXiv Detail & Related papers (2023-05-31T18:48:16Z)
- An Inexact Augmented Lagrangian Algorithm for Training Leaky ReLU Neural Network with Group Sparsity [13.27709100571336]
A leaky ReLU network with a group regularization term has been widely used in recent years.
We show that there is a lack of approaches to compute a stationary point deterministically.
We propose an inexact augmented Lagrangian algorithm for solving the new model.
arXiv Detail & Related papers (2022-05-11T11:53:15Z)
- Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions [47.276004075767176]
We develop software for convex optimization of two-layer neural networks with ReLU activation functions.
We show that convex gated ReLU models obtain data-dependent approximation bounds for the ReLU training problem.
arXiv Detail & Related papers (2022-02-02T23:50:53Z)
- Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
arXiv Detail & Related papers (2020-10-29T06:32:53Z)
- A ReLU Dense Layer to Improve the Performance of Neural Networks [40.2470651460466]
We propose ReDense as a simple and low complexity way to improve the performance of trained neural networks.
We experimentally show that ReDense can improve the training and testing performance of various neural network architectures.
arXiv Detail & Related papers (2020-10-22T11:56:01Z)
- Boosting Gradient for White-Box Adversarial Attacks [60.422511092730026]
We propose a universal adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient-based white-box attack algorithms.
Our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a part of them to update the misleading gradients.
arXiv Detail & Related papers (2020-10-21T02:13:26Z)
- Iterative Network for Image Super-Resolution [69.07361550998318]
Single image super-resolution (SISR) has been greatly revitalized by the recent development of convolutional neural networks (CNN).
This paper provides a new insight into the conventional SISR algorithm and proposes a substantially different approach relying on iterative optimization.
A novel iterative super-resolution network (ISRN) is proposed on top of the iterative optimization.
arXiv Detail & Related papers (2020-05-20T11:11:47Z)
- Dynamic ReLU [74.973224160508]
We propose dynamic ReLU (DY-ReLU), a dynamic rectifier whose parameters are generated by a hyper function over all input elements.
Compared to its static counterpart, DY-ReLU has negligible extra computational cost, but significantly more representation capability.
By simply using DY-ReLU for MobileNetV2, the top-1 accuracy on ImageNet classification is boosted from 72.0% to 76.2% with only 5% additional FLOPs.
arXiv Detail & Related papers (2020-03-22T23:45:35Z)
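Returning to the first related paper above (Leaky ReLUs that differ in forward and backward pass), the following is a minimal PyTorch sketch of that idea: the forward pass keeps a zero (ReLU-like) negative slope while the backward pass uses a larger one. The slope values 0.0 and 0.3 are illustrative assumptions, not values reported in that paper.

```python
import torch

class AsymmetricLeakyReLU(torch.autograd.Function):
    # Leaky ReLU with different negative slopes in the forward and backward
    # pass: forward_slope is applied when computing activations, while
    # backward_slope replaces it when gradients are propagated, so negative
    # inputs still receive a useful gradient signal.

    @staticmethod
    def forward(ctx, x, forward_slope=0.0, backward_slope=0.3):
        ctx.save_for_backward(x)
        ctx.backward_slope = backward_slope
        return torch.where(x >= 0, x, forward_slope * x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        slope = torch.where(x >= 0, torch.ones_like(x),
                            torch.full_like(x, ctx.backward_slope))
        # The two slope arguments are constants, so they get no gradient.
        return grad_output * slope, None, None

x = torch.tensor([-2.0, -0.5, 1.0], requires_grad=True)
y = AsymmetricLeakyReLU.apply(x, 0.0, 0.3)
y.sum().backward()
print(y)       # forward: negative inputs are zeroed, as in plain ReLU
print(x.grad)  # backward: negative inputs receive slope 0.3 instead of 0
```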
This list is automatically generated from the titles and abstracts of the papers in this site.