Activations Through Extensions: A Framework To Boost Performance Of Neural Networks
- URL: http://arxiv.org/abs/2408.03599v2
- Date: Fri, 16 Aug 2024 01:19:04 GMT
- Title: Activations Through Extensions: A Framework To Boost Performance Of Neural Networks
- Authors: Chandramouli Kamanchi, Sumanta Mukherjee, Kameshwaran Sampath, Pankaj Dayama, Arindam Jati, Vijay Ekambaram, Dzung Phan
- Abstract summary: Activation functions are non-linearities in neural networks that allow them to learn complex mappings between inputs and outputs.
We propose a framework that unifies several works on activation functions and theoretically explains the performance benefits of these works.
- Score: 6.302159507265204
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Activation functions are non-linearities in neural networks that allow them to learn complex mappings between inputs and outputs. Typical choices for activation functions are ReLU, Tanh, Sigmoid, etc., where the choice generally depends on the application domain. In this work, we propose a framework/strategy that unifies several works on activation functions and theoretically explains the performance benefits of these works. We also propose novel techniques that originate from the framework and allow us to obtain "extensions" (i.e. special generalizations of a given neural network) of neural networks through operations on activation functions. We theoretically and empirically show that "extensions" of neural networks have performance benefits compared to vanilla neural networks, with insignificant space and time complexity costs, on standard test functions. We also show the benefits of neural network "extensions" in the time-series domain on real-world datasets.
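The abstract does not spell out the exact operations used to build an "extension," so the following is only a minimal, hedged sketch of one plausible reading: replace a base activation with a learnable generalization that reproduces the original network at initialization. The parameterization a*phi(x) + b*phi(-x) and the PyTorch module below are illustrative assumptions, not the paper's construction.

```python
# Illustrative sketch only: one way to "extend" a network by generalizing its
# activation with extra learnable parameters, initialized so that the extended
# model reproduces the original network exactly. The parameterization below is
# an assumption for illustration, not the paper's exact construction.
import torch
import torch.nn as nn


class ExtendedActivation(nn.Module):
    """Learnable generalization of a base activation phi.

    Computes a * phi(x) + b * phi(-x); with (a, b) = (1, 0) it reduces to the
    original activation, so the extended network contains the vanilla network
    as a special case.
    """

    def __init__(self, base=torch.relu):
        super().__init__()
        self.base = base
        self.a = nn.Parameter(torch.tensor(1.0))
        self.b = nn.Parameter(torch.tensor(0.0))

    def forward(self, x):
        return self.a * self.base(x) + self.b * self.base(-x)


# Drop-in replacement inside an ordinary MLP.
net = nn.Sequential(
    nn.Linear(8, 32),
    ExtendedActivation(),
    nn.Linear(32, 1),
)
print(net(torch.randn(4, 8)).shape)  # torch.Size([4, 1])
```

In this reading, only two extra scalars are added per activation, which is consistent with the abstract's claim of insignificant space and time complexity costs.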
Related papers
- TSSR: A Truncated and Signed Square Root Activation Function for Neural Networks [5.9622541907827875]
We introduce a new activation function called the Truncated and Signed Square Root (TSSR) function.
This function is distinctive because it is odd, nonlinear, monotone and differentiable.
It has the potential to improve the numerical stability of neural networks.
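The summary lists the function's properties but not its formula; below is a small NumPy sketch of a hypothetical signed-square-root activation with those properties (odd, nonlinear, monotone, differentiable). It is a stand-in for illustration only, not the TSSR definition from the paper.

```python
# Hypothetical stand-in with the stated properties; the actual TSSR formula
# is defined in the cited paper.
import numpy as np

def signed_sqrt(x):
    """sign(x) * (sqrt(|x| + 1) - 1): odd, strictly increasing, and
    differentiable at 0 (both one-sided derivatives equal 1/2)."""
    return np.sign(x) * (np.sqrt(np.abs(x) + 1.0) - 1.0)

x = np.linspace(-4.0, 4.0, 9)
print(signed_sqrt(x))                                 # grows sub-linearly for large |x|
print(np.allclose(signed_sqrt(-x), -signed_sqrt(x)))  # True: odd symmetry
```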
arXiv Detail & Related papers (2023-08-09T09:40:34Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Properties and Potential Applications of Random Functional-Linked Types of Neural Networks [81.56822938033119]
Random functional-linked neural networks (RFLNNs) offer an alternative way of learning in deep structure.
This paper gives some insights into the properties of RFLNNs from the viewpoint of the frequency domain.
We propose a method to generate a BLS network with better performance, and design an efficient algorithm for solving Poisson's equation.
arXiv Detail & Related papers (2023-04-03T13:25:22Z)
- TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization [69.80141512683254]
We introduce Tabular Neural Gradient Orthogonalization and Specialization (TANGOS).
TANGOS is a novel framework for regularization in the tabular setting built on latent unit attributions.
We demonstrate that our approach can lead to improved out-of-sample generalization performance, outperforming other popular regularization methods.
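As a rough, hedged illustration of the idea of regularizing latent unit attributions (not the paper's exact objective), one can attribute each latent unit to the inputs via gradients and penalize overlap between the attribution vectors of different units:

```python
# Sketch of an orthogonalization-style penalty on latent unit attributions.
# The gradient-based attribution and the |cosine| penalty are assumptions for
# illustration; the paper's precise formulation may differ.
import torch
import torch.nn as nn


def orthogonalization_penalty(encoder, x):
    """Average |cosine similarity| between per-unit input attributions (diagonal zeroed)."""
    x = x.clone().requires_grad_(True)
    h = encoder(x)                                   # (batch, n_units) latent activations
    attributions = []
    for j in range(h.shape[1]):
        g = torch.autograd.grad(h[:, j].sum(), x, create_graph=True)[0]
        attributions.append(g.flatten(1))            # (batch, n_features)
    a = torch.stack(attributions, dim=1)             # (batch, n_units, n_features)
    a = torch.nn.functional.normalize(a, dim=-1)
    sim = torch.abs(a @ a.transpose(1, 2))           # pairwise |cosine| per sample
    off_diag = sim - torch.diag_embed(torch.diagonal(sim, dim1=1, dim2=2))
    return off_diag.mean()


encoder = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 8))
x = torch.randn(32, 10)
loss = orthogonalization_penalty(encoder, x)  # add to the task loss with a small weight
print(loss.item())
```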
arXiv Detail & Related papers (2023-03-09T18:57:13Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z)
- A survey on recently proposed activation functions for Deep Learning [0.0]
This survey discusses the main concepts of activation functions in neural networks.
It includes a brief introduction to deep neural networks, a summary of what activation functions are and how they are used in neural networks, their most common properties, the different types of activation functions, and some of the challenges, limitations, and alternative solutions associated with them.
arXiv Detail & Related papers (2022-04-06T16:21:52Z)
- Activation Functions in Artificial Neural Networks: A Systematic Overview [0.3553493344868413]
Activation functions shape the outputs of artificial neurons.
This paper provides an analytic yet up-to-date overview of popular activation functions and their properties.
arXiv Detail & Related papers (2021-01-25T08:55:26Z)
- Review and Comparison of Commonly Used Activation Functions for Deep Neural Networks [0.0]
It is critical to choose the most appropriate activation function in neural network computations.
This paper evaluates commonly used activation functions, such as Swish, ReLU, Sigmoid, and so forth.
arXiv Detail & Related papers (2020-10-15T11:09:34Z)
- Rational neural networks [3.4376560669160394]
We consider neural networks with rational activation functions.
We prove that rational neural networks approximate smooth functions more efficiently than ReLU networks, requiring exponentially smaller depth.
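A rational activation is a learnable ratio of polynomials; the sketch below uses type (3, 2). The degrees, the identity-map initialization, and the absolute values keeping the denominator away from zero are illustrative choices, not the paper's exact parameterization.

```python
# Minimal sketch of a rational activation: a learnable ratio of polynomials.
import torch
import torch.nn as nn


class RationalActivation(nn.Module):
    def __init__(self):
        super().__init__()
        self.p = nn.Parameter(torch.tensor([0.0, 1.0, 0.0, 0.0]))  # numerator coeffs (degree 3)
        self.q = nn.Parameter(torch.tensor([0.0, 0.0]))            # denominator coeffs (degree 2)

    def forward(self, x):
        num = self.p[0] + self.p[1] * x + self.p[2] * x**2 + self.p[3] * x**3
        den = 1.0 + torch.abs(self.q[0] * x) + torch.abs(self.q[1] * x**2)
        return num / den


act = RationalActivation()           # initialized here to the identity map
print(act(torch.linspace(-2, 2, 5)))
```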
arXiv Detail & Related papers (2020-04-04T10:36:11Z)
- Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
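The claim that a single unit with a non-monotonic activation can separate XOR is easy to verify numerically; the Gaussian bump below is a stand-in for illustration, not the ADA function defined in the paper.

```python
# A single linear unit followed by a bump-shaped (non-monotonic) activation
# separates XOR, which one neuron with a monotone activation cannot do.
# The Gaussian bump is a hypothetical stand-in, not the paper's ADA.
import numpy as np

def bump(s):
    return np.exp(-((s - 1.0) ** 2) / 0.1)   # peaks at s = 1

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

s = X @ np.array([1.0, 1.0])                 # pre-activation x1 + x2 -> 0, 1, 1, 2
pred = (bump(s) > 0.5).astype(int)
print(pred, (pred == y).all())               # [0 1 1 0] True
```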
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.