Activations Through Extensions: A Framework To Boost Performance Of Neural Networks
- URL: http://arxiv.org/abs/2408.03599v2
- Date: Fri, 16 Aug 2024 01:19:04 GMT
- Title: Activations Through Extensions: A Framework To Boost Performance Of Neural Networks
- Authors: Chandramouli Kamanchi, Sumanta Mukherjee, Kameshwaran Sampath, Pankaj Dayama, Arindam Jati, Vijay Ekambaram, Dzung Phan
- Abstract summary: Activation functions are non-linearities in neural networks that allow them to learn complex mappings between inputs and outputs.
We propose a framework that unifies several works on activation functions and theoretically explains the performance benefits of these works.
- Score: 6.302159507265204
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Activation functions are non-linearities in neural networks that allow them to learn complex mappings between inputs and outputs. Typical choices for activation functions are ReLU, Tanh, Sigmoid, etc., where the choice generally depends on the application domain. In this work, we propose a framework/strategy that unifies several works on activation functions and theoretically explains the performance benefits of these works. We also propose novel techniques that originate from the framework and allow us to obtain "extensions" (i.e. special generalizations of a given neural network) of neural networks through operations on activation functions. We theoretically and empirically show that "extensions" of neural networks have performance benefits compared to vanilla neural networks, with insignificant space and time complexity costs, on standard test functions. We also show the benefits of neural network "extensions" in the time-series domain on real-world datasets.
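The abstract does not spell out the exact operations used to build an "extension," so the following is only a minimal, hedged sketch of one plausible reading: replace a base activation with a learnable generalization that reproduces the original network at initialization. The parameterization a*phi(x) + b*phi(-x) and the PyTorch module below are illustrative assumptions, not the paper's construction.

```python
# Illustrative sketch only: one way to "extend" a network by generalizing its
# activation with extra learnable parameters, initialized so that the extended
# model reproduces the original network exactly. The parameterization below is
# an assumption for illustration, not the paper's exact construction.
import torch
import torch.nn as nn


class ExtendedActivation(nn.Module):
    """Learnable generalization of a base activation phi.

    Computes a * phi(x) + b * phi(-x); with (a, b) = (1, 0) it reduces to the
    original activation, so the extended network contains the vanilla network
    as a special case.
    """

    def __init__(self, base=torch.relu):
        super().__init__()
        self.base = base
        self.a = nn.Parameter(torch.tensor(1.0))
        self.b = nn.Parameter(torch.tensor(0.0))

    def forward(self, x):
        return self.a * self.base(x) + self.b * self.base(-x)


# Drop-in replacement inside an ordinary MLP.
net = nn.Sequential(
    nn.Linear(8, 32),
    ExtendedActivation(),
    nn.Linear(32, 1),
)
print(net(torch.randn(4, 8)).shape)  # torch.Size([4, 1])
```

In this reading, only two extra scalars are added per activation, which is consistent with the abstract's claim of insignificant space and time complexity costs.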
Related papers
- TSSR: A Truncated and Signed Square Root Activation Function for Neural Networks [5.9622541907827875]
We introduce a new activation function called the Truncated and Signed Square Root (TSSR) function.
This function is distinctive because it is odd, nonlinear, monotone and differentiable.
It has the potential to improve the numerical stability of neural networks.
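The summary lists the function's properties but not its formula; below is a small NumPy sketch of a hypothetical signed-square-root activation with those properties (odd, nonlinear, monotone, differentiable). It is a stand-in for illustration only, not the TSSR definition from the paper.

```python
# Hypothetical stand-in with the stated properties; the actual TSSR formula
# is defined in the cited paper.
import numpy as np

def signed_sqrt(x):
    """sign(x) * (sqrt(|x| + 1) - 1): odd, strictly increasing, and
    differentiable at 0 (both one-sided derivatives equal 1/2)."""
    return np.sign(x) * (np.sqrt(np.abs(x) + 1.0) - 1.0)

x = np.linspace(-4.0, 4.0, 9)
print(signed_sqrt(x))                                 # grows sub-linearly for large |x|
print(np.allclose(signed_sqrt(-x), -signed_sqrt(x)))  # True: odd symmetry
```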
arXiv Detail & Related papers (2023-08-09T09:40:34Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Properties and Potential Applications of Random Functional-Linked Types of Neural Networks [81.56822938033119]
Random functional-linked neural networks (RFLNNs) offer an alternative way of learning in deep structure.
This paper gives some insights into the properties of RFLNNs from the viewpoint of the frequency domain.
We propose a method to generate a BLS network with better performance, and design an efficient algorithm for solving Poisson's equation.
arXiv Detail & Related papers (2023-04-03T13:25:22Z)
- TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization [69.80141512683254]
We introduce Tabular Neural Gradient Orthogonalization and Specialization (TANGOS).
TANGOS is a novel framework for regularization in the tabular setting built on latent unit attributions.
We demonstrate that our approach can lead to improved out-of-sample generalization performance, outperforming other popular regularization methods.
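As a rough, hedged illustration of the idea of regularizing latent unit attributions (not the paper's exact objective), one can attribute each latent unit to the inputs via gradients and penalize overlap between the attribution vectors of different units:

```python
# Sketch of an orthogonalization-style penalty on latent unit attributions.
# The gradient-based attribution and the |cosine| penalty are assumptions for
# illustration; the paper's precise formulation may differ.
import torch
import torch.nn as nn


def orthogonalization_penalty(encoder, x):
    """Average |cosine similarity| between per-unit input attributions (diagonal zeroed)."""
    x = x.clone().requires_grad_(True)
    h = encoder(x)                                   # (batch, n_units) latent activations
    attributions = []
    for j in range(h.shape[1]):
        g = torch.autograd.grad(h[:, j].sum(), x, create_graph=True)[0]
        attributions.append(g.flatten(1))            # (batch, n_features)
    a = torch.stack(attributions, dim=1)             # (batch, n_units, n_features)
    a = torch.nn.functional.normalize(a, dim=-1)
    sim = torch.abs(a @ a.transpose(1, 2))           # pairwise |cosine| per sample
    off_diag = sim - torch.diag_embed(torch.diagonal(sim, dim1=1, dim2=2))
    return off_diag.mean()


encoder = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 8))
x = torch.randn(32, 10)
loss = orthogonalization_penalty(encoder, x)  # add to the task loss with a small weight
print(loss.item())
```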
arXiv Detail & Related papers (2023-03-09T18:57:13Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z)
- A survey on recently proposed activation functions for Deep Learning [0.0]
This survey discusses the main concepts of activation functions in neural networks.
It includes a brief introduction to deep neural networks, a summary of what activation functions are and how they are used in neural networks, their most common properties, the different types of activation functions, and some of the challenges, limitations, and alternative solutions associated with them.
arXiv Detail & Related papers (2022-04-06T16:21:52Z)
- Activation Functions in Artificial Neural Networks: A Systematic Overview [0.3553493344868413]
Activation functions shape the outputs of artificial neurons.
This paper provides an analytic yet up-to-date overview of popular activation functions and their properties.
arXiv Detail & Related papers (2021-01-25T08:55:26Z)
- Review and Comparison of Commonly Used Activation Functions for Deep Neural Networks [0.0]
It is critical to choose the most appropriate activation function in neural network computations.
This paper evaluates commonly used activation functions, such as Swish, ReLU, Sigmoid, and so forth.
arXiv Detail & Related papers (2020-10-15T11:09:34Z)
- Rational neural networks [3.4376560669160394]
We consider neural networks with rational activation functions.
We prove that rational neural networks approximate smooth functions more efficiently than ReLU networks, requiring exponentially smaller depth.
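A rational activation is a learnable ratio of polynomials; the sketch below uses type (3, 2). The degrees, the identity-map initialization, and the absolute values keeping the denominator away from zero are illustrative choices, not the paper's exact parameterization.

```python
# Minimal sketch of a rational activation: a learnable ratio of polynomials.
import torch
import torch.nn as nn


class RationalActivation(nn.Module):
    def __init__(self):
        super().__init__()
        self.p = nn.Parameter(torch.tensor([0.0, 1.0, 0.0, 0.0]))  # numerator coeffs (degree 3)
        self.q = nn.Parameter(torch.tensor([0.0, 0.0]))            # denominator coeffs (degree 2)

    def forward(self, x):
        num = self.p[0] + self.p[1] * x + self.p[2] * x**2 + self.p[3] * x**3
        den = 1.0 + torch.abs(self.q[0] * x) + torch.abs(self.q[1] * x**2)
        return num / den


act = RationalActivation()           # initialized here to the identity map
print(act(torch.linspace(-2, 2, 5)))
```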
arXiv Detail & Related papers (2020-04-04T10:36:11Z)
- Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
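The claim that a single unit with a non-monotonic activation can separate XOR is easy to verify numerically; the Gaussian bump below is a stand-in for illustration, not the ADA function defined in the paper.

```python
# A single linear unit followed by a bump-shaped (non-monotonic) activation
# separates XOR, which one neuron with a monotone activation cannot do.
# The Gaussian bump is a hypothetical stand-in, not the paper's ADA.
import numpy as np

def bump(s):
    return np.exp(-((s - 1.0) ** 2) / 0.1)   # peaks at s = 1

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

s = X @ np.array([1.0, 1.0])                 # pre-activation x1 + x2 -> 0, 1, 1, 2
pred = (bump(s) > 0.5).astype(int)
print(pred, (pred == y).all())               # [0 1 1 0] True
```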
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.