Nonlinearity Enhanced Adaptive Activation Functions
- URL: http://arxiv.org/abs/2403.19896v2
- Date: Mon, 12 May 2025 19:30:37 GMT
- Title: Nonlinearity Enhanced Adaptive Activation Functions
- Authors: David Yevick
- Abstract summary: Examples are given based on the standard rectified linear unit (ReLU). The associated accuracy improvement is quantified both in the context of the MNIST digit data set and a convolutional neural network (CNN) benchmark example.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A general procedure for introducing parametric, learned, nonlinearity into activation functions is found to enhance the accuracy of representative neural networks without requiring significant additional computational resources. Examples are given based on the standard rectified linear unit (ReLU) as well as several other frequently employed activation functions. The associated accuracy improvement is quantified both in the context of the MNIST digit data set and a convolutional neural network (CNN) benchmark example.
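To make the abstract's idea concrete, the sketch below shows one way a parametric, learned nonlinearity can be grafted onto ReLU in PyTorch. It is a minimal illustration only: the abstract does not specify the paper's parametrization, so the correction term and the parameters `alpha` and `beta` are hypothetical choices, not the construction used in the paper.

```python
import torch
import torch.nn as nn

class AdaptiveReLU(nn.Module):
    """ReLU augmented with a trainable nonlinear correction.

    Illustrative sketch only: the correction term below is a hypothetical
    choice, not the parametrization used in the paper.
    """

    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.zeros(1))  # strength of the learned nonlinearity
        self.beta = nn.Parameter(torch.ones(1))    # scale of the base ReLU

    def forward(self, x):
        base = torch.relu(x)
        # Smooth, trainable departure from plain ReLU; alpha = 0 recovers ReLU.
        return self.beta * base + self.alpha * torch.tanh(base) * base
```

Initializing `alpha` at zero means training starts from the standard ReLU and introduces extra nonlinearity only if it reduces the loss, so the added cost is just a couple of trainable scalars per activation.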
Related papers
- Recurrent Stochastic Configuration Networks with Hybrid Regularization for Nonlinear Dynamics Modelling [3.8719670789415925]
Recurrent stochastic configuration networks (RSCNs) have shown great potential in modelling nonlinear dynamic systems with uncertainties. This paper presents an RSCN with hybrid regularization to enhance both the learning capacity and generalization performance of the network.
arXiv Detail & Related papers (2024-11-26T03:06:39Z) - Neural Network Verification with Branch-and-Bound for General Nonlinearities [63.39918329535165]
Branch-and-bound (BaB) is among the most effective techniques for neural network (NN) verification. We develop a general framework, named GenBaB, to conduct BaB on general nonlinearities to verify NNs with general architectures. Our framework also allows the verification of general nonlinear graphs and enables verification applications beyond simple NNs.
arXiv Detail & Related papers (2024-05-31T17:51:07Z) - Nonlinear functional regression by functional deep neural network with kernel embedding [18.927592350748682]
We introduce a functional deep neural network with an adaptive and discretization-invariant dimension reduction method. Explicit rates of approximating nonlinear smooth functionals across various input function spaces are derived. We conduct numerical experiments on both simulated and real datasets to demonstrate the effectiveness and benefits of our functional net.
arXiv Detail & Related papers (2024-01-05T16:43:39Z) - Function-Space Regularization in Neural Networks: A Probabilistic
Perspective [51.133793272222874]
We show that we can derive a well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training.
We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection and highly-calibrated predictive uncertainty estimates.
arXiv Detail & Related papers (2023-12-28T17:50:56Z) - Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning [53.97335841137496]
We propose an oracle-efficient algorithm, dubbed Pessimistic Nonlinear Least-Squares Value Iteration (PNLSVI), for offline RL with non-linear function approximation.
Our algorithm enjoys a regret bound that has a tight dependency on the function class complexity and achieves minimax optimal instance-dependent regret when specialized to linear function approximation.
arXiv Detail & Related papers (2023-10-02T17:42:01Z) - Optimal Nonlinearities Improve Generalization Performance of Random Features [0.9790236766474201]
The random feature model with a nonlinear activation function has been shown to be equivalent to a Gaussian model in terms of training and generalization errors.
We show that the parameters acquired from the Gaussian model enable us to define a set of optimal nonlinearities.
Our numerical results validate that the optimized nonlinearities achieve better generalization performance than widely-used nonlinear functions such as ReLU.
arXiv Detail & Related papers (2023-09-28T20:55:21Z) - ENN: A Neural Network with DCT Adaptive Activation Functions [2.2713084727838115]
We present the Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT).
This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks.
ENN outperforms state-of-the-art benchmarks, with an accuracy gap of more than 40% in some scenarios.
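As a rough illustration of the DCT-parametrized activation described in this entry, the sketch below expands the activation in a truncated cosine basis with trainable coefficients. The number of terms, the input clamping range, and the initialization are assumptions made for the sketch, not the ENN authors' choices.

```python
import math
import torch
import torch.nn as nn

class DCTActivation(nn.Module):
    """Activation expressed as a truncated DCT series with trainable coefficients.

    Sketch under assumed settings (number of terms, input range, init);
    not the exact ENN construction.
    """

    def __init__(self, num_terms: int = 8, x_max: float = 3.0):
        super().__init__()
        self.num_terms = num_terms
        self.x_max = x_max
        # Trainable DCT coefficients; small random init is an assumption.
        self.coeffs = nn.Parameter(0.1 * torch.randn(num_terms))

    def forward(self, x):
        # Map inputs into [0, 1] so the cosine basis is well defined.
        t = (x.clamp(-self.x_max, self.x_max) + self.x_max) / (2 * self.x_max)
        k = torch.arange(self.num_terms, device=x.device, dtype=x.dtype)
        # Cosine basis evaluated at t, combined with the learned coefficients.
        basis = torch.cos(math.pi * k * t.unsqueeze(-1))
        return (basis * self.coeffs).sum(dim=-1)
```

Because only `num_terms` scalars are trainable per activation, the parameter overhead stays small while the shape of the nonlinearity can adapt to the task, consistent with the low-parameter-count claim in the abstract.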
arXiv Detail & Related papers (2023-07-02T21:46:30Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Approximation of Nonlinear Functionals Using Deep ReLU Networks [7.876115370275732]
We investigate the approximation power of functional deep neural networks associated with the rectified linear unit (ReLU) activation function.
In addition, we establish rates of approximation of the proposed functional deep ReLU networks under mild regularity conditions.
arXiv Detail & Related papers (2023-04-10T08:10:11Z) - MP-GELU Bayesian Neural Networks: Moment Propagation by GELU Nonlinearity [0.0]
We propose a novel nonlinear function named the moment-propagating Gaussian error linear unit (MP-GELU) that enables the fast derivation of first and second moments in BNNs.
MP-GELU BNNs provide higher prediction accuracy, better-quality uncertainty estimates, and faster execution than ReLU-based BNNs.
arXiv Detail & Related papers (2022-11-24T03:37:29Z) - Consensus Function from an $L_p^q-$norm Regularization Term for its Use as Adaptive Activation Functions in Neural Networks [0.0]
We propose the definition and utilization of an implicit, parametric, non-linear activation function that adapts its shape during the training process.
This increases the number of parameters to optimize within the network, but it allows greater flexibility and generalizes the concept of neural networks.
Preliminary results show that the use of these neural networks with this type of adaptive activation functions reduces the error in regression and classification examples.
arXiv Detail & Related papers (2022-06-30T04:48:14Z) - Exploring Linear Feature Disentanglement For Neural Networks [63.20827189693117]
Non-linear activation functions, e.g., Sigmoid, ReLU, and Tanh, have achieved great success in neural networks (NNs).
Due to the complex non-linear characteristic of samples, the objective of those activation functions is to project samples from their original feature space to a linearly separable feature space.
This motivates us to explore whether all features need to be transformed by all non-linear functions in current typical NNs.
arXiv Detail & Related papers (2022-03-22T13:09:17Z) - Nonlinear Level Set Learning for Function Approximation on Sparse Data with Applications to Parametric Differential Equations [6.184270985214254]
"Nonlinear Level set Learning" (NLL) approach is presented for the pointwise prediction of functions which have been sparsely sampled.
The proposed algorithm effectively reduces the input dimension to the theoretical lower bound with minor accuracy loss.
Experiments and applications are presented which compare this modified NLL with the original NLL and the Active Subspaces (AS) method.
arXiv Detail & Related papers (2021-04-29T01:54:05Z) - LQF: Linear Quadratic Fine-Tuning [114.3840147070712]
We present the first method for linearizing a pre-trained model that achieves comparable performance to non-linear fine-tuning.
LQF consists of simple modifications to the architecture, loss function and optimization typically used for classification.
arXiv Detail & Related papers (2020-12-21T06:40:20Z) - Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Soft-Root-Sign Activation Function [21.716884634290516]
"Soft-Root-Sign" (SRS) is smooth, non-monotonic, and bounded.
In contrast to ReLU, SRS can adaptively adjust the output by a pair of independent trainable parameters.
Our SRS matches or exceeds models with ReLU and other state-of-the-art nonlinearities.
arXiv Detail & Related papers (2020-03-01T18:38:11Z)
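For the Soft-Root-Sign entry above, a minimal sketch of the activation is shown below. It uses the commonly cited form SRS(x) = x / (x/α + e^{-x/β}) with a pair of trainable parameters; the initial values chosen here are assumptions, not necessarily the authors' recommended settings.

```python
import torch
import torch.nn as nn

class SoftRootSign(nn.Module):
    """Soft-Root-Sign (SRS) activation with two trainable parameters.

    Sketch based on the commonly cited form
    SRS(x) = x / (x/alpha + exp(-x/beta));
    the initial alpha and beta values are assumptions.
    """

    def __init__(self, alpha: float = 2.0, beta: float = 3.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha))
        self.beta = nn.Parameter(torch.tensor(beta))

    def forward(self, x):
        return x / (x / self.alpha + torch.exp(-x / self.beta))
```

With this form the output is bounded and non-monotonic, and the two trainable parameters let each layer adjust its output range during training, which is the adaptivity the abstract contrasts with ReLU.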