Consensus Function from an $L_p^q-$norm Regularization Term for its Use as Adaptive Activation Functions in Neural Networks
- URL: http://arxiv.org/abs/2206.15017v1
- Date: Thu, 30 Jun 2022 04:48:14 GMT
- Title: Consensus Function from an $L_p^q-$norm Regularization Term for its Use as Adaptive Activation Functions in Neural Networks
- Authors: Juan Heredia-Juesas and José Á. Martínez-Lorenzo
- Abstract summary: We propose the definition and utilization of an implicit, parametric, non-linear activation function that adapts its shape during the training process.
This increases the space of parameters to optimize within the network, but it allows greater flexibility and generalizes the concept of neural networks.
Preliminary results show that the use of these neural networks with this type of adaptive activation function reduces the error in regression and classification examples.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The design of a neural network is usually carried out by defining the number
of layers, the number of neurons per layer, their connections or synapses, and
the activation function that they will execute. The training process tries to
optimize the weights assigned to those connections, together with the biases of
the neurons, to better fit the training data. However, the definition of the
activation functions is, in general, determined in the design process and not
modified during the training, meaning that their behavior is unrelated to the
training data set. In this paper we propose the definition and utilization of
an implicit, parametric, non-linear activation function that adapts its shape
during the training process. This increases the space of parameters to
optimize within the network, but it allows greater flexibility and
generalizes the concept of neural networks. Furthermore, it simplifies the
architectural design since the same activation function definition can be
employed in each neuron, letting the training process optimize their
parameters and, thus, their behavior. Our proposed activation function comes
from the definition of the consensus variable from the optimization of a linear
underdetermined problem with an $L_p^q$ regularization term, via the
Alternating Direction Method of Multipliers (ADMM). We define the neural
networks using this type of activation function as $pq-$networks. Preliminary
results show that the use of these neural networks with adaptive activation
functions reduces the error in regression and classification examples,
compared to equivalent regular feedforward neural networks with fixed
activation functions.
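As a concrete but hedged illustration of the idea: for p = q = 1, the ADMM consensus update of an $L_1$ regularizer has the closed form of soft-thresholding, so a minimal sketch of an adaptive activation in this spirit makes the threshold a trainable per-neuron parameter. The sketch below assumes PyTorch; the paper's consensus function for general $(p, q)$ is implicit and is not reproduced here.

```python
# A minimal sketch, assuming PyTorch, of an activation whose shape is
# trained with the network. For p = q = 1, the ADMM consensus update of
# an L1 regularizer is soft-thresholding; here the threshold lambda is a
# learnable per-neuron parameter. Illustration only: the paper's
# consensus function for general (p, q) is implicit.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftThresholdActivation(nn.Module):
    """Soft-thresholding sign(v) * max(|v| - lambda, 0) with trainable lambda."""

    def __init__(self, num_features: int):
        super().__init__()
        # raw parameter; softplus keeps the effective threshold positive
        self.raw_lam = nn.Parameter(torch.zeros(num_features))

    def forward(self, v: torch.Tensor) -> torch.Tensor:
        lam = F.softplus(self.raw_lam)
        return torch.sign(v) * torch.clamp(v.abs() - lam, min=0.0)

# Usage: a drop-in replacement for a fixed nonlinearity.
layer = nn.Sequential(nn.Linear(16, 32), SoftThresholdActivation(32))
y = layer(torch.randn(4, 16))  # shape (4, 32)
```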
Related papers
- Continual Learning via Sequential Function-Space Variational Inference [65.96686740015902]
We propose an objective derived by formulating continual learning as sequential function-space variational inference.
Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions.
We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods.
arXiv Detail & Related papers (2023-12-28T18:44:32Z) - Fractional Concepts in Neural Networks: Enhancing Activation and Loss Functions [0.7614628596146602]
The paper presents a method for using fractional concepts in a neural network to modify the activation and loss functions.
This enables neurons in the network to adjust their activation functions to better match the input data and reduce output errors.
arXiv Detail & Related papers (2023-10-18T10:49:29Z) - ENN: A Neural Network with DCT Adaptive Activation Functions [2.2713084727838115]
We present Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT).
This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks.
ENN outperforms state-of-the-art benchmarks, with an accuracy gap of over 40% in some scenarios.
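A hedged sketch of this DCT-style idea, assuming PyTorch: the exact ENN parametrization is in the paper, and here the activation is simply a trainable K-term cosine series evaluated on inputs squashed into [-1, 1].

```python
# A hedged sketch, assuming PyTorch, of a DCT-style trainable activation:
# a K-term cosine series with learnable coefficients, evaluated on inputs
# squashed into [-1, 1]. The exact ENN parametrization is in the paper.
import math
import torch
import torch.nn as nn

class DCTActivation(nn.Module):
    def __init__(self, num_terms: int = 8):
        super().__init__()
        self.coeffs = nn.Parameter(0.1 * torch.randn(num_terms))
        self.register_buffer("k", torch.arange(num_terms).float())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u = torch.tanh(x)  # map inputs into [-1, 1]
        # DCT-like basis cos(pi * k * (u + 1) / 2) on that interval
        basis = torch.cos(math.pi * self.k * (u.unsqueeze(-1) + 1) / 2)
        return basis @ self.coeffs  # trainable cosine expansion
```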
arXiv Detail & Related papers (2023-07-02T21:46:30Z) - Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z) - Data-aware customization of activation functions reduces neural network error [0.35172332086962865]
We show that data-aware customization of activation functions can result in striking reductions in neural network error.
A simple substitution with the "seagull" activation function in an already-refined neural network can lead to an order-of-magnitude reduction in error.
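For reference, a one-line sketch assuming PyTorch; to our understanding the "seagull" activation is log(1 + x^2), but treat the exact form as an assumption and check the cited paper.

```python
# A one-line sketch, assuming PyTorch. To our understanding the "seagull"
# activation is log(1 + x^2); treat the exact form as an assumption and
# check the cited paper.
import torch

def seagull(x: torch.Tensor) -> torch.Tensor:
    # smooth, even, unbounded; grows only logarithmically for large |x|
    return torch.log1p(x * x)
```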
arXiv Detail & Related papers (2023-01-16T23:38:37Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - Graph-adaptive Rectified Linear Unit for Graph Neural Networks [64.92221119723048]
Graph Neural Networks (GNNs) have achieved remarkable success by extending traditional convolution to learning on non-Euclidean data.
We propose Graph-adaptive Rectified Linear Unit (GReLU), a new parametric activation function that incorporates neighborhood information in a novel and efficient way.
We conduct comprehensive experiments to show that our plug-and-play GReLU method is efficient and effective given different GNN backbones and various downstream tasks.
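A hedged sketch of a graph-adaptive activation in this spirit, assuming PyTorch and a dense row-normalized adjacency matrix; GReLU's actual parametrization is in the paper, and here the negative slope of a leaky ReLU is simply predicted per node from aggregated neighbor features.

```python
# A hedged sketch, assuming PyTorch and a dense row-normalized adjacency
# matrix adj (N x N). GReLU's actual parametrization is in the paper;
# here the negative slope of a leaky ReLU is predicted per node from its
# aggregated neighbor features.
import torch
import torch.nn as nn

class GraphAdaptiveReLU(nn.Module):
    def __init__(self, num_features: int):
        super().__init__()
        self.slope_net = nn.Linear(num_features, 1)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        neigh = adj @ x                                # neighborhood summary, (N, F)
        slope = torch.sigmoid(self.slope_net(neigh))   # per-node slope in (0, 1)
        return torch.where(x > 0, x, slope * x)        # node-adaptive leaky ReLU
```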
arXiv Detail & Related papers (2022-02-13T10:54:59Z) - Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics [85.31710759801705]
Current practice incurs expensive computational costs, since models must be trained in order to predict their performance.
We propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training.
Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections.
arXiv Detail & Related papers (2022-01-11T20:53:15Z) - Neural networks with trainable matrix activation functions [7.999703756441757]
This work develops a systematic approach to constructing matrix-valued activation functions.
The proposed activation functions depend on parameters that are trained along with the weights and bias vectors.
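A hedged sketch of a matrix-valued activation, assuming PyTorch: the nonlinearity is x -> D(x)x with D(x) diagonal, each diagonal entry taking one of two trainable values depending on the sign of x_i, which generalizes ReLU and leaky ReLU. The paper's exact construction may differ.

```python
# A hedged sketch, assuming PyTorch, of a matrix-valued activation
# x -> D(x) x with D(x) diagonal: each diagonal entry takes one of two
# trainable values depending on sign(x_i), generalizing ReLU (1, 0) and
# leaky ReLU (1, eps). The paper's exact construction may differ.
import torch
import torch.nn as nn

class MatrixActivation(nn.Module):
    def __init__(self, num_features: int):
        super().__init__()
        self.pos_slope = nn.Parameter(torch.ones(num_features))
        self.neg_slope = nn.Parameter(torch.zeros(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        d = torch.where(x > 0, self.pos_slope, self.neg_slope)
        return d * x  # equivalent to diag(d) @ x applied per sample
```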
arXiv Detail & Related papers (2021-09-21T04:11:26Z) - Optimization of Weights and Activation Functions of Neural Networks Applied to Time Series Forecasting [0.0]
We propose the use of a family of asymmetric activation functions with free parameters for neural networks.
We show that this family of activation functions satisfies the requirements of the universal approximation theorem.
A methodology is used for the global optimization of this family of free-parameter activation functions together with the weights of the connections between the processing units of the neural network.
arXiv Detail & Related papers (2021-07-29T23:32:15Z)