ENN: A Neural Network with DCT Adaptive Activation Functions
- URL: http://arxiv.org/abs/2307.00673v3
- Date: Tue, 30 Jan 2024 13:21:24 GMT
- Title: ENN: A Neural Network with DCT Adaptive Activation Functions
- Authors: Marc Martinez-Gost, Ana P\'erez-Neira, Miguel \'Angel Lagunas
- Abstract summary: We present Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT)
This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks.
The performance of ENN outperforms state of the art benchmarks, providing above a 40% gap in accuracy in some scenarios.
- Score: 2.2713084727838115
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The expressiveness of neural networks highly depends on the nature of the
activation function, although these are usually assumed predefined and fixed
during the training stage. Under a signal processing perspective, in this paper
we present Expressive Neural Network (ENN), a novel model in which the
non-linear activation functions are modeled using the Discrete Cosine Transform
(DCT) and adapted using backpropagation during training. This parametrization
keeps the number of trainable parameters low, is appropriate for gradient-based
schemes, and adapts to different learning tasks. This is the first non-linear
model for activation functions that relies on a signal processing perspective,
providing high flexibility and expressiveness to the network. We contribute
with insights in the explainability of the network at convergence by recovering
the concept of bump, this is, the response of each activation function in the
output space. Finally, through exhaustive experiments we show that the model
can adapt to classification and regression tasks. The performance of ENN
outperforms state of the art benchmarks, providing above a 40% gap in accuracy
in some scenarios.
Related papers
- Fractional Concepts in Neural Networks: Enhancing Activation and Loss
Functions [0.7614628596146602]
The paper presents a method for using fractional concepts in a neural network to modify the activation and loss functions.
This will enable neurons in the network to adjust their activation functions to match input data better and reduce output errors.
arXiv Detail & Related papers (2023-10-18T10:49:29Z) - ASU-CNN: An Efficient Deep Architecture for Image Classification and
Feature Visualizations [0.0]
Activation functions play a decisive role in determining the capacity of Deep Neural Networks.
In this paper, a Convolutional Neural Network model named as ASU-CNN is proposed.
The network achieved promising results on both training and testing data for the classification of CIFAR-10.
arXiv Detail & Related papers (2023-05-28T16:52:25Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Simple initialization and parametrization of sinusoidal networks via
their kernel bandwidth [92.25666446274188]
sinusoidal neural networks with activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
arXiv Detail & Related papers (2022-11-26T07:41:48Z) - Consensus Function from an $L_p^q-$norm Regularization Term for its Use
as Adaptive Activation Functions in Neural Networks [0.0]
We propose the definition and utilization of an implicit, parametric, non-linear activation function that adapts its shape during the training process.
This fact increases the space of parameters to optimize within the network, but it allows a greater flexibility and generalizes the concept of neural networks.
Preliminary results show that the use of these neural networks with this type of adaptive activation functions reduces the error in regression and classification examples.
arXiv Detail & Related papers (2022-06-30T04:48:14Z) - Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic [137.04558017227583]
Actor-critic (AC) algorithms, empowered by neural networks, have had significant empirical success in recent years.
We take a mean-field perspective on the evolution and convergence of feature-based neural AC.
We prove that neural AC finds the globally optimal policy at a sublinear rate.
arXiv Detail & Related papers (2021-12-27T06:09:50Z) - Otimizacao de pesos e funcoes de ativacao de redes neurais aplicadas na
previsao de series temporais [0.0]
We propose the use of a family of free parameter asymmetric activation functions for neural networks.
We show that this family of defined activation functions satisfies the requirements of the universal approximation theorem.
A methodology for the global optimization of this family of activation functions with free parameter and the weights of the connections between the processing units of the neural network is used.
arXiv Detail & Related papers (2021-07-29T23:32:15Z) - Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z) - Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance.
distinct models are required to be trained to reach different points in the rate-distortion (R-D) space.
We make efforts to formulate the essential mathematical functions to describe the R-D behavior of NIC using deep network and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z) - Activation function design for deep networks: linearity and effective
initialisation [10.108857371774977]
We study how to avoid two problems at initialisation identified in prior works.
We prove that both these problems can be avoided by choosing an activation function possessing a sufficiently large linear region around the origin.
arXiv Detail & Related papers (2021-05-17T11:30:46Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over- parameterized deep neural networks (DNNs)
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.