Related papers: ENN: A Neural Network with DCT Adaptive Activation Functions

ENN: A Neural Network with DCT Adaptive Activation Functions

URL: http://arxiv.org/abs/2307.00673v3
Date: Tue, 30 Jan 2024 13:21:24 GMT
Title: ENN: A Neural Network with DCT Adaptive Activation Functions
Authors: Marc Martinez-Gost, Ana P\'erez-Neira, Miguel \'Angel Lagunas
Abstract summary: We present Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT) This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks. The performance of ENN outperforms state of the art benchmarks, providing above a 40% gap in accuracy in some scenarios.
Score: 2.2713084727838115
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The expressiveness of neural networks highly depends on the nature of the activation function, although these are usually assumed predefined and fixed during the training stage. Under a signal processing perspective, in this paper we present Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT) and adapted using backpropagation during training. This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks. This is the first non-linear model for activation functions that relies on a signal processing perspective, providing high flexibility and expressiveness to the network. We contribute with insights in the explainability of the network at convergence by recovering the concept of bump, this is, the response of each activation function in the output space. Finally, through exhaustive experiments we show that the model can adapt to classification and regression tasks. The performance of ENN outperforms state of the art benchmarks, providing above a 40% gap in accuracy in some scenarios.

Related papers

Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $μ$P Parametrization [66.03821840425539]
In this paper, we investigate the training dynamics of $L$-layer neural networks using the tensor gradient program (SGD) framework. We show that SGD enables these networks to learn linearly independent features that substantially deviate from their initial values. This rich feature space captures relevant data information and ensures that any convergent point of the training process is a global minimum.
arXiv Detail & Related papers (2025-03-12T17:33:13Z)
Fractional Concepts in Neural Networks: Enhancing Activation and Loss Functions [0.7614628596146602]
The paper presents a method for using fractional concepts in a neural network to modify the activation and loss functions. This will enable neurons in the network to adjust their activation functions to match input data better and reduce output errors.
arXiv Detail & Related papers (2023-10-18T10:49:29Z)
ASU-CNN: An Efficient Deep Architecture for Image Classification and Feature Visualizations [0.0]
Activation functions play a decisive role in determining the capacity of Deep Neural Networks. In this paper, a Convolutional Neural Network model named as ASU-CNN is proposed. The network achieved promising results on both training and testing data for the classification of CIFAR-10.
arXiv Detail & Related papers (2023-05-28T16:52:25Z)
Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations. We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth [92.25666446274188]
sinusoidal neural networks with activations have been proposed as an alternative to networks with traditional activation functions. We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis. We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
arXiv Detail & Related papers (2022-11-26T07:41:48Z)
Consensus Function from an $L_p^q-$norm Regularization Term for its Use as Adaptive Activation Functions in Neural Networks [0.0]
We propose the definition and utilization of an implicit, parametric, non-linear activation function that adapts its shape during the training process. This fact increases the space of parameters to optimize within the network, but it allows a greater flexibility and generalizes the concept of neural networks. Preliminary results show that the use of these neural networks with this type of adaptive activation functions reduces the error in regression and classification examples.
arXiv Detail & Related papers (2022-06-30T04:48:14Z)
Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic [137.04558017227583]
Actor-critic (AC) algorithms, empowered by neural networks, have had significant empirical success in recent years. We take a mean-field perspective on the evolution and convergence of feature-based neural AC. We prove that neural AC finds the globally optimal policy at a sublinear rate.
arXiv Detail & Related papers (2021-12-27T06:09:50Z)
Otimizacao de pesos e funcoes de ativacao de redes neurais aplicadas na previsao de series temporais [0.0]
We propose the use of a family of free parameter asymmetric activation functions for neural networks. We show that this family of defined activation functions satisfies the requirements of the universal approximation theorem. A methodology for the global optimization of this family of activation functions with free parameter and the weights of the connections between the processing units of the neural network is used.
arXiv Detail & Related papers (2021-07-29T23:32:15Z)
Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks. Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z)
Data-Driven Learning of Feedforward Neural Networks with Different Activation Functions [0.0]
This work contributes to the development of a new data-driven method (D-DM) of feedforward neural networks (FNNs) learning.
arXiv Detail & Related papers (2021-07-04T18:20:27Z)
Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance. distinct models are required to be trained to reach different points in the rate-distortion (R-D) space. We make efforts to formulate the essential mathematical functions to describe the R-D behavior of NIC using deep network and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z)
Activation function design for deep networks: linearity and effective initialisation [10.108857371774977]
We study how to avoid two problems at initialisation identified in prior works. We prove that both these problems can be avoided by choosing an activation function possessing a sufficiently large linear region around the origin.
arXiv Detail & Related papers (2021-05-17T11:30:46Z)
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over- parameterized deep neural networks (DNNs) In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit. We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.