Parametric Rectified Power Sigmoid Units: Learning Nonlinear Neural
Transfer Analytical Forms
- URL: http://arxiv.org/abs/2101.09948v2
- Date: Fri, 5 Feb 2021 11:42:08 GMT
- Title: Parametric Rectified Power Sigmoid Units: Learning Nonlinear Neural
Transfer Analytical Forms
- Authors: Abdourrahmane Mahamane Atto (LISTIC), Sylvie Galichet (LISTIC),
Dominique Pastor, Nicolas Méger (LISTIC)
- Abstract summary: The paper proposes representation functionals in a dual paradigm where learning jointly concerns linear convolutional weights and parametric forms of nonlinear activation functions.
The nonlinear forms proposed for performing the functional representation are associated with a new class of parametric neural transfer functions called rectified power sigmoid units.
Performance achieved by the joint learning of convolutional and rectified power sigmoid learnable parameters is shown to be outstanding in both shallow and deep learning frameworks.
- Score: 1.6975704972827304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The paper proposes representation functionals in a dual paradigm where
learning jointly concerns both linear convolutional weights and parametric
forms of nonlinear activation functions. The nonlinear forms proposed for
performing the functional representation are associated with a new class of
parametric neural transfer functions called rectified power sigmoid units. This
class is constructed to combine the advantages of the sigmoid and rectified
linear unit functions while rejecting their respective drawbacks. Moreover, the
analytic form of this new neural class involves scale,
shift and shape parameters so as to obtain a wide range of activation shapes,
including the standard rectified linear unit as a limit case. Parameters of
this neural transfer class are considered learnable for the sake of
discovering the complex shapes that can contribute to solving machine learning
issues. The performance achieved by the joint learning of convolutional and
rectified power sigmoid learnable parameters is shown to be outstanding in both
shallow and deep learning frameworks. This class opens new prospects with
respect to machine learning in the sense that learnable parameters are not only
attached to linear transformations, but also to suitable nonlinear operators.
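The abstract does not spell out the closed-form expression of the rectified power sigmoid unit, so the sketch below is only an illustration of the recipe it describes: an activation with learnable per-channel scale, shift and shape parameters, trained jointly with the convolutional weights, and recovering a (scaled, shifted) ReLU as a limit case of the shape parameter. The sigmoid-gated form and the module name ParametricSigmoidGatedUnit are assumptions made for illustration, not the paper's exact analytic form.

```python
import torch
import torch.nn as nn

class ParametricSigmoidGatedUnit(nn.Module):
    """Illustrative learnable activation with scale, shift and shape parameters.

    Assumed form (not the paper's exact one):
        f(x) = scale * (x - shift) * sigmoid(shape * (x - shift)),
    which tends to scale * relu(x - shift) as the shape parameter grows large.
    """

    def __init__(self, num_channels: int):
        super().__init__()
        # One learnable (scale, shift, shape) triple per channel.
        self.scale = nn.Parameter(torch.ones(num_channels))
        self.shift = nn.Parameter(torch.zeros(num_channels))
        self.shape = nn.Parameter(torch.ones(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Broadcast the per-channel parameters over (N, C, H, W) inputs.
        scale = self.scale.view(1, -1, 1, 1)
        shift = self.shift.view(1, -1, 1, 1)
        shape = self.shape.view(1, -1, 1, 1)
        z = x - shift
        return scale * z * torch.sigmoid(shape * z)

# Joint learning: the activation parameters live in the same optimizer as the
# convolutional weights, so both are updated together by backpropagation.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    ParametricSigmoidGatedUnit(16),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```

Because scale, shift and shape are registered as nn.Parameter, model.parameters() exposes them to the optimizer alongside the convolution kernels, which is the joint linear/nonlinear learning paradigm the abstract refers to.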
Related papers
- Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context [44.949726166566236]
We show that (non-linear) Transformers naturally learn to implement gradient descent in function space.
We also show that the optimal choice of non-linear activation depends in a natural way on the class of functions that need to be learned.
arXiv Detail & Related papers (2023-12-11T17:05:25Z) - Orders-of-coupling representation with a single neural network with
optimal neuron activation functions and without nonlinear parameter
optimization [0.0]
We show that neural network models of orders-of-coupling representations can be easily built by using a recently proposed neural network with optimal neuron activation functions.
Examples are given of representations of molecular potential energy surfaces.
arXiv Detail & Related papers (2023-02-11T06:27:26Z) - NOMAD: Nonlinear Manifold Decoders for Operator Learning [17.812064311297117]
Supervised learning in function spaces is an emerging area of machine learning research.
We present NOMAD, a novel operator learning framework with a nonlinear decoder map capable of learning finite-dimensional representations of nonlinear submanifolds in function spaces.
arXiv Detail & Related papers (2022-06-07T19:52:44Z) - Decoupling multivariate functions using a nonparametric filtered tensor
decomposition [0.29360071145551075]
Decoupling techniques aim at providing an alternative representation of the nonlinearity.
The so-called decoupled form is often a more efficient parameterisation of the relationship while being highly structured, favouring interpretability.
In this work, two new algorithms based on filtered tensor decompositions of first-order derivative information are introduced.
arXiv Detail & Related papers (2022-05-23T09:34:17Z) - Exploring Linear Feature Disentanglement For Neural Networks [63.20827189693117]
Non-linear activation functions, e.g., Sigmoid, ReLU, and Tanh, have achieved great success in neural networks (NNs).
Due to the complex non-linear characteristics of samples, the objective of those activation functions is to project samples from their original feature space to a linearly separable feature space.
This phenomenon ignites our interest in exploring whether all features need to be transformed by all non-linear functions in current typical NNs.
arXiv Detail & Related papers (2022-03-22T13:09:17Z) - Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z) - A Functional Perspective on Learning Symmetric Functions with Neural
Networks [48.80300074254758]
We study the learning and representation of neural networks defined on measures.
We establish approximation and generalization bounds under different choices of regularization.
The resulting models can be learned efficiently and enjoy generalization guarantees that extend across input sizes.
arXiv Detail & Related papers (2020-08-16T16:34:33Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent; a generic sketch of such a min-max setup is given after this list.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Semiparametric Nonlinear Bipartite Graph Representation Learning with
Provable Guarantees [106.91654068632882]
We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
arXiv Detail & Related papers (2020-03-02T16:40:36Z) - Invariant Feature Coding using Tensor Product Representation [75.62232699377877]
We prove that the group-invariant feature vector contains sufficient discriminative information when learning a linear classifier.
A novel feature model that explicitly considers group actions is proposed for principal component analysis and k-means clustering.
arXiv Detail & Related papers (2019-06-05T07:15:17Z)
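As a side illustration of the min-max formulation mentioned in the structural equation model entry above, here is a generic, hedged sketch of adversarial estimation with two neural networks; the objective, toy data and network sizes are assumptions for illustration, not the cited paper's actual procedure.

```python
import torch
import torch.nn as nn

# Generic adversarial (min-max) estimator for a structural equation of the
# form E[Y - f(X) | Z] = 0: the structural function f and the critic u are
# both small MLPs trained with alternating gradient steps.

def mlp(in_dim: int) -> nn.Module:
    return nn.Sequential(nn.Linear(in_dim, 64), nn.Tanh(), nn.Linear(64, 1))

f = mlp(1)   # structural function of X (the "min" player)
u = mlp(1)   # adversarial test function of the instrument Z (the "max" player)
opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
opt_u = torch.optim.Adam(u.parameters(), lr=1e-3)

def game_value(x, y, z):
    # E[(Y - f(X)) u(Z)] - 0.5 E[u(Z)^2]; the quadratic term keeps the critic bounded.
    residual = y - f(x)
    test = u(z)
    return (residual * test).mean() - 0.5 * (test ** 2).mean()

for step in range(1000):
    # Toy data: Z instruments X, and Y depends nonlinearly on X.
    z = torch.randn(256, 1)
    x = z + 0.1 * torch.randn(256, 1)
    y = torch.sin(x) + 0.1 * torch.randn(256, 1)

    # Ascent step for the critic u.
    opt_u.zero_grad()
    (-game_value(x, y, z)).backward()
    opt_u.step()

    # Descent step for the structural function f.
    opt_f.zero_grad()
    game_value(x, y, z).backward()
    opt_f.step()
```

Alternating a critic ascent step with a generator descent step is a standard way to optimize such min-max objectives; the quadratic penalty on u keeps the inner maximization well posed.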
This list is automatically generated from the titles and abstracts of the papers on this site.