Deep learning with transfer functions: new applications in system
identification
- URL: http://arxiv.org/abs/2104.09839v1
- Date: Tue, 20 Apr 2021 08:58:55 GMT
- Title: Deep learning with transfer functions: new applications in system
identification
- Authors: Dario Piga, Marco Forgione, Manas Mejari
- Abstract summary: This paper presents a linear dynamical operator endowed with a well-defined and efficient back-propagation behavior for automatic derivatives computation.
The operator enables end-to-end training of structured networks containing linear transfer functions and other differentiable units.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a linear dynamical operator described in terms of a
rational transfer function, endowed with a well-defined and efficient
back-propagation behavior for automatic derivatives computation. The operator
enables end-to-end training of structured networks containing linear transfer
functions and other differentiable units by exploiting standard deep learning
software.
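As a rough illustration of what such a transfer-function layer can look like in standard deep learning software, below is a minimal PyTorch sketch of a SISO operator G(q) = B(q)/A(q) with learnable coefficients. This is not the paper's implementation: the class name and the sample-by-sample recursion are illustrative, and gradients are obtained simply by letting autograd unroll the recursion rather than through the efficient back-propagation rules the paper derives.

```python
import torch


class LinearDynamicalOperator(torch.nn.Module):
    """Discrete-time SISO transfer function G(q) = B(q) / A(q), with
    A(q) = 1 + a_1 q^-1 + ... + a_na q^-na (monic) and learnable b, a.
    Naive recursive filter; autograd provides the gradients."""

    def __init__(self, n_b: int, n_a: int):
        super().__init__()
        self.b = torch.nn.Parameter(0.01 * torch.randn(n_b))  # numerator coefficients b_0 ... b_{n_b-1}
        self.a = torch.nn.Parameter(0.01 * torch.randn(n_a))  # denominator coefficients a_1 ... a_{n_a}

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        """Filter a 1-D input sequence u of length T and return y of length T."""
        n_b, n_a = self.b.shape[0], self.a.shape[0]
        y = []
        for t in range(u.shape[0]):
            # y[t] = sum_i b_i * u[t-i] - sum_j a_j * y[t-j]
            y_t = sum(self.b[i] * u[t - i] for i in range(n_b) if t - i >= 0)
            y_t = y_t - sum(self.a[j] * y[t - 1 - j] for j in range(n_a) if t - 1 - j >= 0)
            y.append(y_t)
        return torch.stack(y)
```

On long sequences, unrolling the recursion this way is slow in both the forward and backward pass; the well-defined and efficient back-propagation behavior claimed in the abstract is what makes such an operator practical as a network layer.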
Two relevant applications of the operator in system identification are
presented. The first one consists in the integration of prediction error
methods in deep learning. The dynamical operator is included as the last
layer of a neural network in order to obtain the optimal one-step-ahead
prediction error.
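One way to picture this first application, reusing the LinearDynamicalOperator sketched above, is the classical prediction-error setup: the residual between the measured output and the simulated model output is filtered by a learnable transfer function acting as an approximate inverse noise model, and the mean squared filtered residual is minimized. The sketch below is a hedged reading of the abstract, not the paper's code; the names G and H_inv and the synthetic data are placeholders.

```python
import torch

torch.manual_seed(0)
T = 200
u_train = torch.randn(T)  # input sequence (synthetic placeholder data)
y_train = torch.randn(T)  # measured output (synthetic placeholder data)

G = LinearDynamicalOperator(n_b=3, n_a=2)      # plant model (could be any differentiable network)
H_inv = LinearDynamicalOperator(n_b=2, n_a=2)  # approximate inverse noise model; in practice it would be
                                               # constrained (e.g. monic) to rule out the trivial e = 0 solution

optimizer = torch.optim.Adam(list(G.parameters()) + list(H_inv.parameters()), lr=1e-3)

for epoch in range(100):
    y_sim = G(u_train)          # simulated output
    e = H_inv(y_train - y_sim)  # filtered residual, playing the role of the one-step-ahead prediction error
    loss = torch.mean(e ** 2)   # prediction-error cost
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```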
The second one considers identification of general block-oriented models from
quantized data. These block-oriented models are constructed by combining linear
dynamical operators with static nonlinearities described as standard
feed-forward neural networks. A custom loss function corresponding to the
log-likelihood of quantized output observations is defined. For gradient-based
optimization, the derivatives of the log-likelihood are computed by applying
the back-propagation algorithm through the whole network. Two system
identification benchmarks are used to show the effectiveness of the proposed
methodologies.
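To make the second application concrete, the custom loss can be sketched as follows, under the assumption (mine, but consistent with the abstract) that the continuous model output is corrupted by additive Gaussian noise of standard deviation sigma before being quantized. The probability of observing a given quantization level is then the Gaussian mass inside that level's interval, and the loss is the negative log-likelihood summed over time. The function name and signature are illustrative, not the paper's code.

```python
import torch


def quantized_output_nll(y_hat: torch.Tensor,
                         lower: torch.Tensor,
                         upper: torch.Tensor,
                         sigma: float) -> torch.Tensor:
    """Negative log-likelihood of quantized output observations.

    y_hat        : continuous model output before the quantizer, shape (T,)
    lower, upper : bounds of the quantization interval each observation fell
                   into, shape (T,); use -inf / +inf for the extreme levels
    sigma        : standard deviation of the additive Gaussian noise

    Assumes y_t = Q(y_hat_t + w_t) with w_t ~ N(0, sigma^2), so that
    P(level_t) = Phi((upper_t - y_hat_t) / sigma) - Phi((lower_t - y_hat_t) / sigma).
    """
    std_normal = torch.distributions.Normal(0.0, 1.0)
    prob = std_normal.cdf((upper - y_hat) / sigma) - std_normal.cdf((lower - y_hat) / sigma)
    return -torch.sum(torch.log(prob + 1e-12))  # small epsilon guards against log(0)
```

Since y_hat is produced by the whole block-oriented network (linear dynamical operators followed by feed-forward nonlinearities), calling backward() on this loss propagates gradients through the entire model, which is the back-propagation-through-the-whole-network step described in the abstract.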
Related papers
- DeltaPhi: Learning Physical Trajectory Residual for PDE Solving [54.13671100638092]
We propose and formulate the Physical Trajectory Residual Learning (DeltaPhi) framework.
We learn the surrogate model for the residual operator mapping based on existing neural operator networks.
We conclude that, compared to direct learning, physical residual learning is preferred for PDE solving.
arXiv Detail & Related papers (2024-06-14T07:45:07Z)
- Linearization Turns Neural Operators into Function-Valued Gaussian Processes [23.85470417458593]
We introduce a new framework for approximate Bayesian uncertainty quantification in neural operators.
Our approach can be interpreted as a probabilistic analogue of the concept of currying from functional programming.
We showcase the efficacy of our approach through applications to different types of partial differential equations.
arXiv Detail & Related papers (2024-06-07T16:43:54Z)
- A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms [64.3064050603721]
We generalize the Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields similar iterations to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
arXiv Detail & Related papers (2022-11-22T16:30:33Z)
- Lifted Bregman Training of Neural Networks [28.03724379169264]
We introduce a novel mathematical formulation for the training of feed-forward neural networks with (potentially non-smooth) proximal maps as activation functions.
This formulation is based on Bregman distances, and a key advantage is that its partial derivatives with respect to the network's parameters do not require the computation of derivatives of the network's activation functions.
We present several numerical results that demonstrate that these training approaches can be equally well or even better suited for the training of neural network-based classifiers and (denoising) autoencoders with sparse coding.
arXiv Detail & Related papers (2022-08-18T11:12:52Z)
- Consensus Function from an $L_p^q$-norm Regularization Term for its Use as Adaptive Activation Functions in Neural Networks [0.0]
We propose the definition and utilization of an implicit, parametric, non-linear activation function that adapts its shape during the training process.
This increases the space of parameters to optimize within the network, but it allows greater flexibility and generalizes the concept of neural networks.
Preliminary results show that the use of these neural networks with this type of adaptive activation functions reduces the error in regression and classification examples.
arXiv Detail & Related papers (2022-06-30T04:48:14Z)
- Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks.
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
- Relaxing the Constraints on Predictive Coding Models [62.997667081978825]
Predictive coding is an influential theory of cortical function which posits that the principal computation the brain performs is the minimization of prediction errors.
Standard implementations of the algorithm still involve potentially neurally implausible features such as identical forward and backward weights, backward nonlinear derivatives, and 1-1 error unit connectivity.
In this paper, we show that these features are not integral to the algorithm and can be removed either directly or through learning additional sets of parameters with Hebbian update rules without noticeable harm to learning performance.
arXiv Detail & Related papers (2020-10-02T15:21:37Z)
- Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain [62.997667081978825]
Activation Relaxation (AR) is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system.
Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, and can operate on arbitrary computation graphs.
arXiv Detail & Related papers (2020-09-11T11:56:34Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- dynoNet: a neural network architecture for learning dynamical systems [0.0]
This paper introduces a network architecture, called dynoNet, utilizing linear dynamical operators as elementary building blocks.
The back-propagation behavior of the linear dynamical operator with respect to both its parameters and its input sequence is defined.
arXiv Detail & Related papers (2020-06-03T13:10:02Z)
- Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss [0.0]
Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks.
We analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations.
arXiv Detail & Related papers (2020-02-11T15:42:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.