Deep learning with transfer functions: new applications in system
identification
- URL: http://arxiv.org/abs/2104.09839v1
- Date: Tue, 20 Apr 2021 08:58:55 GMT
- Title: Deep learning with transfer functions: new applications in system
identification
- Authors: Dario Piga, Marco Forgione, Manas Mejari
- Abstract summary: This paper presents a linear dynamical operator endowed with a well-defined and efficient back-propagation behavior for automatic derivatives computation.
The operator enables end-to-end training of structured networks containing linear transfer functions and other differentiable units.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a linear dynamical operator described in terms of a
rational transfer function, endowed with a well-defined and efficient
back-propagation behavior for automatic derivatives computation. The operator
enables end-to-end training of structured networks containing linear transfer
functions and other differentiable units by exploiting standard deep learning
software.
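As a rough illustration of what such a transfer-function layer can look like in standard deep learning software, below is a minimal PyTorch sketch of a SISO operator G(q) = B(q)/A(q) with learnable coefficients. This is not the paper's implementation: the class name and the sample-by-sample recursion are illustrative, and gradients are obtained simply by letting autograd unroll the recursion rather than through the efficient back-propagation rules the paper derives.

```python
import torch


class LinearDynamicalOperator(torch.nn.Module):
    """Discrete-time SISO transfer function G(q) = B(q) / A(q), with
    A(q) = 1 + a_1 q^-1 + ... + a_na q^-na (monic) and learnable b, a.
    Naive recursive filter; autograd provides the gradients."""

    def __init__(self, n_b: int, n_a: int):
        super().__init__()
        self.b = torch.nn.Parameter(0.01 * torch.randn(n_b))  # numerator coefficients b_0 ... b_{n_b-1}
        self.a = torch.nn.Parameter(0.01 * torch.randn(n_a))  # denominator coefficients a_1 ... a_{n_a}

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        """Filter a 1-D input sequence u of length T and return y of length T."""
        n_b, n_a = self.b.shape[0], self.a.shape[0]
        y = []
        for t in range(u.shape[0]):
            # y[t] = sum_i b_i * u[t-i] - sum_j a_j * y[t-j]
            y_t = sum(self.b[i] * u[t - i] for i in range(n_b) if t - i >= 0)
            y_t = y_t - sum(self.a[j] * y[t - 1 - j] for j in range(n_a) if t - 1 - j >= 0)
            y.append(y_t)
        return torch.stack(y)
```

On long sequences, unrolling the recursion this way is slow in both the forward and backward pass; the well-defined and efficient back-propagation behavior claimed in the abstract is what makes such an operator practical as a network layer.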
Two relevant applications of the operator in system identification are
presented. The first one consists in the integration of prediction error
methods in deep learning. The dynamical operator is included as the last
layer of a neural network in order to obtain the optimal one-step-ahead
prediction error.
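One way to picture this first application, reusing the LinearDynamicalOperator sketched above, is the classical prediction-error setup: the residual between the measured output and the simulated model output is filtered by a learnable transfer function acting as an approximate inverse noise model, and the mean squared filtered residual is minimized. The sketch below is a hedged reading of the abstract, not the paper's code; the names G and H_inv and the synthetic data are placeholders.

```python
import torch

torch.manual_seed(0)
T = 200
u_train = torch.randn(T)  # input sequence (synthetic placeholder data)
y_train = torch.randn(T)  # measured output (synthetic placeholder data)

G = LinearDynamicalOperator(n_b=3, n_a=2)      # plant model (could be any differentiable network)
H_inv = LinearDynamicalOperator(n_b=2, n_a=2)  # approximate inverse noise model; in practice it would be
                                               # constrained (e.g. monic) to rule out the trivial e = 0 solution

optimizer = torch.optim.Adam(list(G.parameters()) + list(H_inv.parameters()), lr=1e-3)

for epoch in range(100):
    y_sim = G(u_train)          # simulated output
    e = H_inv(y_train - y_sim)  # filtered residual, playing the role of the one-step-ahead prediction error
    loss = torch.mean(e ** 2)   # prediction-error cost
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```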
The second one considers identification of general block-oriented models from
quantized data. These block-oriented models are constructed by combining linear
dynamical operators with static nonlinearities described as standard
feed-forward neural networks. A custom loss function corresponding to the
log-likelihood of quantized output observations is defined. For gradient-based
optimization, the derivatives of the log-likelihood are computed by applying
the back-propagation algorithm through the whole network. Two system
identification benchmarks are used to show the effectiveness of the proposed
methodologies.
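To make the second application concrete, the custom loss can be sketched as follows, under the assumption (mine, but consistent with the abstract) that the continuous model output is corrupted by additive Gaussian noise of standard deviation sigma before being quantized. The probability of observing a given quantization level is then the Gaussian mass inside that level's interval, and the loss is the negative log-likelihood summed over time. The function name and signature are illustrative, not the paper's code.

```python
import torch


def quantized_output_nll(y_hat: torch.Tensor,
                         lower: torch.Tensor,
                         upper: torch.Tensor,
                         sigma: float) -> torch.Tensor:
    """Negative log-likelihood of quantized output observations.

    y_hat        : continuous model output before the quantizer, shape (T,)
    lower, upper : bounds of the quantization interval each observation fell
                   into, shape (T,); use -inf / +inf for the extreme levels
    sigma        : standard deviation of the additive Gaussian noise

    Assumes y_t = Q(y_hat_t + w_t) with w_t ~ N(0, sigma^2), so that
    P(level_t) = Phi((upper_t - y_hat_t) / sigma) - Phi((lower_t - y_hat_t) / sigma).
    """
    std_normal = torch.distributions.Normal(0.0, 1.0)
    prob = std_normal.cdf((upper - y_hat) / sigma) - std_normal.cdf((lower - y_hat) / sigma)
    return -torch.sum(torch.log(prob + 1e-12))  # small epsilon guards against log(0)
```

Since y_hat is produced by the whole block-oriented network (linear dynamical operators followed by feed-forward nonlinearities), calling backward() on this loss propagates gradients through the entire model, which is the back-propagation-through-the-whole-network step described in the abstract.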
Related papers
- DeltaPhi: Learning Physical Trajectory Residual for PDE Solving [54.13671100638092]
We propose and formulate the Physical Trajectory Residual Learning (DeltaPhi) framework.
We learn the surrogate model for the residual operator mapping based on existing neural operator networks.
We conclude that, compared to direct learning, physical residual learning is preferred for PDE solving.
arXiv Detail & Related papers (2024-06-14T07:45:07Z)
- Linearization Turns Neural Operators into Function-Valued Gaussian Processes [23.85470417458593]
We introduce a new framework for approximate Bayesian uncertainty quantification in neural operators.
Our approach can be interpreted as a probabilistic analogue of the concept of currying from functional programming.
We showcase the efficacy of our approach through applications to different types of partial differential equations.
arXiv Detail & Related papers (2024-06-07T16:43:54Z)
- A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms [64.3064050603721]
We generalize the Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields similar iterations to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
arXiv Detail & Related papers (2022-11-22T16:30:33Z)
- Lifted Bregman Training of Neural Networks [28.03724379169264]
We introduce a novel mathematical formulation for the training of feed-forward neural networks with (potentially non-smooth) proximal maps as activation functions.
This formulation is based on Bregman distances, and a key advantage is that its partial derivatives with respect to the network's parameters do not require the computation of derivatives of the network's activation functions.
We present several numerical results that demonstrate that these training approaches can be equally well or even better suited for the training of neural network-based classifiers and (denoising) autoencoders with sparse coding.
arXiv Detail & Related papers (2022-08-18T11:12:52Z)
- Consensus Function from an $L_p^q$-norm Regularization Term for its Use as Adaptive Activation Functions in Neural Networks [0.0]
We propose the definition and utilization of an implicit, parametric, non-linear activation function that adapts its shape during the training process.
This increases the space of parameters to optimize within the network, but it allows greater flexibility and generalizes the concept of neural networks.
Preliminary results show that the use of these neural networks with this type of adaptive activation functions reduces the error in regression and classification examples.
arXiv Detail & Related papers (2022-06-30T04:48:14Z)
- Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks.
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
- Relaxing the Constraints on Predictive Coding Models [62.997667081978825]
Predictive coding is an influential theory of cortical function which posits that the principal computation the brain performs is the minimization of prediction errors.
Standard implementations of the algorithm still involve potentially neurally implausible features such as identical forward and backward weights, backward nonlinear derivatives, and 1-1 error unit connectivity.
In this paper, we show that these features are not integral to the algorithm and can be removed either directly or through learning additional sets of parameters with Hebbian update rules without noticeable harm to learning performance.
arXiv Detail & Related papers (2020-10-02T15:21:37Z)
- Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain [62.997667081978825]
Activation Relaxation (AR) is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system.
Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, and can operate on arbitrary computation graphs.
arXiv Detail & Related papers (2020-09-11T11:56:34Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- dynoNet: a neural network architecture for learning dynamical systems [0.0]
This paper introduces a network architecture, called dynoNet, utilizing linear dynamical operators as elementary building blocks.
The back-propagation behavior of the linear dynamical operator with respect to both its parameters and its input sequence is defined.
arXiv Detail & Related papers (2020-06-03T13:10:02Z)
- Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss [0.0]
Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks.
We analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations.
arXiv Detail & Related papers (2020-02-11T15:42:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.