Learning Recurrent Neural Net Models of Nonlinear Systems
- URL: http://arxiv.org/abs/2011.09573v4
- Date: Tue, 16 Nov 2021 19:57:23 GMT
- Title: Learning Recurrent Neural Net Models of Nonlinear Systems
- Authors: Joshua Hanson, Maxim Raginsky, and Eduardo Sontag
- Abstract summary: Given sample pairs of input and output signals generated by an unknown nonlinear system, we find a continuous-time recurrent neural net with hyperbolic tangent activation function that approximately reproduces the underlying i/o behavior with high confidence.
We derive quantitative guarantees on the sup-norm risk of the learned model in terms of the number of neurons, the sample size, the number of derivatives being matched, and the regularity properties of the inputs, the outputs, and the unknown i/o map.
- Score: 10.5811404306981
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the following learning problem: Given sample pairs of input and
output signals generated by an unknown nonlinear system (which is not assumed
to be causal or time-invariant), we wish to find a continuous-time recurrent
neural net with hyperbolic tangent activation function that approximately
reproduces the underlying i/o behavior with high confidence. Leveraging earlier
work concerned with matching output derivatives up to a given finite order, we
reformulate the learning problem in familiar system-theoretic language and
derive quantitative guarantees on the sup-norm risk of the learned model in
terms of the number of neurons, the sample size, the number of derivatives
being matched, and the regularity properties of the inputs, the outputs, and
the unknown i/o map.
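To make the model class concrete, here is a minimal sketch of a continuous-time recurrent neural net with hyperbolic tangent activation, simulated with a forward-Euler step. The particular parametrization (state equation x' = tanh(Ax + Bu + b) with linear readout y = Cx) and all dimensions are illustrative assumptions, not necessarily the exact form analyzed in the paper.

```python
import numpy as np

def simulate_ct_rnn(A, B, C, b, u, x0, dt):
    """Forward-Euler simulation of a continuous-time tanh RNN:
        x'(t) = tanh(A x(t) + B u(t) + b),    y(t) = C x(t).
    u: (T, m) array sampling the input signal on a grid with step dt."""
    x = np.asarray(x0, dtype=float)
    ys = []
    for u_t in u:
        ys.append(C @ x)                               # linear readout of the state
        x = x + dt * np.tanh(A @ x + B @ u_t + b)      # Euler step of the state ODE
    return np.array(ys)

# Toy usage: a 5-neuron net driven by a scalar sinusoidal input.
rng = np.random.default_rng(0)
n, m, p = 5, 1, 1
A = rng.normal(size=(n, n)) / n
B = rng.normal(size=(n, m))
C = rng.normal(size=(p, n))
t = np.linspace(0.0, 10.0, 1000)
u = np.sin(t)[:, None]
y = simulate_ct_rnn(A, B, C, np.zeros(n), u, x0=np.zeros(n), dt=t[1] - t[0])
```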
Related papers
- Metric-Entropy Limits on Nonlinear Dynamical System Learning [4.069144210024563] (2024-07-01)
We show that recurrent neural networks (RNNs) are capable of learning, in a metric-entropy-optimal manner, nonlinear systems that satisfy a Lipschitz property and forget past inputs sufficiently fast.
As the sets of sequence-to-sequence maps we consider are significantly more massive than function classes generally considered in deep neural network approximation theory, a refined metric-entropy characterization is needed.
- Learning Linearized Models from Nonlinear Systems with Finite Data [1.6026317505839445] (2023-09-15)
We consider the problem of identifying a linearized model when the true underlying dynamics is nonlinear.
We provide a deterministic, multiple-trajectory-based data acquisition algorithm followed by a regularized least-squares algorithm.
Our error bound demonstrates a trade-off between the error due to nonlinearity and the error due to noise, and shows that one can learn the linearized dynamics with arbitrarily small error.
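As a rough illustration of the fitting step in the entry above, the following is a ridge-regularized least-squares estimate of a discrete-time linear model x_{k+1} ≈ A x_k + B u_k from several trajectories. The paper's specific data-acquisition scheme, regularizer, and error bounds are not reproduced; the function name and the ridge penalty are assumptions made for the sketch.

```python
import numpy as np

def fit_linearized_model(states, inputs, lam=1e-3):
    """Ridge-regularized least squares for x_{k+1} ~ A x_k + B u_k.
    states: list of (T, n) state trajectories; inputs: list of (T, m) input trajectories."""
    Z, Xn = [], []
    for X, U in zip(states, inputs):
        Z.append(np.hstack([X[:-1], U[:-1]]))   # regressors [x_k, u_k]
        Xn.append(X[1:])                        # targets x_{k+1}
    Z, Xn = np.vstack(Z), np.vstack(Xn)
    n = Xn.shape[1]
    # Normal equations with ridge term: (Z^T Z + lam I) W = Z^T Xn.
    W = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ Xn)
    return W[:n].T, W[n:].T                     # A is (n, n), B is (n, m)
```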
- Learning Linear Causal Representations from Interventions under General Nonlinear Mixing [52.66151568785088] (2023-06-04)
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
- MP-GELU Bayesian Neural Networks: Moment Propagation by GELU Nonlinearity [0.0] (2022-11-24)
We propose a novel nonlinear function, the moment-propagating Gaussian error linear unit (MP-GELU), that enables fast derivation of the first and second moments in BNNs.
MP-GELU provides higher prediction accuracy and better-quality uncertainty estimates, with faster execution, than ReLU-based BNNs.
- Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations [114.17826109037048] (2022-10-12)
Ordinary Differential Equations (ODEs) have recently gained a lot of attention in machine learning.
However, theoretical aspects, e.g., identifiability and the properties of statistical estimation, are still obscure.
This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a single trajectory.
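To see why equally spaced, error-free samples are a natural setting, note that consecutive samples of x' = Ax satisfy x((k+1)Δ) = exp(AΔ) x(kΔ): one can estimate the transition matrix by least squares and invert it with a matrix logarithm, and non-uniqueness of that logarithm is exactly where identifiability can fail. The sketch below is an illustrative reconstruction under these assumptions, not the paper's estimator or its sufficient condition.

```python
import numpy as np
from scipy.linalg import expm, logm

def recover_ode_matrix(X, dt):
    """Estimate A from equally spaced, noise-free samples X[k] = x(k*dt) of x' = A x."""
    X0, X1 = X[:-1].T, X[1:].T               # columns are consecutive samples
    M = X1 @ np.linalg.pinv(X0)              # least-squares estimate of exp(A*dt)
    return logm(M).real / dt                 # principal matrix logarithm

# Toy check with a stable 3x3 system and a single trajectory.
rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) - 2.0 * np.eye(3)
dt, K, x = 0.1, 50, rng.normal(size=3)
M_true = expm(A * dt)
X = np.array([np.linalg.matrix_power(M_true, k) @ x for k in range(K)])
A_hat = recover_ode_matrix(X, dt)            # close to A for a generic trajectory
```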
- Distributed Nonlinear State Estimation in Electric Power Systems using Graph Neural Networks [1.1470070927586016] (2022-07-23)
This paper introduces an original graph neural network (GNN)-based state estimation (SE) implementation over the augmented factor graph of the nonlinear power system SE problem.
The proposed regression model has linear computational complexity during the inference time once trained, with a possibility of distributed implementation.
- Exploring Linear Feature Disentanglement For Neural Networks [63.20827189693117] (2022-03-22)
Non-linear activation functions, e.g., Sigmoid, ReLU, and Tanh, have achieved great success in neural networks (NNs).
Due to the complex non-linear characteristics of samples, the objective of these activation functions is to project samples from their original feature space into a linearly separable feature space.
This phenomenon ignites our interest in exploring whether all features need to be transformed by all non-linear functions in current typical NNs.
- Recurrent Neural Network Training with Convex Loss and Regularization Functions by Extended Kalman Filtering [0.20305676256390928] (2021-11-04)
We show that the learning method outperforms gradient descent in a nonlinear system identification benchmark.
We also explore the use of the algorithm in data-driven nonlinear model predictive control and its relation with disturbance models for offset-free tracking.
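For context on the entry above: the core of extended-Kalman-filter training is to treat the network weights as the state of a random-walk system and correct them using the output error. The sketch below shows one generic EKF parameter update; the paper's handling of general convex losses and regularization, and the RNN-specific Jacobian computation, are not reproduced, and the function signature is an assumption.

```python
import numpy as np

def ekf_param_update(theta, P, y, y_hat, H, R, Q):
    """One EKF step for parameter estimation:
       dynamics  theta_k = theta_{k-1} + w_k,  w_k ~ N(0, Q)   (random walk)
       output    y_k     = h(theta_k)  + v_k,  v_k ~ N(0, R)
    theta: weights, P: their covariance, y_hat = h(theta), H = dh/dtheta."""
    P = P + Q                                  # predict: weight uncertainty grows
    S = H @ P @ H.T + R                        # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
    theta = theta + K @ (y - y_hat)            # correct weights with the output error
    P = (np.eye(theta.size) - K @ H) @ P       # update weight covariance
    return theta, P
```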
- Consistency of mechanistic causal discovery in continuous-time using Neural ODEs [85.7910042199734] (2021-05-06)
We consider causal discovery in continuous-time for the study of dynamical systems.
We propose a causal discovery algorithm based on penalized Neural ODEs.
- Neural ODE Processes [64.10282200111983] (2021-03-23)
We introduce Neural ODE Processes (NDPs), a new class of processes determined by a distribution over Neural ODEs.
We show that our model can successfully capture the dynamics of low-dimensional systems from just a few data points.
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797] (2020-06-16)
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that $L_1$ and $L_2$ regularizations suppress the increase of model complexity.