Metric-Entropy Limits on Nonlinear Dynamical System Learning
- URL: http://arxiv.org/abs/2407.01250v1
- Date: Mon, 1 Jul 2024 12:57:03 GMT
- Title: Metric-Entropy Limits on Nonlinear Dynamical System Learning
- Authors: Yang Pan, Clemens Hutter, Helmut Bölcskei
- Abstract summary: We show that recurrent neural networks (RNNs) can learn, in a metric-entropy-optimal manner, nonlinear systems that satisfy a Lipschitz property and forget past inputs sufficiently fast.
As the sets of sequence-to-sequence maps we consider are significantly more massive than function classes generally considered in deep neural network approximation theory, a refined metric-entropy characterization is needed.
- Score: 4.069144210024563
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper is concerned with the fundamental limits of nonlinear dynamical system learning from input-output traces. Specifically, we show that recurrent neural networks (RNNs) are capable of learning nonlinear systems that satisfy a Lipschitz property and forget past inputs fast enough in a metric-entropy optimal manner. As the sets of sequence-to-sequence maps realized by the dynamical systems we consider are significantly more massive than function classes generally considered in deep neural network approximation theory, a refined metric-entropy characterization is needed, namely in terms of order, type, and generalized dimension. We compute these quantities for the classes of exponentially-decaying and polynomially-decaying Lipschitz fading-memory systems and show that RNNs can achieve them.
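For orientation, here is a minimal sketch of the two ingredients appearing in the abstract, using standard definitions (the paper's precise formulation, norms, and constants may differ): metric entropy is the logarithm of the covering number of the system class, and an exponentially-decaying fading-memory Lipschitz condition bounds how strongly far-past inputs can influence the current output.

```latex
% Covering number and metric entropy of a class \mathcal{C} in a metric space (\mathcal{X},\rho)
N(\epsilon;\mathcal{C},\rho) = \min\Big\{ N : \exists\, f_1,\dots,f_N \in \mathcal{X}
  \ \text{s.t.}\ \mathcal{C} \subseteq \textstyle\bigcup_{i=1}^{N} B_\rho(f_i,\epsilon) \Big\},
\qquad
H(\epsilon;\mathcal{C},\rho) = \log_2 N(\epsilon;\mathcal{C},\rho).

% Schematic exponentially-decaying fading-memory Lipschitz condition on a causal
% sequence-to-sequence map T (illustrative form only)
\big|(Tx)_t - (T\tilde{x})_t\big| \;\le\; L \sum_{k=0}^{t} \lambda^{k}\, |x_{t-k} - \tilde{x}_{t-k}|,
\qquad L > 0,\ 0 < \lambda < 1.
```

The polynomially-decaying class mentioned in the abstract would replace the geometric weights \lambda^k by weights decaying polynomially in k.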
Related papers
- Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH).
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH.
arXiv Detail & Related papers (2024-08-20T15:04:37Z)
- Koopman-based Deep Learning for Nonlinear System Estimation [1.3791394805787949]
We present a novel data-driven linear estimator based on Koopman operator theory to extract meaningful finite-dimensional representations of complex non-linear systems.
Our estimator is also adaptive to a diffeomorphic transformation of the estimated nonlinear system, which enables it to compute optimal state estimates without re-learning.
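A minimal extended-DMD-style sketch of the underlying idea of fitting a linear operator on lifted states from data (a generic Koopman-based estimator, not the specific estimator proposed in the paper; the lifting functions and toy dynamics below are assumptions for illustration):

```python
import numpy as np

def edmd_fit(X, Y, lift):
    """Fit a linear operator K such that lift(y) ~ K @ lift(x) for snapshot pairs
    (x, y) with y = step(x) generated by unknown nonlinear dynamics."""
    PsiX = np.stack([lift(x) for x in X])   # (n_samples, n_features)
    PsiY = np.stack([lift(y) for y in Y])
    # Least squares: PsiY ~ PsiX @ B, so lifted states evolve as psi_next ~ B.T @ psi
    B, *_ = np.linalg.lstsq(PsiX, PsiY, rcond=None)
    return B.T

# Hypothetical example: scalar state lifted with monomials up to degree 2
lift = lambda x: np.array([1.0, x[0], x[0] ** 2])

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
Y = 0.9 * X - 0.1 * X ** 2               # toy nonlinear map x_{t+1} = 0.9 x_t - 0.1 x_t^2
K = edmd_fit(X, Y, lift)
psi_next = K @ lift(np.array([0.5]))      # one-step prediction in the lifted (linear) space
```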
arXiv Detail & Related papers (2024-05-01T16:49:54Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Tractable Dendritic RNNs for Reconstructing Nonlinear Dynamical Systems [7.045072177165241]
We augment a piecewise-linear recurrent neural network (RNN) with a linear spline basis expansion.
We show that this approach retains all the theoretically appealing properties of the simple PLRNN, yet boosts its capacity for approximating arbitrary nonlinear dynamical systems in comparatively low dimensions.
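Schematically, the construction amounts to replacing the single ReLU nonlinearity of a piecewise-linear RNN by a linear spline basis expansion (an illustrative form; the exact parameterization, thresholds, and input terms in the paper may differ):

```latex
% Plain piecewise-linear RNN (PLRNN) latent update
z_t = A z_{t-1} + W \max(0,\, z_{t-1}) + C s_t + h
% Spline-augmented ("dendritic") variant: B shifted ReLU basis functions per unit
z_t = A z_{t-1} + W \sum_{b=1}^{B} \alpha_b \max(0,\, z_{t-1} - h_b) + C s_t + h
```

The extra thresholds h_b let each unit realize a piecewise-linear function with several breakpoints, which is where the added capacity in low dimensions comes from.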
arXiv Detail & Related papers (2022-07-06T09:43:03Z)
- Learn Like The Pro: Norms from Theory to Size Neural Computation [3.848947060636351]
We investigate how dynamical systems with nonlinearities can inform the design of neural systems that seek to emulate them.
We propose a Learnability metric and relate its associated features to the near-equilibrium behavior of the learning dynamics.
It reveals exact sizing for a class of neural networks with multiplicative nodes that mimic continuous- or discrete-time dynamics.
arXiv Detail & Related papers (2021-06-21T20:58:27Z)
- Metric Entropy Limits on Recurrent Neural Network Learning of Linear Dynamical Systems [0.0]
We show that RNNs can optimally learn, or identify in system-theory parlance, stable linear time-invariant (LTI) systems.
For LTI systems whose input-output relation is characterized through a difference equation, this means that RNNs can learn the difference equation from input-output traces in a metric-entropy optimal manner.
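For concreteness, a generic input-output difference equation of the kind referred to (a textbook LTI form; the orders p, q and coefficients are placeholders):

```latex
y[n] = -\sum_{k=1}^{p} a_k\, y[n-k] \;+\; \sum_{k=0}^{q} b_k\, x[n-k]
```

Learning the system then amounts to recovering the coefficients (a_k, b_k), or an equivalent realization, from observed input-output pairs (x[n], y[n]).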
arXiv Detail & Related papers (2021-05-06T10:12:30Z)
- Linear embedding of nonlinear dynamical systems and prospects for efficient quantum algorithms [74.17312533172291]
We describe a method for mapping any finite nonlinear dynamical system to an infinite linear dynamical system (embedding).
We then explore an approach for approximating the resulting infinite linear system with finite linear systems (truncation).
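One standard instance of such an embedding is Carleman linearization (whether this is the exact construction used in the paper is not clear from the snippet above): for the scalar system dx/dt = -x + x^2, the monomial coordinates y_k = x^k satisfy an infinite but linear system,

```latex
\dot{y}_k = k\, x^{k-1}\, \dot{x} = -k\, y_k + k\, y_{k+1}, \qquad k = 1, 2, \dots
```

Truncation then keeps the first N equations and sets y_{N+1} \approx 0, yielding a finite linear system that approximates the nonlinear dynamics near the origin.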
arXiv Detail & Related papers (2020-12-12T00:01:10Z)
- Statistical Mechanics of Deep Linear Neural Networks: The Back-Propagating Renormalization Group [4.56877715768796]
We study the statistical mechanics of learning in Deep Linear Neural Networks (DLNNs) in which the input-output function of an individual unit is linear.
We solve exactly the network properties following supervised learning using an equilibrium Gibbs distribution in the weight space.
Our numerical simulations reveal that despite the nonlinearity, the predictions of our theory are largely shared by ReLU networks with modest depth.
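As a schematic of what an equilibrium Gibbs distribution in weight space means here (a generic statistical-mechanics form; the paper's specific energy function and ensemble may differ): after training, the weights W are treated as distributed according to

```latex
P(W) \;\propto\; \exp\!\big(-\beta\, E(W)\big), \qquad
E(W) = \sum_{\mu=1}^{P} \big\| y^{\mu} - f_W(x^{\mu}) \big\|^2,
```

with inverse temperature \beta controlling how sharply the distribution concentrates on weights that fit the P training examples.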
arXiv Detail & Related papers (2020-12-07T20:08:31Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
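A common way to turn such a linear operator (conditional moment) equation into a min-max game is the following schematic objective (a generic adversarial conditional-moment formulation; the paper's exact objective and regularization may differ), with structural function f and critic g both parameterized by neural networks:

```latex
\min_{f}\; \max_{g}\;\; \mathbb{E}\big[(y - f(x))\, g(z)\big] \;-\; \tfrac{1}{2}\, \mathbb{E}\big[g(z)^2\big],
```

where the inner maximization over test functions g enforces the moment restriction E[y - f(x) | z] = 0.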
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
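The construction can be sketched with the published liquid time-constant cell equation (details of gating and parameters may differ across variants): each hidden state x follows a linear first-order ODE whose decay rate is modulated by a learned nonlinearity f,

```latex
\frac{dx(t)}{dt} = -\Big[\frac{1}{\tau} + f\big(x(t), I(t); \theta\big)\Big]\, x(t)
                   \;+\; f\big(x(t), I(t); \theta\big)\, A,
```

so the effective time constant \tau_{\mathrm{sys}} = \tau / (1 + \tau f(\cdot)) varies with the input I(t), which underlies the stability and boundedness claims.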
arXiv Detail & Related papers (2020-06-08T09:53:35Z)