Designing Universal Causal Deep Learning Models: The Case of
Infinite-Dimensional Dynamical Systems from Stochastic Analysis
- URL: http://arxiv.org/abs/2210.13300v2
- Date: Tue, 9 May 2023 13:06:55 GMT
- Title: Designing Universal Causal Deep Learning Models: The Case of
Infinite-Dimensional Dynamical Systems from Stochastic Analysis
- Authors: Luca Galimberti, Anastasis Kratsios, Giulia Livieri
- Abstract summary: Causal operators (COs) play a central role in contemporary analysis.
There is still no canonical framework for designing Deep Learning (DL) models capable of approximating COs.
This paper proposes a "geometry-aware" solution to this open problem by introducing a DL model-design framework.
- Score: 3.5450828190071655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Causal operators (CO), such as various solution operators to stochastic
differential equations, play a central role in contemporary stochastic
analysis; however, there is still no canonical framework for designing Deep
Learning (DL) models capable of approximating COs. This paper proposes a
"geometry-aware" solution to this open problem by introducing a DL
model-design framework that takes suitable infinite-dimensional linear metric
spaces as inputs and returns a universal sequential DL model adapted to these
linear geometries. We call these models Causal Neural Operators (CNOs). Our
main result states that the models produced by our framework can uniformly
approximate on compact sets and across arbitrary finite-time horizons
Hölder or smooth trace class operators, which causally map sequences between
given linear metric spaces. Our analysis uncovers new quantitative
relationships on the latent state-space dimension of CNOs which even have new
implications for (classical) finite-dimensional Recurrent Neural Networks
(RNNs). We find that a linear increase of the CNO's (or RNN's) latent parameter
space's dimension and of its width, and a logarithmic increase of its depth
imply an exponential increase in the number of time steps for which its
approximation remains valid. A direct consequence of our analysis shows that
RNNs can approximate causal functions using exponentially fewer parameters than
ReLU networks.
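To make the notion of a causal sequence map concrete, here is a minimal sketch of a scalar recurrent network whose output at step t depends only on inputs up to step t. It is not the paper's CNO construction; the function name `causal_rnn` and the weights `w_in`, `w_h`, `w_out` are hypothetical and chosen only for illustration.

```python
import math


def causal_rnn(xs, w_in, w_h, w_out):
    """Map a scalar sequence xs to an output sequence of the same length.

    The hidden state h at step t is built only from xs[0..t], so the
    map is causal: later inputs cannot influence earlier outputs.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = math.tanh(w_in * x + w_h * h)  # recurrent state update
        ys.append(w_out * h)               # linear readout
    return ys


# Illustrative (assumed) weights and input sequence.
w_in, w_h, w_out = 0.8, 0.5, 1.5
xs = [0.1, -0.4, 0.7, 0.2, -0.9, 0.3, 0.6, -0.2]

# Perturb the inputs only from step t0 onward.
t0 = 5
xs_perturbed = xs[:t0] + [x + 1.0 for x in xs[t0:]]

ys = causal_rnn(xs, w_in, w_h, w_out)
ys_perturbed = causal_rnn(xs_perturbed, w_in, w_h, w_out)

# Causality check: outputs strictly before t0 are unaffected.
outputs_before_t0_match = ys[:t0] == ys_perturbed[:t0]
print(outputs_before_t0_match)
```

Changing the inputs from step `t0` onward leaves the first `t0` outputs untouched, which is the defining property of a causal map between sequences.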
Related papers
- A graph convolutional autoencoder approach to model order reduction for
parametrized PDEs [0.8192907805418583]
The present work proposes a framework for nonlinear model order reduction based on a Graph Convolutional Autoencoder (GCA-ROM).
We develop a non-intrusive and data-driven nonlinear reduction approach, exploiting GNNs to encode the reduced manifold and enable fast evaluations of parametrized PDEs.
arXiv Detail & Related papers (2023-05-15T12:01:22Z)
- Learning Discretized Neural Networks under Ricci Flow [51.36292559262042]
We study Discretized Neural Networks (DNNs) composed of low-precision weights and activations.
DNNs suffer from either infinite or zero gradients due to the non-differentiable discrete function during training.
arXiv Detail & Related papers (2023-02-07T10:51:53Z)
- Generalized Neural Closure Models with Interpretability [28.269731698116257]
We develop a novel and versatile methodology of unified neural partial delay differential equations.
We augment existing/low-fidelity dynamical models directly in their partial differential equation (PDE) forms with both Markovian and non-Markovian neural network (NN) closure parameterizations.
We demonstrate the new generalized neural closure models (gnCMs) framework using four sets of experiments based on advecting nonlinear waves, shocks, and ocean acidification models.
arXiv Detail & Related papers (2023-01-15T21:57:43Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent
Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- A Deep Learning approach to Reduced Order Modelling of Parameter
Dependent Partial Differential Equations [0.2148535041822524]
We develop a constructive approach based on Deep Neural Networks for the efficient approximation of the parameter-to-solution map.
In particular, we consider parametrized advection-diffusion PDEs, and we test the methodology in the presence of strong transport fields.
arXiv Detail & Related papers (2021-03-10T17:01:42Z)
- A Differential Geometry Perspective on Orthogonal Recurrent Models [56.09491978954866]
We employ tools and insights from differential geometry to offer a novel perspective on orthogonal RNNs.
We show that orthogonal RNNs may be viewed as optimizing in the space of divergence-free vector fields.
Motivated by this observation, we study a new recurrent model, which spans the entire space of vector fields.
arXiv Detail & Related papers (2021-02-18T19:39:22Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Multipole Graph Neural Operator for Parametric Partial Differential
Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)