Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis
- URL: http://arxiv.org/abs/2210.13300v3
- Date: Thu, 10 Apr 2025 13:41:03 GMT
- Title: Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis
- Authors: Luca Galimberti, Anastasis Kratsios, Giulia Livieri
- Abstract summary: Several non-linear operators in analysis depend on a temporal structure which is not leveraged by contemporary neural operators. This paper introduces a deep learning model-design framework that takes suitable infinite-dimensional linear metric spaces as inputs and returns a universal sequential deep learning model adapted to these geometries. We show that our framework can uniformly approximate, on compact sets and across arbitrary finite-time horizons, H\"older or smooth trace class operators.
- Score: 7.373617024876726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several non-linear operators in stochastic analysis, such as solution maps to stochastic differential equations, depend on a temporal structure which is not leveraged by contemporary neural operators designed to approximate general maps between Banach spaces. This paper therefore proposes an operator learning solution to this open problem by introducing a deep learning model-design framework that takes suitable infinite-dimensional linear metric spaces, e.g. Banach spaces, as inputs and returns a universal \textit{sequential} deep learning model adapted to these linear geometries and specialized for the approximation of operators encoding a temporal structure. We call these models \textit{Causal Neural Operators}. Our main result states that the models produced by our framework can uniformly approximate, on compact sets and across arbitrary finite-time horizons, H\"older or smooth trace class operators which causally map sequences between given linear metric spaces. Our analysis uncovers new quantitative relationships on the latent state-space dimension of Causal Neural Operators, which even have new implications for (classical) finite-dimensional Recurrent Neural Networks. In addition, our guarantees for recurrent neural networks are tighter than the available results inherited from feedforward neural networks when approximating dynamical systems between finite-dimensional spaces.
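For intuition, below is a minimal, hedged sketch of what a causal (recurrent) neural operator of this flavour can look like in code. It is not the authors' implementation: the latent state is a function discretized on a fixed grid, the integral operator is replaced by an illustrative Fourier-mode mixing step, and all names (`SpectralMix`, `CausalNeuralOperatorCell`, `run_causally`) and hyperparameters are assumptions made purely for the example. Causality enters because the output at step t is computed from the inputs u_1, ..., u_t only.

```python
# Illustrative sketch only: a recurrent operator layer whose hidden state is a
# discretized function, standing in for the paper's Causal Neural Operators.
# Architecture details here are assumptions, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpectralMix(nn.Module):
    """Linear mixing of low Fourier modes, a stand-in for an integral operator."""

    def __init__(self, channels: int, modes: int):
        super().__init__()
        self.modes = modes
        scale = 1.0 / channels
        self.weight = nn.Parameter(
            scale * torch.randn(channels, channels, modes, dtype=torch.cfloat)
        )

    def forward(self, v: torch.Tensor) -> torch.Tensor:
        # v: (batch, channels, grid)
        v_hat = torch.fft.rfft(v, dim=-1)
        out_hat = torch.zeros_like(v_hat)
        k = min(self.modes, v_hat.shape[-1])
        out_hat[..., :k] = torch.einsum(
            "bci,coi->boi", v_hat[..., :k], self.weight[..., :k]
        )
        return torch.fft.irfft(out_hat, n=v.shape[-1], dim=-1)


class CausalNeuralOperatorCell(nn.Module):
    """One recurrent step: (input function u_t, latent function h_t) -> (h_{t+1}, y_t)."""

    def __init__(self, in_ch: int, hidden_ch: int, out_ch: int, modes: int = 16):
        super().__init__()
        self.lift = nn.Conv1d(in_ch + hidden_ch, hidden_ch, kernel_size=1)
        self.mix = SpectralMix(hidden_ch, modes)
        self.proj = nn.Conv1d(hidden_ch, out_ch, kernel_size=1)

    def forward(self, u_t, h_t):
        z = torch.cat([u_t, h_t], dim=1)             # concatenate along channels
        h_next = F.gelu(self.mix(self.lift(z)))       # operator update of the latent state
        y_t = self.proj(h_next)                       # readout at the current time step
        return h_next, y_t


def run_causally(cell, u_seq, hidden_ch):
    # u_seq: (batch, time, in_ch, grid); output at time t depends only on u_1..u_t.
    b, T, _, n = u_seq.shape
    h = u_seq.new_zeros(b, hidden_ch, n)
    outputs = []
    for t in range(T):
        h, y = cell(u_seq[:, t], h)
        outputs.append(y)
    return torch.stack(outputs, dim=1)


if __name__ == "__main__":
    # Toy check: 4 trajectories, 10 time steps, scalar input/output, 64 grid points.
    cell = CausalNeuralOperatorCell(in_ch=1, hidden_ch=32, out_ch=1)
    y = run_causally(cell, torch.randn(4, 10, 1, 64), hidden_ch=32)
    print(y.shape)  # torch.Size([4, 10, 1, 64])
```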
Related papers
- A Mathematical Analysis of Neural Operator Behaviors [0.0]
This paper presents a rigorous framework for analyzing the behaviors of neural operators.
We focus on their stability, convergence, clustering dynamics, universality, and generalization error.
We aim to offer clear and unified guidance in a single setting for the future design of neural operator-based methods.
arXiv Detail & Related papers (2024-10-28T19:38:53Z) - Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs)
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z) - The Parametric Complexity of Operator Learning [5.756283466216181]
The first contribution of this paper is to prove that, for general classes of operators characterized only by their $C^r$- or Lipschitz-regularity, operator learning suffers from a "curse of parametric complexity".
The second contribution of the paper is to prove that this general curse can be overcome for solution operators defined by the Hamilton-Jacobi equation.
A novel neural operator architecture is introduced, termed HJ-Net, which explicitly takes into account characteristic information of the underlying Hamiltonian system.
arXiv Detail & Related papers (2023-06-28T05:02:03Z) - A graph convolutional autoencoder approach to model order reduction for parametrized PDEs [0.8192907805418583]
The present work proposes a framework for nonlinear model order reduction based on a Graph Convolutional Autoencoder (GCA-ROM)
We develop a non-intrusive and data-driven nonlinear reduction approach, exploiting GNNs to encode the reduced manifold and enable fast evaluations of parametrized PDEs.
arXiv Detail & Related papers (2023-05-15T12:01:22Z) - Learning Discretized Neural Networks under Ricci Flow [51.36292559262042]
We study Discretized Neural Networks (DNNs) composed of low-precision weights and activations.
DNNs suffer from either infinite or zero gradients during training due to their non-differentiable discrete functions.
arXiv Detail & Related papers (2023-02-07T10:51:53Z) - Generalized Neural Closure Models with Interpretability [28.269731698116257]
We develop a novel and versatile methodology of unified neural partial delay differential equations.
We augment existing/low-fidelity dynamical models directly in their partial differential equation (PDE) forms with both Markovian and non-Markovian neural network (NN) closure parameterizations.
We demonstrate the new generalized neural closure models (gnCMs) framework using four sets of experiments based on advecting nonlinear waves, shocks, and ocean acidification models.
arXiv Detail & Related papers (2023-01-15T21:57:43Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Discretization Invariant Networks for Learning Maps between Neural Fields [3.09125960098955]
We present a new framework for understanding and designing discretization invariant neural networks (DI-Nets)
Our analysis establishes upper bounds on the deviation in model outputs under different finite discretizations.
We prove by construction that DI-Nets universally approximate a large class of maps between integrable function spaces.
arXiv Detail & Related papers (2022-06-02T17:44:03Z) - A Deep Learning approach to Reduced Order Modelling of Parameter Dependent Partial Differential Equations [0.2148535041822524]
We develop a constructive approach based on Deep Neural Networks for the efficient approximation of the parameter-to-solution map.
In particular, we consider parametrized advection-diffusion PDEs, and we test the methodology in the presence of strong transport fields.
arXiv Detail & Related papers (2021-03-10T17:01:42Z) - A Differential Geometry Perspective on Orthogonal Recurrent Models [56.09491978954866]
We employ tools and insights from differential geometry to offer a novel perspective on orthogonal RNNs.
We show that orthogonal RNNs may be viewed as optimizing in the space of divergence-free vector fields.
Motivated by this observation, we study a new recurrent model, which spans the entire space of vector fields.
arXiv Detail & Related papers (2021-02-18T19:39:22Z) - Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z) - Neural Operator: Graph Kernel Network for Partial Differential Equations [57.90284928158383]
The goal of this work is to generalize neural networks so that they can learn mappings between infinite-dimensional spaces (operators).
We formulate approximation of the infinite-dimensional mapping by composing nonlinear activation functions and a class of integral operators.
Experiments confirm that the proposed graph kernel network does have the desired properties and show competitive performance compared to the state of the art solvers.
arXiv Detail & Related papers (2020-03-07T01:56:20Z) - Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)