Simultaneously Solving FBSDEs with Neural Operators of Logarithmic Depth, Constant Width, and Sub-Linear Rank
- URL: http://arxiv.org/abs/2410.14788v1
- Date: Fri, 18 Oct 2024 18:01:40 GMT
- Title: Simultaneously Solving FBSDEs with Neural Operators of Logarithmic Depth, Constant Width, and Sub-Linear Rank
- Authors: Takashi Furuya, Anastasis Kratsios
- Abstract summary: Forward-backwards stochastic differential equations (FBSDEs) are central in optimal control, game theory, economics, and mathematical finance.
We show that ``small'' NOs can uniformly approximate the solution operator to structured families of FBSDEs.
We also show that convolutional NOs of similar depth, width, and rank can approximate the solution operator to a broad class of Elliptic PDEs.
- Score: 8.517406772939292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Forward-backwards stochastic differential equations (FBSDEs) are central in optimal control, game theory, economics, and mathematical finance. Unfortunately, the available FBSDE solvers operate on \textit{individual} FBSDEs, meaning that they cannot provide a computationally feasible strategy for solving large families of FBSDEs as these solvers must be re-run several times. \textit{Neural operators} (NOs) offer an alternative approach for \textit{simultaneously solving} large families of FBSDEs by directly approximating the solution operator mapping \textit{inputs:} terminal conditions and dynamics of the backwards process to \textit{outputs:} solutions to the associated FBSDE. Though universal approximation theorems (UATs) guarantee the existence of such NOs, these NOs are unrealistically large. We confirm that ``small'' NOs can uniformly approximate the solution operator to structured families of FBSDEs with random terminal time, uniformly on suitable compact sets determined by Sobolev norms, to any prescribed error $\varepsilon>0$ using a depth of $\mathcal{O}(\log(1/\varepsilon))$, a width of $\mathcal{O}(1)$, and a sub-linear rank; i.e. $\mathcal{O}(1/\varepsilon^r)$ for some $r<1$. This result is rooted in our second main contribution, which shows that convolutional NOs of similar depth, width, and rank can approximate the solution operator to a broad class of Elliptic PDEs. A key insight here is that the convolutional layers of our NO can efficiently encode the Green's function associated to the Elliptic PDEs linked to our FBSDEs. A byproduct of our analysis is the first theoretical justification for the benefit of lifting channels in NOs: they exponentially decelerate the growth rate of the NO's rank.
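The size regime in the abstract (logarithmic depth, constant width, lifting channels, convolution-based integral kernels) can be illustrated with a minimal sketch. The sketch below assumes a PyTorch implementation; the class name SmallConvNO, the channel width of 16, the kernel size, and the 1D grid discretization are illustrative choices, not the authors' construction.

```python
# A minimal sketch, assuming PyTorch, of a "small" convolutional neural operator
# in the size regime the abstract describes: depth growing like log(1/eps),
# constant channel width, a lifting layer, and convolutional (integral-kernel)
# blocks. Class name, width, kernel size, and grid size are illustrative.
import math

import torch
import torch.nn as nn


class SmallConvNO(nn.Module):
    def __init__(self, eps: float = 1e-2, width: int = 16, kernel_size: int = 5):
        super().__init__()
        depth = max(1, math.ceil(math.log(1.0 / eps)))     # O(log(1/eps)) blocks
        self.lift = nn.Conv1d(1, width, kernel_size=1)     # lifting channels
        self.blocks = nn.ModuleList(
            nn.Conv1d(width, width, kernel_size, padding=kernel_size // 2)
            for _ in range(depth)                          # constant width per block
        )
        self.act = nn.GELU()
        self.project = nn.Conv1d(width, 1, kernel_size=1)  # back to one channel

    def forward(self, g: torch.Tensor) -> torch.Tensor:
        # g: (batch, 1, n_grid), e.g. terminal conditions sampled on a 1D grid.
        h = self.lift(g)
        for conv in self.blocks:
            h = self.act(conv(h)) + h                      # residual conv block
        return self.project(h)                             # approximate solutions


if __name__ == "__main__":
    model = SmallConvNO(eps=1e-2)       # depth = ceil(log 100) = 5
    g = torch.randn(4, 1, 128)          # four input functions on a 128-point grid
    u = model(g)
    print(u.shape)                      # torch.Size([4, 1, 128])
```

Here the convolutional blocks stand in for the integral kernels that, per the abstract, can efficiently encode the Green's function of the associated elliptic PDE, and the lifting layer mirrors the lifting channels whose rank-controlling benefit the paper quantifies.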
Related papers
- Sunflowers and Ramsey problems for restricted intersections [1.2201929092786905]
As an application of our techniques, we obtain a variant of Füredi's celebrated semilattice lemma, which is a key tool in the powerful delta-system method.
arXiv Detail & Related papers (2025-04-21T17:46:21Z) - Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization [77.3396841985172]
We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems.
Our contribution is to design TTGDA algorithms that are effective beyond the convex-concave setting.
arXiv Detail & Related papers (2024-08-21T20:14:54Z) - Accelerated Variance-Reduced Forward-Reflected Methods for Root-Finding Problems [8.0153031008486]
We propose a novel class of Nesterov's accelerated forward-reflected-based methods with variance reduction to solve root-finding problems.
Our algorithm is single-loop and leverages a new family of unbiased variance-reduced estimators specifically designed for root-finding problems.
arXiv Detail & Related papers (2024-06-04T15:23:29Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Nearly Optimal Regret for Decentralized Online Convex Optimization [53.433398074919]
Decentralized online convex optimization (D-OCO) aims to minimize a sequence of global loss functions using only local computations and communications.
We develop novel D-OCO algorithms that can respectively reduce the regret bounds for convex and strongly convex functions.
Our algorithms are nearly optimal in terms of $T$, $n$, and $\rho$.
arXiv Detail & Related papers (2024-02-14T13:44:16Z) - An Optimal Stochastic Algorithm for Decentralized Nonconvex Finite-sum
Optimization [25.21457349137344]
We prove that DEAREST requires at most $\mathcal{O}(m+\sqrt{mn}L\varepsilon^{-2})$ incremental first-order oracle (IFO) calls and $\mathcal{O}(L\varepsilon^{-2}/\sqrt{1-\lambda_W})$ communication rounds.
arXiv Detail & Related papers (2022-10-25T11:37:11Z) - Neural Network Approximations of PDEs Beyond Linearity: A
Representational Perspective [40.964402478629495]
We take a step towards studying the representational power of neural networks for approximating solutions to nonlinear PDEs.
Treating a class of PDEs known as \emph{nonlinear elliptic variational PDEs}, our results show neural networks can evade the curse of dimensionality.
arXiv Detail & Related papers (2022-10-21T16:53:18Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\,\mathrm{opt} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - DASHA: Distributed Nonconvex Optimization with Communication
Compression, Optimal Oracle Complexity, and No Client Synchronization [77.34726150561087]
We develop and analyze DASHA: a new family of methods for nonconvex distributed optimization problems.
Unlike MARINA, the new methods DASHA and DASHA-MVR send compressed vectors only and never synchronize the nodes, which makes them more practical for federated learning.
arXiv Detail & Related papers (2022-02-02T20:10:40Z) - Deep Network Approximation for Smooth Functions [9.305095040004156]
We show that deep ReLU networks of width $\mathcal{O}(N\ln N)$ and depth $\mathcal{O}(L\ln L)$ can approximate $f\in C^s([0,1]^d)$ with a nearly optimal approximation error.
Our estimate is non-asymptotic in the sense that it is valid for arbitrary width and depth specified by $N\in\mathbb{N}^+$ and $L\in\mathbb{N}^+$, respectively.
arXiv Detail & Related papers (2020-01-09T15:06:10Z) - On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems [86.92205445270427]
We consider solving nonconvex-concave minimax problems, $\min_{\mathbf{x}}\max_{\mathbf{y}\in\mathcal{Y}} f(\mathbf{x},\mathbf{y})$, efficiently.
arXiv Detail & Related papers (2019-06-02T03:03:45Z)