Simultaneously Solving FBSDEs with Neural Operators of Logarithmic Depth, Constant Width, and Sub-Linear Rank
- URL: http://arxiv.org/abs/2410.14788v1
- Date: Fri, 18 Oct 2024 18:01:40 GMT
- Title: Simultaneously Solving FBSDEs with Neural Operators of Logarithmic Depth, Constant Width, and Sub-Linear Rank
- Authors: Takashi Furuya, Anastasis Kratsios
- Abstract summary: Forward-backwards stochastic differential equations (FBSDEs) are central in optimal control, game theory, economics, and mathematical finance.
We show that ``small'' NOs can uniformly approximate the solution operator to structured families of FBSDEs.
We also show that convolutional NOs of similar depth, width, and rank can approximate the solution operator to a broad class of Elliptic PDEs.
- Score: 8.517406772939292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Forward-backwards stochastic differential equations (FBSDEs) are central in optimal control, game theory, economics, and mathematical finance. Unfortunately, the available FBSDE solvers operate on \textit{individual} FBSDEs, meaning that they cannot provide a computationally feasible strategy for solving large families of FBSDEs as these solvers must be re-run several times. \textit{Neural operators} (NOs) offer an alternative approach for \textit{simultaneously solving} large families of FBSDEs by directly approximating the solution operator mapping \textit{inputs:} terminal conditions and dynamics of the backwards process to \textit{outputs:} solutions to the associated FBSDE. Though universal approximation theorems (UATs) guarantee the existence of such NOs, these NOs are unrealistically large. We confirm that ``small'' NOs can uniformly approximate the solution operator to structured families of FBSDEs with random terminal time, uniformly on suitable compact sets determined by Sobolev norms, to any prescribed error $\varepsilon>0$ using a depth of $\mathcal{O}(\log(1/\varepsilon))$, a width of $\mathcal{O}(1)$, and a sub-linear rank; i.e. $\mathcal{O}(1/\varepsilon^r)$ for some $r<1$. This result is rooted in our second main contribution, which shows that convolutional NOs of similar depth, width, and rank can approximate the solution operator to a broad class of Elliptic PDEs. A key insight here is that the convolutional layers of our NO can efficiently encode the Green's function associated to the Elliptic PDEs linked to our FBSDEs. A byproduct of our analysis is the first theoretical justification for the benefit of lifting channels in NOs: they exponentially decelerate the growth rate of the NO's rank.
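The size regime in the abstract (logarithmic depth, constant width, lifting channels, convolution-based integral kernels) can be illustrated with a minimal sketch. The sketch below assumes a PyTorch implementation; the class name SmallConvNO, the channel width of 16, the kernel size, and the 1D grid discretization are illustrative choices, not the authors' construction.

```python
# A minimal sketch, assuming PyTorch, of a "small" convolutional neural operator
# in the size regime the abstract describes: depth growing like log(1/eps),
# constant channel width, a lifting layer, and convolutional (integral-kernel)
# blocks. Class name, width, kernel size, and grid size are illustrative.
import math

import torch
import torch.nn as nn


class SmallConvNO(nn.Module):
    def __init__(self, eps: float = 1e-2, width: int = 16, kernel_size: int = 5):
        super().__init__()
        depth = max(1, math.ceil(math.log(1.0 / eps)))     # O(log(1/eps)) blocks
        self.lift = nn.Conv1d(1, width, kernel_size=1)     # lifting channels
        self.blocks = nn.ModuleList(
            nn.Conv1d(width, width, kernel_size, padding=kernel_size // 2)
            for _ in range(depth)                          # constant width per block
        )
        self.act = nn.GELU()
        self.project = nn.Conv1d(width, 1, kernel_size=1)  # back to one channel

    def forward(self, g: torch.Tensor) -> torch.Tensor:
        # g: (batch, 1, n_grid), e.g. terminal conditions sampled on a 1D grid.
        h = self.lift(g)
        for conv in self.blocks:
            h = self.act(conv(h)) + h                      # residual conv block
        return self.project(h)                             # approximate solutions


if __name__ == "__main__":
    model = SmallConvNO(eps=1e-2)       # depth = ceil(log 100) = 5
    g = torch.randn(4, 1, 128)          # four input functions on a 128-point grid
    u = model(g)
    print(u.shape)                      # torch.Size([4, 1, 128])
```

Here the convolutional blocks stand in for the integral kernels that, per the abstract, can efficiently encode the Green's function of the associated elliptic PDE, and the lifting layer mirrors the lifting channels whose rank-controlling benefit the paper quantifies.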
Related papers
- Sunflowers and Ramsey problems for restricted intersections [1.2201929092786905]
As an application of our techniques, we obtain a variant of Füredi's celebrated semilattice lemma, which is a key tool in the powerful delta-system method.
arXiv Detail & Related papers (2025-04-21T17:46:21Z) - Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization [77.3396841985172]
We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems.
Our contribution is to design TTGDA algorithms that are effective beyond the convex-concave setting.
arXiv Detail & Related papers (2024-08-21T20:14:54Z) - Accelerated Variance-Reduced Forward-Reflected Methods for Root-Finding Problems [8.0153031008486]
We propose a novel class of Nesterov's accelerated forward-reflected-based methods with variance reduction to solve root-finding problems.
Our algorithm is single-loop and leverages a new family of unbiased variance-reduced estimators specifically designed for root-finding problems.
arXiv Detail & Related papers (2024-06-04T15:23:29Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Nearly Optimal Regret for Decentralized Online Convex Optimization [53.433398074919]
Decentralized online convex optimization (D-OCO) aims to minimize a sequence of global loss functions using only local computations and communications.
We develop novel D-OCO algorithms that can respectively reduce the regret bounds for convex and strongly convex functions.
Our algorithms are nearly optimal in terms of $T$, $n$, and $\rho$.
arXiv Detail & Related papers (2024-02-14T13:44:16Z) - An Optimal Stochastic Algorithm for Decentralized Nonconvex Finite-sum
Optimization [25.21457349137344]
We prove that DEAREST requires at most $\mathcal{O}(m+\sqrt{mn}L\varepsilon^{-2})$ incremental first-order oracle (IFO) calls and $\mathcal{O}(L\varepsilon^{-2}/\sqrt{1-\lambda_W})$ communication rounds.
arXiv Detail & Related papers (2022-10-25T11:37:11Z) - Neural Network Approximations of PDEs Beyond Linearity: A
Representational Perspective [40.964402478629495]
We take a step towards studying the representational power of neural networks for approximating solutions to nonlinear PDEs.
Treating a class of PDEs known as \emph{nonlinear elliptic variational PDEs}, our results show neural networks can evade the curse of dimensionality.
arXiv Detail & Related papers (2022-10-21T16:53:18Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\,\mathrm{opt} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - DASHA: Distributed Nonconvex Optimization with Communication
Compression, Optimal Oracle Complexity, and No Client Synchronization [77.34726150561087]
We develop and analyze DASHA: a new family of methods for nonconvex distributed optimization problems.
Unlike MARINA, the new methods DASHA and DASHA-MVR send compressed vectors only and never synchronize the nodes, which makes them more practical for federated learning.
arXiv Detail & Related papers (2022-02-02T20:10:40Z) - Deep Network Approximation for Smooth Functions [9.305095040004156]
We show that deep ReLU networks of width $\mathcal{O}(N\ln N)$ and depth $\mathcal{O}(L\ln L)$ can approximate $f\in C^s([0,1]^d)$ with a nearly optimal approximation error.
Our estimate is non-asymptotic in the sense that it is valid for arbitrary width and depth specified by $N\in\mathbb{N}^+$ and $L\in\mathbb{N}^+$, respectively.
arXiv Detail & Related papers (2020-01-09T15:06:10Z) - On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems [86.92205445270427]
We consider solving nonconvex-concave minimax problems, $\min_{\mathbf{x}}\max_{\mathbf{y}\in\mathcal{Y}} f(\mathbf{x},\mathbf{y})$, efficiently.
arXiv Detail & Related papers (2019-06-02T03:03:45Z)