Related papers: New universal operator approximation theorem for encoder-decoder architectures (Preprint)

URL: http://arxiv.org/abs/2503.24092v1
Date: Mon, 31 Mar 2025 13:43:21 GMT
Title: New universal operator approximation theorem for encoder-decoder architectures (Preprint)
Authors: Janek Gödeke, Pascal Fernsel
Abstract summary: We present a novel universal operator approximation theorem for a broad class of encoder-decoder architectures. In this study, we focus on approximating continuous operators in $\mathcal{C}(\mathcal{X}, \mathcal{Y})$, where $\mathcal{X}$ and $\mathcal{Y}$ are infinite-dimensional normed or metric spaces.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Motivated by the rapidly growing field of mathematics for operator approximation with neural networks, we present a novel universal operator approximation theorem for a broad class of encoder-decoder architectures. In this study, we focus on approximating continuous operators in $\mathcal{C}(\mathcal{X}, \mathcal{Y})$, where $\mathcal{X}$ and $\mathcal{Y}$ are infinite-dimensional normed or metric spaces, and we consider uniform convergence on compact subsets of $\mathcal{X}$. Unlike standard results in the operator learning literature, we investigate the case where the approximating operator sequence can be chosen independently of the compact sets. Taking a topological perspective, we analyze different types of operator approximation and show that compact-set-independent approximation is a strictly stronger property in most relevant operator learning frameworks. To establish our results, we introduce a new approximation property tailored to encoder-decoder architectures, which enables us to prove a universal operator approximation theorem ensuring uniform convergence on every compact subset. This result unifies and extends existing universal operator approximation theorems for various encoder-decoder architectures, including classical DeepONets, BasisONets, special cases of MIONets, architectures based on frames and other related approaches.
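The two notions of approximation contrasted in the abstract can be written out as follows. This is only a sketch of the usual formalization, not the paper's own statement: the symbols $G$ (target operator), $\mathcal{A}$ (class of admissible approximating operators), $d_{\mathcal{Y}}$ (metric on $\mathcal{Y}$), $G_{K,\varepsilon}$ and $G_n$ are assumed notation that does not appear in the abstract.

```latex
% Assumed notation: G \in \mathcal{C}(\mathcal{X}, \mathcal{Y}) is the target operator,
% \mathcal{A} is the class of approximating operators, d_{\mathcal{Y}} is the metric on \mathcal{Y}.

% (1) Compact-set-dependent approximation: the approximant may depend on K and \varepsilon.
\forall K \subset \mathcal{X} \text{ compact},\ \forall \varepsilon > 0,\
\exists\, G_{K,\varepsilon} \in \mathcal{A}:
\quad \sup_{x \in K} d_{\mathcal{Y}}\bigl(G(x),\, G_{K,\varepsilon}(x)\bigr) < \varepsilon .

% (2) Compact-set-independent approximation: one sequence works for every compact set.
\exists\, (G_n)_{n \in \mathbb{N}} \subset \mathcal{A}:
\quad \forall K \subset \mathcal{X} \text{ compact},\quad
\lim_{n \to \infty} \sup_{x \in K} d_{\mathcal{Y}}\bigl(G(x),\, G_n(x)\bigr) = 0 .
```

Under this reading, (2) implies (1) by taking $G_{K,\varepsilon} = G_n$ for $n$ large enough, which is consistent with the abstract's claim that the compact-set-independent property is the stronger one.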

Related papers

Sequential-Parallel Duality in Prefix Scannable Models [68.39855814099997]
Recent developments have given rise to various models, such as Gated Linear Attention (GLA) and Mamba. This raises a natural question: can we characterize the full class of neural sequence models that support near-constant-time parallel evaluation and linear-time, constant-space sequential inference?
arXiv Detail & Related papers (2025-06-12T17:32:02Z)
Approximation Rates in Fréchet Metrics: Barron Spaces, Paley-Wiener Spaces, and Fourier Multipliers [1.4732811715354452]
We study some general approximation capabilities for linear differential operators by approximating the corresponding symbol in the Fourier domain. In that sense, we measure the approximation error in terms of a Fréchet metric. We then focus on a natural extension of our main theorem, in which we manage to reduce the assumptions on the sequence of semi-norms.
arXiv Detail & Related papers (2024-12-27T20:16:04Z)
Operator Learning of Lipschitz Operators: An Information-Theoretic Perspective [2.375038919274297]
This work addresses the complexity of neural operator approximations for the general class of Lipschitz continuous operators. Our main contribution establishes lower bounds on the metric entropy of Lipschitz operators in two approximation settings. It is shown that, regardless of the activation function used, neural operator architectures attaining an approximation accuracy $\epsilon$ must have a size that is exponentially large in $\epsilon^{-1}$.
arXiv Detail & Related papers (2024-06-26T23:36:46Z)
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation [53.17668583030862]
We study infinite-horizon average-reward Markov decision processes (AMDPs) in the context of general function approximation. We propose a novel algorithmic framework named Local-fitted Optimization with OPtimism (LOOP). We show that LOOP achieves a sublinear $\tilde{\mathcal{O}}(\mathrm{poly}(d, \mathrm{sp}(V^*)) \sqrt{T\beta})$ regret, where $d$ and $\beta$ correspond to AGEC and log-covering number of the hypothesis class respectively.
arXiv Detail & Related papers (2024-04-19T06:24:22Z)
Composite Bayesian Optimization In Function Spaces Using NEON -- Neural Epistemic Operator Networks [4.1764890353794994]
NEON is an architecture for generating predictions with uncertainty using a single operator network backbone. We show that NEON achieves state-of-the-art performance while requiring orders of magnitude fewer trainable parameters.
arXiv Detail & Related papers (2024-04-03T22:42:37Z)
The Parametric Complexity of Operator Learning [5.756283466216181]
The first contribution of this paper is to prove that for general classes of operators which are characterized only by their $C^r$- or Lipschitz-regularity, operator learning suffers from a "curse of parametric complexity". The second contribution of the paper is to prove that this general curse can be overcome for solution operators defined by the Hamilton-Jacobi equation. A novel neural operator architecture is introduced, termed HJ-Net, which explicitly takes into account characteristic information of the underlying Hamiltonian system.
arXiv Detail & Related papers (2023-06-28T05:02:03Z)
Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer [59.73454783958702]
We propose a symbolic reasoning architecture that chains many join operators together to model output logical expressions. In particular, we demonstrate that such an ensemble of join-chains can express a broad subset of "tree-structured" first-order logical expressions, named FOET. We find that the widely used multi-head self-attention module in transformers can be understood as a special neural operator that implements the union bound of the join operator in probabilistic predicate space.
arXiv Detail & Related papers (2022-10-06T07:39:58Z)
Semi-Supervised Manifold Learning with Complexity Decoupled Chart Autoencoders [45.29194877564103]
This work introduces a chart autoencoder with an asymmetric encoding-decoding process that can incorporate additional semi-supervised information such as class labels. We discuss the approximation power of such networks and derive a bound that essentially depends on the intrinsic dimension of the data manifold rather than the dimension of ambient space.
arXiv Detail & Related papers (2022-08-22T19:58:03Z)
On a class of geodesically convex optimization problems solved via Euclidean MM methods [50.428784381385164]
We show how a difference of Euclidean convexization functions can be written as a difference of different types of problems in statistics and machine learning.
arXiv Detail & Related papers (2022-06-22T23:57:40Z)
Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations. We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z)
Neural Operator: Learning Maps Between Function Spaces [75.93843876663128]
We propose a generalization of neural networks to learn operators, termed neural operators, that map between infinite dimensional function spaces. We prove a universal approximation theorem for our proposed neural operator, showing that it can approximate any given nonlinear continuous operator. An important application for neural operators is learning surrogate maps for the solution operators of partial differential equations.
arXiv Detail & Related papers (2021-08-19T03:56:49Z)
Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers [50.85524803885483]
This work proposes a formal definition of statistically meaningful (SM) approximation which requires the approximating network to exhibit good statistical learnability. We study SM approximation for two function classes: circuits and Turing machines.
arXiv Detail & Related papers (2021-07-28T04:28:55Z)
Universal Approximation Property of Neural Ordinary Differential Equations [19.861764482790544]
We show that NODEs can form an $L^p$-universal approximator for continuous maps under certain conditions. We also show their stronger approximation property, namely the $\sup$-universality for approximating a large class of diffeomorphisms.
arXiv Detail & Related papers (2020-12-04T05:53:21Z)
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning. In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the function. Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.