Thompson Sampling in Function Spaces via Neural Operators
- URL: http://arxiv.org/abs/2506.21894v1
- Date: Fri, 27 Jun 2025 04:21:57 GMT
- Title: Thompson Sampling in Function Spaces via Neural Operators
- Authors: Rafael Oliveira, Xuesong Wang, Kian Ming A. Chai, Edwin V. Bonilla
- Abstract summary: We propose an extension of Thompson sampling to optimization problems over function spaces where the objective is a known functional of an unknown operator's output. Our algorithm employs a sample-then-optimize approach using neural operator surrogates.
- Score: 14.0301500809197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an extension of Thompson sampling to optimization problems over function spaces where the objective is a known functional of an unknown operator's output. We assume that functional evaluations are inexpensive, while queries to the operator (such as running a high-fidelity simulator) are costly. Our algorithm employs a sample-then-optimize approach using neural operator surrogates. This strategy avoids explicit uncertainty quantification by treating trained neural operators as approximate samples from a Gaussian process. We provide novel theoretical convergence guarantees, based on Gaussian processes in the infinite-dimensional setting, under minimal assumptions. We benchmark our method against existing baselines on functional optimization tasks involving partial differential equations and other nonlinear operator-driven phenomena, demonstrating improved sample efficiency and competitive performance.
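The sample-then-optimize loop described above can be sketched concretely. The snippet below is a minimal illustration, not the authors' implementation: the toy operator, the small MLP standing in for a neural operator, the parametrization of candidate input functions, and all hyperparameters are assumptions made purely for exposition. Each round trains a surrogate from a fresh random initialization on the data gathered so far, treats the trained network as an approximate posterior draw, and maximizes the known (cheap) functional of its prediction to select the next costly operator query.

```python
# Minimal sketch of sample-then-optimize Thompson sampling with a neural
# operator surrogate. Toy operator, architecture, and hyperparameters are
# illustrative assumptions, not the authors' code.
import torch

torch.manual_seed(0)
GRID = torch.linspace(0.0, 1.0, 64)              # discretization of the domain

def true_operator(a):
    """Costly black-box operator mapping an input function a(x) to u(x).
    A toy stand-in for e.g. a high-fidelity PDE solver."""
    return torch.sin(3.0 * a) + 0.1 * a ** 2

def functional(u):
    """Known, inexpensive functional J(u) to be maximized."""
    return u.mean()

class NeuralOperatorSurrogate(torch.nn.Module):
    """Tiny MLP standing in for a neural operator on discretized functions."""
    def __init__(self, n=64, width=128):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(n, width), torch.nn.GELU(),
            torch.nn.Linear(width, width), torch.nn.GELU(),
            torch.nn.Linear(width, n),
        )
    def forward(self, a):
        return self.net(a)

def train_sample(inputs, outputs, epochs=500):
    """Train from a fresh random initialization; the trained network is
    treated as one approximate posterior sample (sample-then-optimize)."""
    model = NeuralOperatorSurrogate()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    X, Y = torch.stack(inputs), torch.stack(outputs)
    for _ in range(epochs):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(X), Y)
        loss.backward()
        opt.step()
    return model

def propose_query(model, steps=200):
    """Maximize the known functional of the sampled surrogate's prediction
    over a simple parametrized family of candidate input functions."""
    theta = torch.randn(3, requires_grad=True)
    opt = torch.optim.Adam([theta], lr=5e-2)
    for _ in range(steps):
        a = theta[0] + theta[1] * GRID + theta[2] * torch.sin(torch.pi * GRID)
        loss = -functional(model(a))             # maximize J(G_sample(a))
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return theta[0] + theta[1] * GRID + theta[2] * torch.sin(torch.pi * GRID)

# Thompson-sampling loop: one costly operator query per round.
inputs = [torch.rand(64)]
outputs = [true_operator(inputs[0])]
for t in range(10):
    sample = train_sample(inputs, outputs)       # approximate posterior draw
    a_next = propose_query(sample)               # optimize functional of the draw
    inputs.append(a_next)
    outputs.append(true_operator(a_next))        # costly query
    print(f"round {t}: J(u) = {functional(outputs[-1]).item():.4f}")
```

The Thompson-sampling ingredient here is that each round's surrogate is an independently initialized, freshly trained network, so maximizing its predicted functional value supplies the exploration that an explicit posterior would otherwise provide.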
Related papers
- Explicit and Implicit Graduated Optimization in Deep Neural Networks [0.6906005491572401]
This paper experimentally evaluates the performance of an explicit graduated optimization algorithm with an optimal noise scheduling. In addition, it demonstrates its effectiveness through experiments on image classification tasks with ResNet architectures.
arXiv Detail & Related papers (2024-12-16T07:23:22Z) - Sample-efficient Bayesian Optimisation Using Known Invariances [56.34916328814857]
We show that vanilla and constrained BO algorithms are inefficient when optimising invariant objectives.
We derive a bound on the maximum information gain of these invariant kernels.
We use our method to design a current drive system for a nuclear fusion reactor, finding a high-performance solution.
arXiv Detail & Related papers (2024-10-22T12:51:46Z) - Operator Learning Using Random Features: A Tool for Scientific Computing [3.745868534225104]
Supervised operator learning centers on the use of training data to estimate maps between infinite-dimensional spaces.
This paper introduces the function-valued random features method.
It leads to a supervised operator learning architecture that is practical for nonlinear problems.
arXiv Detail & Related papers (2024-08-12T23:10:39Z) - Linearization Turns Neural Operators into Function-Valued Gaussian Processes [23.85470417458593]
We introduce LUNO, a novel framework for approximate Bayesian uncertainty quantification in trained neural operators. Our approach leverages model linearization to push (Gaussian) weight-space uncertainty forward to the neural operator's predictions. We show that this can be interpreted as a probabilistic version of the concept of currying from functional programming, yielding a function-valued (Gaussian) random process belief.
arXiv Detail & Related papers (2024-06-07T16:43:54Z) - Composite Bayesian Optimization In Function Spaces Using NEON -- Neural Epistemic Operator Networks [4.1764890353794994]
NEON is an architecture for generating predictions with uncertainty using a single operator network backbone.
We show that NEON achieves state-of-the-art performance while requiring orders of magnitude fewer trainable parameters.
arXiv Detail & Related papers (2024-04-03T22:42:37Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z) - Surrogate modeling for Bayesian optimization beyond a single Gaussian
process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, a random-feature-based kernel approximation is leveraged per GP model (a sketch of this sampling trick appears after this list).
To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z) - Sequential Subspace Search for Functional Bayesian Optimization Incorporating Experimenter Intuition [63.011641517977644]
Our algorithm generates a sequence of finite-dimensional random subspaces of functional space spanned by a set of draws from the experimenter's Gaussian Process.
Standard Bayesian optimisation is applied on each subspace, and the best solution found is used as a starting point (origin) for the next subspace.
We test our algorithm in simulated and real-world experiments, namely blind function matching, finding the optimal precipitation-strengthening function for an aluminium alloy, and learning rate schedule optimisation for deep networks.
arXiv Detail & Related papers (2020-09-08T06:54:11Z) - Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation with Gaussian processes trained on a few data points.
The approach also leads to significantly smaller and computationally cheaper subproblems for lower bounding.
In total, the proposed method reduces the time to convergence by orders of magnitude.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)
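Several entries above, including the EGP-TS surrogate, reduce Thompson sampling to drawing approximate function samples from a GP posterior. The sketch referenced in the EGP-TS summary is given below: a random Fourier feature approximation of an RBF kernel turns posterior function sampling into sampling the weights of a Bayesian linear model. The kernel choice, feature count, noise level, and toy objective are illustrative assumptions, not taken from any of the papers listed.

```python
# Minimal sketch of random-feature (RFF) posterior function sampling for
# Thompson sampling with an RBF-kernel GP; hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
D, M, sigma, lengthscale = 1, 200, 0.1, 0.2   # input dim, #features, noise std, lengthscale

# Random Fourier features approximating the RBF kernel: k(x, x') ~ phi(x) @ phi(x')
W = rng.normal(0.0, 1.0 / lengthscale, size=(M, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=M)

def phi(X):
    return np.sqrt(2.0 / M) * np.cos(X @ W.T + b)          # (N, M) feature map

def sample_posterior_function(X, y):
    """Draw one approximate GP posterior sample as a Bayesian linear model in
    feature space: f(x) = phi(x) @ w with w ~ N(mean, A^{-1})."""
    P = phi(X)
    A = P.T @ P / sigma**2 + np.eye(M)                      # posterior precision
    mean = np.linalg.solve(A, P.T @ y / sigma**2)
    L = np.linalg.cholesky(A)
    w = mean + np.linalg.solve(L.T, rng.normal(size=M))     # exact sample of w
    return lambda Xq: phi(Xq) @ w

# Thompson sampling on a toy 1-D objective (assumed purely for illustration).
f = lambda x: np.sin(6.0 * x) + 0.5 * x
X = rng.uniform(0.0, 1.0, size=(3, 1))
y = f(X).ravel() + sigma * rng.normal(size=3)
grid = np.linspace(0.0, 1.0, 500).reshape(-1, 1)
for t in range(10):
    f_sample = sample_posterior_function(X, y)
    x_next = grid[np.argmax(f_sample(grid))]                # maximize the drawn sample
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next).item() + sigma * rng.normal())
print("best observed value:", y.max())
```

Because each round maximizes a single coherent draw rather than a mean-plus-bonus acquisition, the loop explores exactly in proportion to the surrogate's remaining uncertainty, which is the same principle the main paper transfers to neural operator surrogates.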
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.