Related papers: Path Integral Optimiser: Global Optimisation via Neural Schrödinger-Föllmer Diffusion

Path Integral Optimiser: Global Optimisation via Neural Schrödinger-Föllmer Diffusion

URL: http://arxiv.org/abs/2506.06815v1
Date: Sat, 07 Jun 2025 14:46:18 GMT
Title: Path Integral Optimiser: Global Optimisation via Neural Schrödinger-Föllmer Diffusion
Authors: Max McGuinness, Eirik Fladmark, Francisco Vargas,
Abstract summary: We present an early investigation into the use of neural diffusion processes for global optimisation.<n>One can use the Boltzmann distribution to formulate optimization as solving a Schr"odinger bridge sampling problem.<n>We provide theoretical bounds for this optimiser, results on toy tasks, and a summary of the theory motivating the model.
Score: 1.8195082751200438
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present an early investigation into the use of neural diffusion processes for global optimisation, focusing on Zhang et al.'s Path Integral Sampler. One can use the Boltzmann distribution to formulate optimization as solving a Schr\"odinger bridge sampling problem, then apply Girsanov's theorem with a simple (single-point) prior to frame it in stochastic control terms, and compute the solution's integral terms via a neural approximation (a Fourier MLP). We provide theoretical bounds for this optimiser, results on toy optimisation tasks, and a summary of the stochastic theory motivating the model. Ultimately, we found the optimiser to display promising per-step performance at optimisation tasks between 2 and 1,247 dimensions, but struggle to explore higher-dimensional spaces when faced with a 15.9k parameter model, indicating a need for work on adaptation in such environments.

Related papers

Faster Sampling via Stochastic Gradient Proximal Sampler [28.422547264326468]
Proximal samplers (SPS) for sampling from non-log-concave distributions are studied. We show that the convergence to the target distribution can be guaranteed as long as the algorithm trajectory is bounded. We provide two implementable variants based on Langevin dynamics (SGLD) and Langevin-MALA, giving rise to SPS-SGLD and SPS-MALA.
arXiv Detail & Related papers (2024-05-27T00:53:18Z)
Proximal Oracles for Optimization and Sampling [18.77973093341588]
We consider convex optimization with non-smooth objective function and log-concave sampling with non-smooth potential. To overcome the challenges caused by non-smoothness, our algorithms employ two powerful proximal frameworks in optimization and sampling.
arXiv Detail & Related papers (2024-04-02T18:52:28Z)
Implicit Diffusion: Efficient Optimization through Stochastic Sampling [46.049117719591635]
We present a new algorithm to optimize distributions defined implicitly by parameterized diffusions.<n>We introduce a general framework for first-order optimization of these processes, that performs jointly.<n>We apply it to training energy-based models and finetuning denoising diffusions.
arXiv Detail & Related papers (2024-02-08T08:00:11Z)
Stein Boltzmann Sampling: A Variational Approach for Global Optimization [6.584366906288068]
We present a flow-based method for global optimization of continuous Sobolev functions, called Stein Boltzmann Sampling (SBS)<n>A detailed comparison with state-of-the-art methods on benchmark functions demonstrates that SBS and its variants are highly competitive.
arXiv Detail & Related papers (2024-02-07T09:28:35Z)
Enhancing Gaussian Process Surrogates for Optimization and Posterior Approximation via Random Exploration [2.984929040246293]
novel noise-free Bayesian optimization strategies that rely on a random exploration step to enhance the accuracy of Gaussian process surrogate models. New algorithms retain the ease of implementation of the classical GP-UCB, but an additional exploration step facilitates their convergence.
arXiv Detail & Related papers (2024-01-30T14:16:06Z)
Sample Complexity for Quadratic Bandits: Hessian Dependent Bounds and Optimal Algorithms [64.10576998630981]
We show the first tight characterization of the optimal Hessian-dependent sample complexity. A Hessian-independent algorithm universally achieves the optimal sample complexities for all Hessian instances. The optimal sample complexities achieved by our algorithm remain valid for heavy-tailed noise distributions.
arXiv Detail & Related papers (2023-06-21T17:03:22Z)
Surrogate modeling for Bayesian optimization beyond a single Gaussian process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space. To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model. To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z)
Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging. We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z)
High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability. Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level. We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
Stochastic Learning Approach to Binary Optimization for Optimal Design of Experiments [0.0]
We present a novel approach to binary optimization for optimal experimental design (OED) for Bayesian inverse problems governed by mathematical models such as partial differential equations. The OED utility function, namely, the regularized optimality gradient, is cast into an objective function in the form of an expectation over a Bernoulli distribution. The objective is then solved by using a probabilistic optimization routine to find an optimal observational policy.
arXiv Detail & Related papers (2021-01-15T03:54:12Z)
Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope. Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck. We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian. We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.