End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes
- URL: http://arxiv.org/abs/2305.15930v4
- Date: Fri, 22 Dec 2023 09:38:26 GMT
- Title: End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes
- Authors: Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Haitham Bou Ammar
- Abstract summary: This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
- Score: 52.818579746354665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta-Bayesian optimisation (meta-BO) aims to improve the sample efficiency of
Bayesian optimisation by leveraging data from related tasks. While previous
methods successfully meta-learn either a surrogate model or an acquisition
function independently, joint training of both components remains an open
challenge. This paper proposes the first end-to-end differentiable meta-BO
framework that generalises neural processes to learn acquisition functions via
transformer architectures. We enable this end-to-end framework with
reinforcement learning (RL) to tackle the lack of labelled acquisition data.
Early on, we notice that training transformer-based neural processes from
scratch with RL is challenging due to insufficient supervision, especially when
rewards are sparse. We formalise this claim with a combinatorial analysis
showing that the widely used notion of regret as a reward signal exhibits a
logarithmic sparsity pattern in trajectory lengths. To tackle this problem, we
augment the RL objective with an auxiliary task that guides part of the
architecture to learn a valid probabilistic model as an inductive bias. We
demonstrate that our method achieves state-of-the-art regret results against
various baselines in experiments on standard hyperparameter optimisation tasks
and also outperforms others in the real-world problems of mixed-integer
programming tuning, antibody design, and logic synthesis for electronic design
automation.
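As a rough sketch of the objective described above (the weighting, tensor shapes, and loss composition are our assumptions, not the authors' released code), the policy-gradient loss over query selections can be augmented with an auxiliary Gaussian negative log-likelihood that keeps part of the transformer a valid probabilistic surrogate:

```python
# Hedged sketch: REINFORCE-style loss on (possibly sparse) regret-based
# returns, plus an auxiliary NLL term supervising a probabilistic head.
import torch

def combined_loss(log_probs, returns, pred_mean, pred_std, targets, aux_weight=1.0):
    # log_probs: log pi(a_t | s_t) of the chosen query points, shape (T,)
    # returns:   regret-based returns per step, shape (T,); sparse rewards
    #            make this signal weak, which the auxiliary term offsets
    # pred_mean, pred_std: Gaussian predictions of the surrogate head
    # targets:   observed function values at the queried points
    rl_loss = -(log_probs * returns).sum()
    nll = torch.nn.GaussianNLLLoss()
    aux_loss = nll(pred_mean, targets, pred_std ** 2)   # auxiliary probabilistic task
    return rl_loss + aux_weight * aux_loss

# Toy shapes only, to show the call signature.
T = 8
loss = combined_loss(torch.randn(T), torch.randn(T),
                     torch.randn(T), torch.rand(T) + 0.1, torch.randn(T))
print(float(loss))
```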
Related papers
- Reinforcement Learning as an Improvement Heuristic for Real-World Production Scheduling [0.0]
One promising approach is to train an RL agent as an improvement heuristic: starting from a suboptimal solution, the agent iteratively improves it by applying small changes.
We apply this approach to a real-world multiobjective production scheduling problem.
We benchmarked our approach against alternatives using real data from our industry partner, demonstrating its superior performance.
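As a minimal illustration of this improvement-heuristic loop (the move operator, cost function, and greedy acceptance rule below are stand-ins, not the paper's trained agent or scheduling environment):

```python
# Hedged sketch: iterative improvement from a feasible starting solution,
# keeping the best solution seen. A trained RL agent would replace the
# greedy acceptance rule with a learned move-selection policy.
import random

def improve(solution, cost, propose_move, steps=1000):
    best, best_cost = solution, cost(solution)
    current, current_cost = best, best_cost
    for _ in range(steps):
        candidate = propose_move(current)   # small local change, e.g. swap two jobs
        c = cost(candidate)
        if c <= current_cost:
            current, current_cost = candidate, c
        if c < best_cost:
            best, best_cost = candidate, c
    return best

# Toy usage: reorder a job list toward sorted order (stand-in for a schedule).
jobs = list(range(10)); random.shuffle(jobs)
cost = lambda s: sum(abs(v - i) for i, v in enumerate(s))
def swap(s):
    s = s[:]; i, j = random.sample(range(len(s)), 2); s[i], s[j] = s[j], s[i]; return s
print(improve(jobs, cost, swap))
```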
arXiv Detail & Related papers (2024-09-18T12:48:56Z)
- Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts [33.58165081033569]
We introduce Sparse MetA-Tuning (SMAT), a method inspired by sparse mixture-of-experts approaches.
SMAT successfully overcomes OOD sensitivity and delivers on the promise of enhancing the transfer abilities of vision foundation models.
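For readers unfamiliar with the underlying mechanism, a minimal sparse top-k mixture-of-experts layer looks roughly like this (shapes, routing, and expert form are illustrative assumptions, not SMAT's implementation):

```python
# Hedged sketch: a router scores experts per input, keeps only the top-k,
# and mixes the selected experts' outputs with softmax-normalised weights.
import torch, torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, dim, n_experts=4, k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.router = nn.Linear(dim, n_experts)
        self.k = k

    def forward(self, x):                      # x: (batch, dim)
        scores = self.router(x)                # (batch, n_experts)
        w, idx = scores.topk(self.k, dim=-1)   # keep only k experts per input
        w = w.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += w[mask, slot, None] * expert(x[mask])
        return out

print(TopKMoE(8)(torch.randn(3, 8)).shape)     # torch.Size([3, 8])
```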
arXiv Detail & Related papers (2024-03-13T12:46:03Z)
- Uncovering mesa-optimization algorithms in Transformers [61.06055590704677]
Some autoregressive models can learn as an input sequence is processed, without undergoing any parameter changes, and without being explicitly trained to do so.
We show that standard next-token prediction error minimization gives rise to a subsidiary learning algorithm that adjusts the model as new inputs are revealed.
Our findings explain in-context learning as a product of autoregressive loss minimization and inform the design of new optimization-based Transformer layers.
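A standard toy construction from this line of work (our illustration, not the paper's code) shows the flavour of such a subsidiary learner: one gradient-descent step on in-context linear-regression examples can be rewritten as an attention-like weighted sum over context tokens:

```python
# Hedged sketch: one GD step on squared loss, starting from zero weights,
# equals an inner-product-weighted sum over context targets.
import numpy as np

rng = np.random.default_rng(0)
w_true = rng.normal(size=3)
X = rng.normal(size=(16, 3))                  # in-context inputs
y = X @ w_true                                # in-context targets
x_q = rng.normal(size=3)                      # query input

lr = 0.1
w = np.zeros(3)                               # implicit weights start at zero
w_step = w - lr * (X.T @ (X @ w - y)) / len(X)    # one GD step on squared loss

# Same prediction written attention-style: sum_i (lr/N) * y_i * <x_i, x_q>.
pred_gd = x_q @ w_step
pred_attn = (lr / len(X)) * np.sum(y * (X @ x_q))
print(np.isclose(pred_gd, pred_attn))         # True
```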
arXiv Detail & Related papers (2023-09-11T22:42:50Z)
- Transformer-based Planning for Symbolic Regression [18.90700817248397]
We propose TPSR, a Transformer-based Planning strategy for Symbolic Regression.
Unlike conventional decoding strategies, TPSR enables the integration of non-differentiable feedback, such as fitting accuracy and complexity.
Our approach outperforms state-of-the-art methods, enhancing the model's fitting-complexity trade-off, symbolic abilities, and robustness to noise.
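A sketch of the kind of non-differentiable feedback such a planner can optimise (the exact form and weighting here are our assumptions): a reward that trades fitting accuracy against expression length:

```python
# Hedged sketch: higher is better; an accuracy term derived from
# normalised MSE minus a penalty on candidate-equation length.
import numpy as np

def reward(y_true, y_pred, n_tokens, lam=0.1):
    nmse = np.mean((y_true - y_pred) ** 2) / (np.var(y_true) + 1e-12)
    accuracy = 1.0 / (1.0 + nmse)      # in (0, 1]; no gradient needed by a planner
    return accuracy - lam * n_tokens   # penalise longer expressions

y = np.array([1.0, 2.0, 3.0])
print(reward(y, y + 0.1, n_tokens=5))
```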
arXiv Detail & Related papers (2023-03-13T03:29:58Z)
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
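A minimal sketch of the gradient re-scaling idea as we read it (the scale factor, optimiser, and loop structure are assumptions, not the paper's procedure):

```python
# Hedged sketch: when fitting a neural field to a new signal at
# meta-test time, per-step gradients are re-scaled before the update.
import torch

def test_time_fit(model, coords, signal, steps=100, lr=1e-2, scale=10.0):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(coords), signal)
        loss.backward()
        for p in model.parameters():
            if p.grad is not None:
                p.grad.mul_(scale)     # re-scale gradients at meta-test time
        opt.step()
    return model

# Toy usage: fit a small MLP field from 2D coordinates to RGB values.
net = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.ReLU(), torch.nn.Linear(64, 3))
test_time_fit(net, torch.rand(256, 2), torch.rand(256, 3), steps=10)
```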
arXiv Detail & Related papers (2023-02-01T17:32:16Z)
- MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by meta-learning the score function of the data-generating process.
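As background on the core ingredient (generic denoising score matching, not MARS's functional-space formulation), a score network is trained to match the score of a noise-perturbed data distribution:

```python
# Hedged sketch: denoising score-matching loss; the regression target
# -noise / sigma^2 is the score of the Gaussian perturbation kernel.
import torch

def dsm_loss(score_net, x, sigma=0.1):
    noise = torch.randn_like(x) * sigma
    x_noisy = x + noise
    target = -noise / sigma ** 2
    return ((score_net(x_noisy) - target) ** 2).mean()

score_net = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(),
                                torch.nn.Linear(64, 2))
print(dsm_loss(score_net, torch.randn(128, 2)).item())
```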
arXiv Detail & Related papers (2022-10-24T15:14:26Z)
- A Hybrid Framework for Sequential Data Prediction with End-to-End Optimization [0.0]
We investigate nonlinear prediction in an online setting and introduce a hybrid model that effectively mitigates hand-designed features and manual model selection issues.
We employ a recurrent neural network (LSTM) for adaptive feature extraction from sequential data and a gradient boosting machinery (soft GBDT) for effective supervised regression.
We demonstrate the learning behavior of our algorithm on synthetic data and show significant performance improvements over conventional methods on various real-life datasets.
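The pipeline shape, roughly (the paper's soft GBDT is differentiable and trained end-to-end with the LSTM; scikit-learn's standard GBDT below is a non-end-to-end stand-in):

```python
# Hedged sketch: an LSTM extracts features from sequence windows, and a
# gradient boosting regressor performs the supervised regression on them.
import numpy as np, torch
from sklearn.ensemble import GradientBoostingRegressor

seq = torch.randn(200, 10, 1)                   # 200 windows of length 10
target = seq.squeeze(-1).mean(dim=1).numpy()    # toy regression target

lstm = torch.nn.LSTM(input_size=1, hidden_size=8, batch_first=True)
with torch.no_grad():
    _, (h, _) = lstm(seq)                       # adaptive feature extraction
features = h.squeeze(0).numpy()                 # (200, 8) learned features

gbdt = GradientBoostingRegressor().fit(features[:150], target[:150])
print(gbdt.score(features[150:], target[150:])) # held-out R^2
```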
arXiv Detail & Related papers (2022-03-25T17:13:08Z)
- An Optimization-Based Meta-Learning Model for MRI Reconstruction with Diverse Dataset [4.9259403018534496]
We develop a generalizable MRI reconstruction model in the meta-learning framework.
The proposed network learns the regularization function within a learner-adaptive model.
We test quick training on unseen tasks after meta-training and find that it saves half of the training time.
arXiv Detail & Related papers (2021-10-02T03:21:52Z)
- Machine Learning Framework for Quantum Sampling of Highly-Constrained, Continuous Optimization Problems [101.18253437732933]
We develop a generic, machine learning-based framework for mapping continuous-space inverse design problems into surrogate unconstrained binary optimization problems.
We showcase the framework's performance on two inverse design problems by optimizing (i) thermal emitter topologies for thermophotovoltaic applications and (ii) diffractive meta-gratings for highly efficient beam steering.
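A minimal sketch of the continuous-to-binary mapping such a framework needs (the fixed-point encoding below is our illustrative assumption, not the paper's surrogate construction):

```python
# Hedged sketch: encode a continuous design variable in [lo, hi] as a bit
# vector so a binary optimiser (e.g. a quantum annealer) can search over it.
import numpy as np

def decode(bits, lo, hi):
    """Interpret a bit vector as a fixed-point value in [lo, hi]."""
    weights = 2.0 ** -np.arange(1, len(bits) + 1)   # 1/2, 1/4, 1/8, ...
    frac = float(np.dot(bits, weights))             # in [0, 1)
    return lo + frac * (hi - lo)

bits = np.array([1, 0, 1, 1])          # candidate produced by a binary optimiser
print(decode(bits, lo=0.0, hi=10.0))   # 6.875
```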
arXiv Detail & Related papers (2021-05-06T02:22:23Z)
- Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
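A minimal sketch of analytic adaptation with a fixed kernel (an RBF stand-in for the meta-model's NTK; the regularisation and kernel choice are assumptions): fitting a task's support set reduces to closed-form kernel ridge regression, with no iterative inner loop:

```python
# Hedged sketch: closed-form adaptation f(x) = k(x, X)(K + lam*I)^{-1} y,
# replacing MAML-style inner-loop gradient steps.
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

def adapt_and_predict(X_sup, y_sup, X_qry, lam=1e-3):
    K = rbf(X_sup, X_sup)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_sup)), y_sup)
    return rbf(X_qry, X_sup) @ alpha

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(20, 1))
print(adapt_and_predict(X, np.sin(3 * X[:, 0]), np.array([[0.5]])))
```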
arXiv Detail & Related papers (2021-02-07T20:53:23Z)