Nonmyopic Gaussian Process Optimization with Macro-Actions
- URL: http://arxiv.org/abs/2002.09670v1
- Date: Sat, 22 Feb 2020 09:56:20 GMT
- Title: Nonmyopic Gaussian Process Optimization with Macro-Actions
- Authors: Dmitrii Kharkovskii, Chun Kai Ling, Kian Hsiang Low
- Abstract summary: This paper presents a multi-staged approach to nonmyopic adaptive Gaussian process optimization (GPO)
It exploits the notion of macro-actions to scale up to a longer lookahead that matches a larger available budget.
We empirically evaluate the performance of our epsilon-Macro-GPO policy and its anytime variant in BO with synthetic and real-world datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a multi-staged approach to nonmyopic adaptive Gaussian
process optimization (GPO) for Bayesian optimization (BO) of unknown, highly
complex objective functions that, in contrast to existing nonmyopic adaptive BO
algorithms, exploits the notion of macro-actions for scaling up to a further
lookahead to match up to a larger available budget. To achieve this, we
generalize GP upper confidence bound to a new acquisition function defined
w.r.t. a nonmyopic adaptive macro-action policy, which is intractable to
optimize exactly due to an uncountable set of candidate outputs. The
contribution of our work here is thus to derive a nonmyopic adaptive
epsilon-Bayes-optimal macro-action GPO (epsilon-Macro-GPO) policy. To perform
nonmyopic adaptive BO in real time, we then propose an asymptotically optimal
anytime variant of our epsilon-Macro-GPO policy with a performance guarantee.
We empirically evaluate the performance of our epsilon-Macro-GPO policy and its
anytime variant in BO with synthetic and real-world datasets.
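The abstract generalizes the GP upper confidence bound (GP-UCB) to macro-action policies. As background, here is a minimal sketch of standard, myopic GP-UCB on a 1-D grid; the RBF kernel, hyperparameters, and toy objective below are illustrative assumptions, not the paper's epsilon-Macro-GPO policy.

```python
import numpy as np

# Minimal (myopic) GP-UCB sketch on a 1-D candidate grid.
# All names, kernels, and hyperparameters are illustrative.

def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    # Squared-exponential kernel between 1-D input arrays a and b.
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    # Standard GP posterior mean and variance via Cholesky factorization.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mu, np.maximum(var, 0.0)

def gp_ucb_next(x_train, y_train, candidates, beta=2.0):
    # One myopic step: pick the candidate maximizing mu(x) + sqrt(beta)*sigma(x).
    mu, var = gp_posterior(x_train, y_train, candidates)
    return candidates[np.argmax(mu + np.sqrt(beta) * np.sqrt(var))]

f = lambda x: np.sin(3 * x)            # toy objective to optimize
x_train = np.array([0.1, 0.9])
y_train = f(x_train)
cands = np.linspace(0.0, 1.0, 101)
x_next = gp_ucb_next(x_train, y_train, cands, beta=2.0)
print(x_next)
```

The paper's contribution is replacing this one-step lookahead with a nonmyopic acquisition defined over sequences of macro-actions.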
Related papers
- Generalized Preference Optimization: A Unified Approach to Offline Alignment [54.97015778517253]
We propose generalized preference optimization (GPO), a family of offline losses parameterized by a general class of convex functions.
GPO enables a unified view over preference optimization, encompassing existing algorithms such as DPO, IPO and SLiC as special cases.
Our results present new algorithmic toolkits and empirical insights to alignment practitioners.
arXiv Detail & Related papers (2024-02-08T15:33:09Z) - Towards Efficient Exact Optimization of Language Model Alignment [93.39181634597877]
Direct preference optimization (DPO) was proposed to directly optimize the policy from preference data.
We show that DPO derived based on the optimal solution of problem leads to a compromised mean-seeking approximation of the optimal solution in practice.
We propose efficient exact optimization (EXO) of the alignment objective.
arXiv Detail & Related papers (2024-02-01T18:51:54Z) - Wasserstein Gradient Flows for Optimizing Gaussian Mixture Policies [0.0]
Policy optimization is the de facto paradigm to adapt robot policies as a function of task-specific objectives.
We propose to leverage the structure of probabilistic policies by casting the policy optimization as an optimal transport problem.
We evaluate our approach on common robotic settings: reaching motions, collision-avoidance behaviors, and multi-goal tasks.
arXiv Detail & Related papers (2023-05-17T17:48:24Z) - Bayesian Optimization for Macro Placement [48.55456716632735]
We develop a novel approach to macro placement using Bayesian optimization (BO) over sequence pairs.
BO is a machine learning technique that uses a probabilistic surrogate model and an acquisition function.
We demonstrate our algorithm on the fixed-outline macro placement problem with the half-perimeter wire length objective.
arXiv Detail & Related papers (2022-07-18T06:17:06Z) - Surrogate modeling for Bayesian optimization beyond a single Gaussian process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z) - Bayesian Optimization of Risk Measures [7.799648230758491]
We consider Bayesian optimization of objective functions of the form $\rho[F(x, W)]$, where $F$ is a black-box expensive-to-evaluate function.
We propose a family of novel Bayesian optimization algorithms that exploit the structure of the objective function to substantially improve sampling efficiency.
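The objective $\rho[F(x, W)]$ applies a risk measure $\rho$ over the environmental randomness $W$. As one concrete instance, here is a Monte-Carlo sketch estimating CVaR (mean of the worst alpha-fraction of losses) of a toy $F$; the function names and the toy model are illustrative assumptions, not the paper's algorithms.

```python
import numpy as np

# Monte-Carlo estimate of rho[F(x, W)] where rho is CVaR at level alpha.
# Everything here is an illustrative sketch with a toy F(x, W).

def cvar(samples, alpha=0.1):
    # CVaR of losses: mean of the largest alpha-fraction of samples.
    k = max(1, int(np.ceil(alpha * len(samples))))
    return np.mean(np.sort(samples)[-k:])

def rho_objective(x, n_samples=10_000, alpha=0.1, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n_samples)            # environmental randomness W
    losses = (x - 1.0) ** 2 + 0.5 * x * w     # toy black-box F(x, W)
    return cvar(losses, alpha)

val = rho_objective(0.5)
print(val)
```

A BO algorithm for risk measures would treat `rho_objective` as the expensive objective; the cited paper exploits the inner structure of $\rho[F(x, W)]$ rather than sampling it naively as above.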
arXiv Detail & Related papers (2020-07-10T18:20:46Z) - BOSH: Bayesian Optimization by Sampling Hierarchically [10.10241176664951]
We propose a novel BO routine pairing a hierarchical Gaussian process with an information-theoretic framework to generate a growing pool of realizations.
We demonstrate that BOSH provides more efficient and higher-precision optimization than standard BO across synthetic benchmarks, simulation optimization, reinforcement learning and hyperparameter tuning tasks.
arXiv Detail & Related papers (2020-07-02T07:35:49Z) - Likelihood-Free Inference with Deep Gaussian Processes [70.74203794847344]
Surrogate models have been successfully used in likelihood-free inference to decrease the number of simulator evaluations.
We propose a Deep Gaussian Process (DGP) surrogate model that can handle more irregularly behaved target distributions.
Our experiments show how DGPs can outperform GPs on objective functions with multimodal distributions and maintain a comparable performance in unimodal cases.
arXiv Detail & Related papers (2020-06-18T14:24:05Z) - Provably Efficient Exploration in Policy Optimization [117.09887790160406]
This paper proposes an Optimistic variant of the Proximal Policy Optimization algorithm (OPPO)
OPPO achieves $\tilde{O}(\sqrt{d^2 H^3 T})$ regret.
To the best of our knowledge, OPPO is the first provably efficient policy optimization algorithm that explores.
arXiv Detail & Related papers (2019-12-12T08:40:02Z)