Nonmyopic Gaussian Process Optimization with Macro-Actions
- URL: http://arxiv.org/abs/2002.09670v1
- Date: Sat, 22 Feb 2020 09:56:20 GMT
- Title: Nonmyopic Gaussian Process Optimization with Macro-Actions
- Authors: Dmitrii Kharkovskii, Chun Kai Ling, Kian Hsiang Low
- Abstract summary: This paper presents a multi-staged approach to nonmyopic adaptive Gaussian process optimization (GPO)
It exploits the notion of macro-actions to scale up to a longer lookahead that matches a larger available budget.
We empirically evaluate the performance of our epsilon-Macro-GPO policy and its anytime variant in BO with synthetic and real-world datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a multi-staged approach to nonmyopic adaptive Gaussian
process optimization (GPO) for Bayesian optimization (BO) of unknown, highly
complex objective functions that, in contrast to existing nonmyopic adaptive BO
algorithms, exploits the notion of macro-actions for scaling up to a further
lookahead to match up to a larger available budget. To achieve this, we
generalize GP upper confidence bound to a new acquisition function defined
w.r.t. a nonmyopic adaptive macro-action policy, which is intractable to
optimize exactly due to an uncountable set of candidate outputs. The
contribution of our work here is thus to derive a nonmyopic adaptive
epsilon-Bayes-optimal macro-action GPO (epsilon-Macro-GPO) policy. To perform
nonmyopic adaptive BO in real time, we then propose an asymptotically optimal
anytime variant of our epsilon-Macro-GPO policy with a performance guarantee.
We empirically evaluate the performance of our epsilon-Macro-GPO policy and its
anytime variant in BO with synthetic and real-world datasets.
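The abstract generalizes the GP upper confidence bound (GP-UCB) to macro-action policies. As background, here is a minimal sketch of standard, myopic GP-UCB on a 1-D grid; the RBF kernel, hyperparameters, and toy objective below are illustrative assumptions, not the paper's epsilon-Macro-GPO policy.

```python
import numpy as np

# Minimal (myopic) GP-UCB sketch on a 1-D candidate grid.
# All names, kernels, and hyperparameters are illustrative.

def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    # Squared-exponential kernel between 1-D input arrays a and b.
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    # Standard GP posterior mean and variance via Cholesky factorization.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mu, np.maximum(var, 0.0)

def gp_ucb_next(x_train, y_train, candidates, beta=2.0):
    # One myopic step: pick the candidate maximizing mu(x) + sqrt(beta)*sigma(x).
    mu, var = gp_posterior(x_train, y_train, candidates)
    return candidates[np.argmax(mu + np.sqrt(beta) * np.sqrt(var))]

f = lambda x: np.sin(3 * x)            # toy objective to optimize
x_train = np.array([0.1, 0.9])
y_train = f(x_train)
cands = np.linspace(0.0, 1.0, 101)
x_next = gp_ucb_next(x_train, y_train, cands, beta=2.0)
print(x_next)
```

The paper's contribution is replacing this one-step lookahead with a nonmyopic acquisition defined over sequences of macro-actions.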
Related papers
- Generalized Preference Optimization: A Unified Approach to Offline Alignment [54.97015778517253]
We propose generalized preference optimization (GPO), a family of offline losses parameterized by a general class of convex functions.
GPO enables a unified view over preference optimization, encompassing existing algorithms such as DPO, IPO and SLiC as special cases.
Our results present new algorithmic toolkits and empirical insights to alignment practitioners.
arXiv Detail & Related papers (2024-02-08T15:33:09Z) - Towards Efficient Exact Optimization of Language Model Alignment [93.39181634597877]
Direct preference optimization (DPO) was proposed to directly optimize the policy from preference data.
We show that DPO derived based on the optimal solution of problem leads to a compromised mean-seeking approximation of the optimal solution in practice.
We propose efficient exact optimization (EXO) of the alignment objective.
arXiv Detail & Related papers (2024-02-01T18:51:54Z) - Wasserstein Gradient Flows for Optimizing Gaussian Mixture Policies [0.0]
Policy optimization is the de facto paradigm to adapt robot policies as a function of task-specific objectives.
We propose to leverage the structure of probabilistic policies by casting the policy optimization as an optimal transport problem.
We evaluate our approach on common robotic settings: reaching motions, collision-avoidance behaviors, and multi-goal tasks.
arXiv Detail & Related papers (2023-05-17T17:48:24Z) - Bayesian Optimization for Macro Placement [48.55456716632735]
We develop a novel approach to macro placement using Bayesian optimization (BO) over sequence pairs.
BO is a machine learning technique that uses a probabilistic surrogate model and an acquisition function.
We demonstrate our algorithm on the fixed-outline macro placement problem with the half-perimeter wire length objective.
arXiv Detail & Related papers (2022-07-18T06:17:06Z) - Surrogate modeling for Bayesian optimization beyond a single Gaussian process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z) - Bayesian Optimization of Risk Measures [7.799648230758491]
We consider Bayesian optimization of objective functions of the form $\rho[F(x, W)]$, where $F$ is a black-box expensive-to-evaluate function.
We propose a family of novel Bayesian optimization algorithms that exploit the structure of the objective function to substantially improve sampling efficiency.
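The objective $\rho[F(x, W)]$ applies a risk measure $\rho$ over the environmental randomness $W$. As one concrete instance, here is a Monte-Carlo sketch estimating CVaR (mean of the worst alpha-fraction of losses) of a toy $F$; the function names and the toy model are illustrative assumptions, not the paper's algorithms.

```python
import numpy as np

# Monte-Carlo estimate of rho[F(x, W)] where rho is CVaR at level alpha.
# Everything here is an illustrative sketch with a toy F(x, W).

def cvar(samples, alpha=0.1):
    # CVaR of losses: mean of the largest alpha-fraction of samples.
    k = max(1, int(np.ceil(alpha * len(samples))))
    return np.mean(np.sort(samples)[-k:])

def rho_objective(x, n_samples=10_000, alpha=0.1, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n_samples)            # environmental randomness W
    losses = (x - 1.0) ** 2 + 0.5 * x * w     # toy black-box F(x, W)
    return cvar(losses, alpha)

val = rho_objective(0.5)
print(val)
```

A BO algorithm for risk measures would treat `rho_objective` as the expensive objective; the cited paper exploits the inner structure of $\rho[F(x, W)]$ rather than sampling it naively as above.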
arXiv Detail & Related papers (2020-07-10T18:20:46Z) - BOSH: Bayesian Optimization by Sampling Hierarchically [10.10241176664951]
We propose a novel BO routine pairing a hierarchical Gaussian process with an information-theoretic framework to generate a growing pool of realizations.
We demonstrate that BOSH provides more efficient and higher-precision optimization than standard BO across synthetic benchmarks, simulation optimization, reinforcement learning and hyperparameter tuning tasks.
arXiv Detail & Related papers (2020-07-02T07:35:49Z) - Likelihood-Free Inference with Deep Gaussian Processes [70.74203794847344]
Surrogate models have been successfully used in likelihood-free inference to decrease the number of simulator evaluations.
We propose a Deep Gaussian Process (DGP) surrogate model that can handle more irregularly behaved target distributions.
Our experiments show how DGPs can outperform GPs on objective functions with multimodal distributions and maintain a comparable performance in unimodal cases.
arXiv Detail & Related papers (2020-06-18T14:24:05Z) - Provably Efficient Exploration in Policy Optimization [117.09887790160406]
This paper proposes an Optimistic variant of the Proximal Policy Optimization algorithm (OPPO)
OPPO achieves $\tilde{O}(\sqrt{d^2 H^3 T})$ regret.
To the best of our knowledge, OPPO is the first provably efficient policy optimization algorithm that explores.
arXiv Detail & Related papers (2019-12-12T08:40:02Z)