Related papers: Stopping Bayesian Optimization with Probabilistic Regret Bounds

Principled Preferential Bayesian Optimization [22.269732173306192]
We study the problem of preferential Bayesian optimization (BO), where we aim to optimize a black-box function with only preference feedback over a pair of candidate solutions. An optimistic algorithm with an efficient computational method is then developed to solve the problem.
arXiv Detail & Related papers (2024-02-08T02:57:47Z)
Stochastic Nonsmooth Convex Optimization with Heavy-Tailed Noises: High-Probability Bound, In-Expectation Rate and Initial Distance Adaptation [22.758674468435302]
In a heavy-tailed noise regime, the difference between the stochastic gradient and the true gradient is assumed to have a finite $p$-th moment. This paper provides a comprehensive analysis of nonsmooth convex optimization with heavy-tailed noises.
arXiv Detail & Related papers (2023-03-22T03:05:28Z)
Estimating Optimal Policy Value in General Linear Contextual Bandits [50.008542459050155]
In many bandit problems, the maximal reward achievable by a policy is often unknown in advance. We consider the problem of estimating the optimal policy value in the sublinear data regime before the optimal policy is even learnable. We present a more practical, computationally efficient algorithm that estimates a problem-dependent upper bound on $V^*$.
arXiv Detail & Related papers (2023-02-19T01:09:24Z)
Deterministic Nonsmooth Nonconvex Optimization [94.01526844386977]
We show that randomization is necessary to obtain a dimension-free algorithm. Our algorithm yields the first deterministic dimension-free algorithm for optimizing ReLU networks.
arXiv Detail & Related papers (2023-02-16T13:57:19Z)
Bayesian Optimization with Conformal Prediction Sets [44.565812181545645]
Conformal prediction is an uncertainty quantification method with coverage guarantees even for misspecified models. We propose conformal Bayesian optimization, which directs queries towards regions of search space where the model predictions have guaranteed validity. In many cases we find that query coverage can be significantly improved without harming sample-efficiency.
arXiv Detail & Related papers (2022-10-22T17:01:05Z)
Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees [56.848265937921354]
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy. Many algorithms for IRL have an inherently nested structure. We develop a novel single-loop algorithm for IRL that does not compromise reward estimation accuracy.
arXiv Detail & Related papers (2022-10-04T17:13:45Z)
A Near-Optimal Algorithm for Univariate Zeroth-Order Budget Convex Optimization [4.608510640547952]
We prove near-optimal optimization error guarantees for Dyadic Search. We show that the classical dependence on the global Lipschitz constant in the error bounds is an artifact of the granularity of the budget.
arXiv Detail & Related papers (2022-08-13T19:57:04Z)
Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed confidence setting under a generative model. The lower bound, expressed as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z)
Misspecified Gaussian Process Bandit Optimization [59.30399661155574]
Kernelized bandit algorithms have shown strong empirical and theoretical performance for this problem. We introduce a misspecified kernelized bandit setting where the unknown function can be $\epsilon$-uniformly approximated by a function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS). We show that our algorithm achieves optimal dependence on $\epsilon$ with no prior knowledge of misspecification.
arXiv Detail & Related papers (2021-11-09T09:00:02Z)
Understanding the Effect of Stochasticity in Policy Optimization [86.7574122154668]
We show that the preferability of optimization methods depends critically on whether exact gradients are used. Second, to explain these findings we introduce the concept of committal rate for policy optimization. Third, we show that in the absence of external oracle information, there is an inherent trade-off between exploiting geometry to accelerate convergence versus achieving optimality almost surely.
arXiv Detail & Related papers (2021-10-29T06:35:44Z)
Navigating to the Best Policy in Markov Decision Processes [68.8204255655161]
We investigate the active pure exploration problem in Markov Decision Processes. The agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible.
arXiv Detail & Related papers (2021-06-05T09:16:28Z)
On the Optimality of Batch Policy Optimization Algorithms [106.89498352537682]
Batch policy optimization considers leveraging existing data for policy construction before interacting with an environment. We show that any confidence-adjusted index algorithm is minimax optimal, whether it be optimistic, pessimistic or neutral. We introduce a new weighted-minimax criterion that considers the inherent difficulty of optimal value prediction.
arXiv Detail & Related papers (2021-04-06T05:23:20Z)
Conservative Stochastic Optimization with Expectation Constraints [11.393603788068777]
This paper considers convex optimization problems where the objective and constraint functions involve expectations with respect to the data indices or environmental variables. Online and efficient approaches for solving such problems have not been widely studied. We propose a novel conservative stochastic optimization algorithm (CSOA) that achieves zero constraint violation and $O\left(T^{-\frac{1}{2}}\right)$ optimality gap.
arXiv Detail & Related papers (2020-08-13T08:56:24Z)
Necessary and Sufficient Conditions for Inverse Reinforcement Learning of Bayesian Stopping Time Problems [22.498689292081156]
This paper presents an inverse reinforcement learning (IRL) framework for Bayesian stopping time problems. By observing the actions of a Bayesian decision maker, we provide a necessary and sufficient condition to identify if these actions are consistent with optimizing a cost function. Our IRL algorithm identifies optimality and then constructs set-valued estimates of the cost function.
arXiv Detail & Related papers (2020-07-07T14:14:12Z)
Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations [54.42518331209581]
We design an algorithm that finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\| \le \epsilon$); the analysis also covers $(\epsilon,\gamma)$-approximate stationary points. Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z)
Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope. Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
A New Randomized Primal-Dual Algorithm for Convex Optimization with Optimal Last Iterate Rates [16.54912614895861]
We develop a novel unified randomized block-coordinate primal-dual algorithm to solve a class of nonsmooth constrained convex optimization problems. We prove that our algorithm achieves optimal convergence rates in two cases: general convexity and strong convexity. Our results show that the proposed method has encouraging performance on different experiments.
arXiv Detail & Related papers (2020-03-03T03:59:26Z)
Optimal $\delta$-Correct Best-Arm Selection for Heavy-Tailed Distributions [2.2940141855172036]
We consider the problem of identifying the arm with the maximum mean using a $\delta$-correct algorithm. Lower bounds for $\delta$-correct algorithms are well known. We propose a $\delta$-correct algorithm that matches the lower bound as $\delta$ reduces to zero.
arXiv Detail & Related papers (2019-08-24T05:31:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.