On the Finite-Time Performance of the Knowledge Gradient Algorithm
- URL: http://arxiv.org/abs/2206.06847v1
- Date: Tue, 14 Jun 2022 13:42:05 GMT
- Title: On the Finite-Time Performance of the Knowledge Gradient Algorithm
- Authors: Yanwen Li and Siyang Gao
- Abstract summary: The knowledge gradient (KG) algorithm is a popular and effective algorithm for the best arm identification (BAI) problem.
We present new theoretical results about the finite-time performance of the KG algorithm.
- Score: 1.713291434132985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The knowledge gradient (KG) algorithm is a popular and effective algorithm
for the best arm identification (BAI) problem. Due to the complex calculation
of KG, theoretical analysis of this algorithm is difficult, and existing
results mostly concern its asymptotic performance, e.g., consistency and
asymptotic sample allocation. In this research, we present new theoretical
results about the finite-time performance of the KG algorithm. Under
independent and normally distributed rewards, we derive lower bounds and upper
bounds for the probability of error and simple regret of the algorithm. With
these bounds, existing asymptotic results become simple corollaries. We also
show the performance of the algorithm for the multi-armed bandit (MAB) problem.
These developments not only extend the existing analysis of the KG algorithm,
but can also be used to analyze other improvement-based algorithms. Last, we
use numerical experiments to further demonstrate the finite-time behavior of
the KG algorithm.
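For arms with independent, normally distributed rewards (the setting of the abstract), the KG sampling rule can be sketched as follows. This is a minimal illustration under the additional assumption of a known observation variance; the function names are not the paper's notation:

```python
import math

def kg_values(mu, var, noise_var):
    """One-step knowledge-gradient value for each arm, assuming independent
    normal rewards with known observation variance.

    mu[i]     : current posterior mean of arm i
    var[i]    : current posterior variance of arm i (must be > 0)
    noise_var : known variance of a single reward observation
    """
    def pdf(z):  # standard normal density
        return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

    def cdf(z):  # standard normal CDF
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    values = []
    for i in range(len(mu)):
        # Std. dev. of the change in arm i's posterior mean after one sample.
        sigma_tilde = var[i] / math.sqrt(var[i] + noise_var)
        # Gap between arm i and the best of the other arms.
        delta = abs(mu[i] - max(mu[j] for j in range(len(mu)) if j != i))
        z = -delta / sigma_tilde
        # Expected one-step improvement: sigma_tilde * (z * Phi(z) + phi(z)).
        values.append(sigma_tilde * (z * cdf(z) + pdf(z)))
    return values

# KG samples the arm with the largest one-step value.
vals = kg_values([0.2, 0.5, 0.4], [1.0, 1.0, 1.0], 1.0)
best = max(range(len(vals)), key=lambda i: vals[i])
```

With equal posterior variances, the rule favors arms whose means are close to the current best, which is the improvement-based behavior the paper's bounds analyze.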
Related papers
- Iterative Quantum Algorithms for Maximum Independent Set: A Tale of Low-Depth Quantum Algorithms [0.0]
We study a new class of hybrid approaches to quantum optimization, termed Iterative Maximum Quantum Algorithms.
We show that for QAOA with depth $p=1$, this algorithm performs exactly the same operations and selections as the classical greedy algorithm for MIS.
arXiv Detail & Related papers (2023-09-22T18:00:03Z)
- Dual Algorithmic Reasoning [9.701208207491879]
We propose to learn algorithms by exploiting duality of the underlying algorithmic problem.
We demonstrate that simultaneously learning the dual definition of these optimisation problems in algorithmic learning allows for better learning.
We then validate the real-world utility of our dual algorithmic reasoner by deploying it on a challenging brain vessel classification task.
arXiv Detail & Related papers (2023-02-09T08:46:23Z)
- Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
By using this framework, our analysis shows these algorithms to match the computational complexity of methods that have access to an unbiased estimate of the gradient.
arXiv Detail & Related papers (2021-11-29T15:10:09Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded in terms of the complexity of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm that employs a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- Polynomial-Time Algorithms for Counting and Sampling Markov Equivalent DAGs [6.252236971703546]
Counting and uniform sampling of directed acyclic graphs (DAGs) are fundamental in causal analysis.
In this paper, we show that these tasks can be performed in polynomial time, solving a long-standing open problem in this area.
Our algorithms are effective and easily implementable.
arXiv Detail & Related papers (2020-12-17T15:47:15Z)
- An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
- The limits of min-max optimization algorithms: convergence to spurious non-critical sets [82.74514886461257]
Min-max optimization algorithms encounter far greater problems because of the existence of periodic cycles and similar phenomena.
We show that these algorithms may converge to spurious sets that contain no critical points of the problem.
We illustrate such challenges in simple examples.
arXiv Detail & Related papers (2020-06-16T10:49:27Z)
- Active Model Estimation in Markov Decision Processes [108.46146218973189]
We study the problem of efficient exploration in order to learn an accurate model of an environment, modeled as a Markov decision process (MDP).
We show that our Markov-based algorithm outperforms both our original algorithm and the maximum entropy algorithm in the small sample regime.
arXiv Detail & Related papers (2020-03-06T16:17:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.