Efficient and Interpretable Bandit Algorithms
- URL: http://arxiv.org/abs/2310.14751v2
- Date: Thu, 8 Feb 2024 22:37:36 GMT
- Title: Efficient and Interpretable Bandit Algorithms
- Authors: Subhojyoti Mukherjee, Ruihao Zhu, Branislav Kveton
- Abstract summary: A bandit algorithm is interpretable if it explores with the objective of reducing uncertainty in the unknown model parameter.
We propose CODE, a bandit algorithm based on a Constrained Optimal DEsign, that is interpretable and maximally reduces the uncertainty.
- Score: 18.99853072645046
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motivated by the importance of explainability in modern machine learning, we
design bandit algorithms that are efficient and interpretable. A bandit
algorithm is interpretable if it explores with the objective of reducing
uncertainty in the unknown model parameter. To quantify the interpretability,
we introduce a novel metric of model error, which compares the rate reduction
of the mean reward estimates to their actual means among all the plausible
actions. We propose CODE, a bandit algorithm based on a Constrained Optimal
DEsign, that is interpretable and maximally reduces the uncertainty. The key
idea in CODE is to explore among all plausible actions, determined by a
statistical constraint, to achieve interpretability. We implement CODE
efficiently in both multi-armed and linear bandits and derive near-optimal
regret bounds by leveraging the optimality criteria of the approximate optimal
design. CODE can be also viewed as removing phases in conventional phased
elimination, which makes it more practical and general. We demonstrate the
advantage of CODE by numerical experiments on both synthetic and real-world
problems. CODE outperforms other state-of-the-art interpretable designs while
matching the performance of popular but uninterpretable designs, such as upper
confidence bound algorithms.
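The key idea of exploring only among statistically plausible actions can be illustrated with a minimal sketch for the multi-armed case. This is an illustrative simplification, not the paper's exact algorithm: `plausible_set`, the confidence-radius formula, and the least-sampled tie-breaking rule are assumptions standing in for the paper's constrained optimal design.

```python
import math
import random

def plausible_set(means, counts, delta=0.05):
    """Arms whose upper confidence bound reaches the best lower bound."""
    radius = [math.sqrt(2 * math.log(1 / delta) / max(c, 1)) for c in counts]
    best_lcb = max(m - r for m, r in zip(means, radius))
    return [a for a in range(len(means)) if means[a] + radius[a] >= best_lcb]

def constrained_exploration(arms, horizon, seed=0):
    """Sketch: restrict play to the plausible set, then pull the
    least-sampled plausible arm to shrink the widest interval."""
    rng = random.Random(seed)
    k = len(arms)
    counts, means = [0] * k, [0.0] * k
    for t in range(horizon):
        if t < k:
            a = t  # initialization: pull each arm once
        else:
            a = min(plausible_set(means, counts), key=lambda i: counts[i])
        reward = 1.0 if rng.random() < arms[a] else 0.0  # Bernoulli reward
        counts[a] += 1
        means[a] += (reward - means[a]) / counts[a]  # running mean update
    return counts, means
```

Every pull is interpretable in the sense of the abstract: the chosen arm is always one that could still be optimal under the statistical constraint, and the pull directly reduces the largest remaining uncertainty.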
Related papers
- Enhancing Performance of Explainable AI Models with Constrained Concept Refinement [10.241134756773228]
The trade-off between accuracy and interpretability has long been a challenge in machine learning (ML).
In this paper, we investigate the impact of deviations in concept representations and propose a novel framework to mitigate these effects.
Compared to existing explainable methods, our approach not only improves prediction accuracy while preserving model interpretability across various large-scale benchmarks but also achieves this with significantly lower computational cost.
arXiv Detail & Related papers (2025-02-10T18:53:15Z) - UCB algorithms for multi-armed bandits: Precise regret and adaptive inference [6.349503549199403]
Upper Confidence Bound (UCB) algorithms are a widely-used class of sequential algorithms for the $K$-armed bandit problem.
We show that the Lai-Robbins regret formula is exact if and only if the sub-optimality gaps exceed the order $\sigma\sqrt{K\log T/T}$.
We also show that its maximal regret deviates from the minimax regret by a logarithmic factor, thereby settling its strict minimax optimality in the negative.
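For contrast with the constrained design above, the abstract's baseline is the classic UCB family. A minimal UCB1 sketch (the standard textbook index, not this paper's refined analysis):

```python
import math
import random

def ucb1(arms, horizon, seed=0):
    """UCB1: play the arm maximizing empirical mean + sqrt(2 ln t / n_i)."""
    rng = random.Random(seed)
    k = len(arms)
    counts, means = [0] * k, [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            a = t - 1  # pull each arm once to initialize
        else:
            a = max(range(k),
                    key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < arms[a] else 0.0  # Bernoulli reward
        counts[a] += 1
        means[a] += (reward - means[a]) / counts[a]
    return counts
```

UCB1 is efficient but, in the terminology of the main abstract, uninterpretable: its index mixes exploitation and exploration into a single score, so individual pulls do not correspond to an explicit uncertainty-reduction objective.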
arXiv Detail & Related papers (2024-12-09T01:14:02Z) - Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive
Coding Networks [65.34977803841007]
Predictive coding networks are neuroscience-inspired models with roots in both Bayesian statistics and neuroscience.
We show that simply changing the temporal scheduling of the update rule for the synaptic weights leads to an algorithm that is much more efficient and stable than the original one.
arXiv Detail & Related papers (2022-11-16T00:11:04Z) - Making Linear MDPs Practical via Contrastive Representation Learning [101.75885788118131]
It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.
We consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning.
We demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.
arXiv Detail & Related papers (2022-07-14T18:18:02Z) - Revisiting and Advancing Fast Adversarial Training Through The Lens of
Bi-Level Optimization [60.72410937614299]
We propose a new tractable bi-level optimization problem, and design and analyze a new set of algorithms termed Bi-level AT (FAST-BAT).
FAST-BAT is capable of defending against sign-based projected gradient descent (PGD) attacks without calling any gradient sign method or explicit robust regularization.
arXiv Detail & Related papers (2021-12-23T06:25:36Z) - Misspecified Gaussian Process Bandit Optimization [59.30399661155574]
Kernelized bandit algorithms have shown strong empirical and theoretical performance for this problem.
We introduce a misspecified kernelized bandit setting where the unknown function can be $\epsilon$-uniformly approximated by a function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS).
We show that our algorithm achieves optimal dependence on $\epsilon$ with no prior knowledge of misspecification.
arXiv Detail & Related papers (2021-11-09T09:00:02Z) - Amortized Conditional Normalized Maximum Likelihood: Reliable Out of
Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z) - Adaptive Discretization for Model-Based Reinforcement Learning [10.21634042036049]
We introduce the technique of adaptive discretization to design an efficient model-based episodic reinforcement learning algorithm.
Our algorithm is based on optimistic one-step value iteration extended to maintain an adaptive discretization of the space.
arXiv Detail & Related papers (2020-07-01T19:36:46Z) - Interpretable Random Forests via Rule Extraction [0.0]
We introduce SIRUS, a stable rule learning algorithm which takes the form of a short and simple list of rules.
Our R/C++ software implementation sirus is available from CRAN.
arXiv Detail & Related papers (2020-04-29T08:13:35Z) - Bandit algorithms to emulate human decision making using probabilistic
distortions [20.422725678982726]
We formulate two sample multi-armed bandit problems with distorted probabilities on the reward distributions.
We consider the aforementioned problems in the regret minimization as well as best arm identification framework for multi-armed bandits.
arXiv Detail & Related papers (2016-11-30T17:37:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.