Using Subjective Logic to Estimate Uncertainty in Multi-Armed Bandit Problems
- URL: http://arxiv.org/abs/2008.07386v1
- Date: Mon, 17 Aug 2020 14:53:17 GMT
- Title: Using Subjective Logic to Estimate Uncertainty in Multi-Armed Bandit Problems
- Authors: Fabio Massimo Zennaro, Audun Jøsang
- Abstract summary: We consider the formalism of subjective logic, a concise and expressive framework to express Dirichlet-multinomial models as subjective opinions.
We propose new algorithms grounded in subjective logic to tackle the multi-armed bandit problem.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The multi-armed bandit problem is a classical decision-making problem where
an agent has to learn an optimal action balancing exploration and exploitation.
Properly managing this trade-off requires a correct assessment of uncertainty;
in multi-armed bandits, as in other machine learning applications, it is
important to distinguish between stochasticity that is inherent to the system
(aleatoric uncertainty) and stochasticity that derives from the limited
knowledge of the agent (epistemic uncertainty). In this paper we consider the
formalism of subjective logic, a concise and expressive framework to express
Dirichlet-multinomial models as subjective opinions, and we apply it to the
problem of multi-armed bandits. We propose new algorithms grounded in
subjective logic to tackle the multi-armed bandit problem, we compare them
against classical algorithms from the literature, and we analyze the insights
they provide in evaluating the dynamics of uncertainty. Our preliminary results
suggest that subjective logic quantities enable useful assessment of
uncertainty that may be exploited by more refined agents.
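To make the abstract concrete: in binomial subjective logic, an opinion about a Bernoulli arm can be derived from evidence counts r (successes) and s (failures), with belief b = r/(r+s+W), disbelief d = s/(r+s+W), and uncertainty mass u = W/(r+s+W), where W is a prior weight (customarily W = 2). The sketch below is a minimal, hypothetical bandit agent built on these quantities; the uncertainty-weighted exploration bonus is an illustrative assumption, not the authors' exact algorithm.

```python
import random

W = 2.0  # non-informative prior weight, the customary default in subjective logic

class SubjectiveLogicArm:
    """Binomial subjective opinion backed by evidence counts (r, s)."""
    def __init__(self, base_rate=0.5):
        self.r = 0.0          # positive evidence (successes)
        self.s = 0.0          # negative evidence (failures)
        self.a = base_rate    # prior base rate

    def update(self, reward):
        if reward:
            self.r += 1.0
        else:
            self.s += 1.0

    @property
    def uncertainty(self):            # epistemic uncertainty mass u
        return W / (self.r + self.s + W)

    @property
    def projected_probability(self):  # P = b + a * u
        b = self.r / (self.r + self.s + W)
        return b + self.a * self.uncertainty

def choose_arm(arms, explore_weight=1.0):
    # Hypothetical rule: optimism proportional to remaining uncertainty mass.
    return max(range(len(arms)),
               key=lambda i: arms[i].projected_probability
                             + explore_weight * arms[i].uncertainty)

# Toy run on three Bernoulli arms.
true_means = [0.3, 0.5, 0.7]
arms = [SubjectiveLogicArm() for _ in true_means]
for t in range(1000):
    i = choose_arm(arms)
    arms[i].update(random.random() < true_means[i])
print([round(a.projected_probability, 3) for a in arms])
```

As the counts r + s grow, u shrinks toward zero and the projected probability converges to the empirical mean, which is exactly the epistemic-to-aleatoric transition the paper tracks.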
Related papers
- Pure Exploration under Mediators' Feedback [63.56002444692792]
Multi-armed bandits are a sequential-decision-making framework, where, at each interaction step, the learner selects an arm and observes a reward.
We consider the scenario in which the learner has access to a set of mediators, each of which selects the arms on the agent's behalf according to its own, and possibly unknown, policy.
We propose a sequential decision-making strategy for discovering the best arm under the assumption that the mediators' policies are known to the learner.
arXiv Detail & Related papers (2023-08-29T18:18:21Z)
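A minimal sketch of the mediator setting above, under the known-policies assumption; the selection rule (pick the mediator most likely to play the current empirical-best arm) is a hypothetical illustration, not the paper's strategy.

```python
import random

def best_arm_via_mediators(policies, true_means, horizon=5000):
    """policies[m][a]: known probability that mediator m pulls arm a."""
    n_arms = len(true_means)
    counts, sums = [0] * n_arms, [0.0] * n_arms
    for t in range(horizon):
        means = [sums[a] / counts[a] if counts[a] else float("inf")
                 for a in range(n_arms)]
        target = max(range(n_arms), key=lambda a: means[a])  # empirical best
        # Illustrative rule: mediator most likely to sample the target arm.
        m = max(range(len(policies)), key=lambda j: policies[j][target])
        arm = random.choices(range(n_arms), weights=policies[m])[0]
        reward = random.random() < true_means[arm]
        counts[arm] += 1
        sums[arm] += reward
    return max(range(n_arms), key=lambda a: sums[a] / max(counts[a], 1))

policies = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
print(best_arm_via_mediators(policies, [0.2, 0.5, 0.7]))
```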
- Concise and interpretable multi-label rule sets [13.416159628299779]
We develop a multi-label classifier that can be represented as a concise set of simple "if-then" rules.
Our method is able to find a small set of relevant patterns that lead to accurate multi-label classification.
arXiv Detail & Related papers (2022-10-04T11:23:50Z)
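To make the rule-set representation concrete, here is a minimal, hypothetical multi-label rule classifier of the kind the abstract describes; the rules and feature names are invented for illustration.

```python
# Each rule maps a boolean condition over a feature dict to a set of labels.
RULES = [
    (lambda x: x["has_wings"] and x["lays_eggs"],      {"bird"}),
    (lambda x: x["lives_in_water"],                    {"aquatic"}),
    (lambda x: x["has_wings"] and x["lives_in_water"], {"bird", "aquatic"}),
]

def predict(x, rules=RULES):
    """Union of labels over all matching rules (empty set if none fire)."""
    labels = set()
    for condition, rule_labels in rules:
        if condition(x):
            labels |= rule_labels
    return labels

print(predict({"has_wings": True, "lays_eggs": True, "lives_in_water": False}))
# {'bird'}
```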
- On the Complexity of Adversarial Decision Making [101.14158787665252]
We show that the Decision-Estimation Coefficient is necessary and sufficient to obtain low regret for adversarial decision making.
We provide new structural results that connect the Decision-Estimation Coefficient to variants of other well-known complexity measures.
arXiv Detail & Related papers (2022-06-27T06:20:37Z)
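For reference, the stochastic Decision-Estimation Coefficient of Foster et al. (2021), on which the adversarial variants in this paper build, takes the form below; here $\Pi$ is the decision space, $f^{M}(\pi)$ the mean reward of decision $\pi$ under model $M$, $\pi_{M}$ the optimal decision for $M$, and $D^{2}_{\mathrm{H}}$ the squared Hellinger distance. The adversarial definition studied in the paper itself differs in detail.

```latex
\mathrm{dec}_{\gamma}(\mathcal{M}, \bar{M})
  \;=\; \inf_{p \in \Delta(\Pi)} \, \sup_{M \in \mathcal{M}} \,
  \mathbb{E}_{\pi \sim p}\!\left[\,
    f^{M}(\pi_{M}) - f^{M}(\pi)
    \;-\; \gamma \, D^{2}_{\mathrm{H}}\!\big(M(\pi), \bar{M}(\pi)\big)
  \right]
```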
- Adversarial Robustness with Semi-Infinite Constrained Learning [177.42714838799924]
The sensitivity of deep learning to input perturbations has raised serious questions about its use in safety-critical domains.
We propose a hybrid Langevin Monte Carlo training approach to mitigate this issue.
We show that our approach can mitigate the trade-off between state-of-the-art performance and robustness.
arXiv Detail & Related papers (2021-10-29T13:30:42Z)
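As a rough illustration of the Langevin Monte Carlo idea above: the inner maximization over perturbations can be explored with noisy gradient ascent on the loss rather than deterministic projected gradient steps. The numpy sketch below, on a toy logistic model, is a hypothetical simplification, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic_loss_grad_x(w, x, y):
    """Gradient of log(1 + exp(-y * w.x)) with respect to the input x."""
    margin = y * (w @ x)
    return -y * w / (1.0 + np.exp(margin))

def langevin_perturbation(w, x, y, eps=0.3, eta=0.05, beta=10.0, steps=20):
    """Noisy (Langevin) ascent on the loss inside an L2 ball of radius eps."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        grad = logistic_loss_grad_x(w, x + delta, y)
        delta += eta * grad + np.sqrt(2.0 * eta / beta) * rng.normal(size=x.shape)
        norm = np.linalg.norm(delta)
        if norm > eps:                      # project back onto the ball
            delta *= eps / norm
    return x + delta

w = np.array([1.0, -2.0]); x = np.array([0.5, 0.2]); y = 1.0
print(langevin_perturbation(w, x, y))
```

The injected Gaussian noise lets the search visit many high-loss perturbations instead of a single worst case, which is the motivation for the hybrid approach the abstract mentions.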
- Ensemble-based Uncertainty Quantification: Bayesian versus Credal Inference [0.0]
We consider ensemble-based approaches to uncertainty quantification.
We specifically focus on Bayesian methods and approaches based on so-called credal sets.
The effectiveness of corresponding measures is evaluated and compared in an empirical study on classification with a reject option.
arXiv Detail & Related papers (2021-07-21T22:47:24Z)
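The two families of measures compared above can be illustrated on an ensemble of class-probability vectors: the Bayesian entropy decomposition (total = aleatoric + epistemic, the latter being the mutual information) versus a simple credal-set statistic such as the width of the per-class probability intervals. Both measures in this numpy sketch are standard choices, not necessarily the exact ones used in the paper.

```python
import numpy as np

def entropy(p, axis=-1):
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=axis)

def bayesian_decomposition(preds):
    """preds: (n_members, n_classes) ensemble of predictive distributions."""
    total = entropy(preds.mean(axis=0))          # entropy of the mixture
    aleatoric = entropy(preds, axis=-1).mean()   # expected member entropy
    return total, aleatoric, total - aleatoric   # epistemic = mutual information

def credal_interval_width(preds):
    """Treat the ensemble as a credal set; report the mean upper-lower gap."""
    return (preds.max(axis=0) - preds.min(axis=0)).mean()

preds = np.array([[0.7, 0.2, 0.1],
                  [0.5, 0.3, 0.2],
                  [0.9, 0.05, 0.05]])
print(bayesian_decomposition(preds))
print(credal_interval_width(preds))
```

Either quantity can then drive a reject option: abstain whenever the epistemic term (or interval width) exceeds a threshold, which is the evaluation protocol the abstract describes.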
- Confidence-Budget Matching for Sequential Budgeted Learning [69.77435313099366]
We formalize decision-making problems with a querying budget.
We consider multi-armed bandits, linear bandits, and reinforcement learning problems.
We show that CBM-based algorithms perform well in the presence of adversity.
arXiv Detail & Related papers (2021-02-05T19:56:31Z)
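A loose sketch of the budgeted-query bandit setting above: the agent pays from a query budget to observe rewards, and only queries while its confidence intervals are still wide relative to the budget spent. The querying rule below is a hypothetical simplification, not the paper's exact confidence-budget matching scheme.

```python
import math, random

def budgeted_ucb(true_means, horizon=2000, budget=300):
    n = len(true_means)
    counts, sums = [0] * n, [0.0] * n
    for t in range(1, horizon + 1):
        ucb = [sums[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a])
               if counts[a] else float("inf") for a in range(n)]
        arm = max(range(n), key=lambda a: ucb[a])
        reward = random.random() < true_means[arm]
        # Illustrative rule: spend a query only while budget remains and the
        # chosen arm's confidence radius is still large.
        radius = (math.sqrt(2 * math.log(t) / counts[arm])
                  if counts[arm] else float("inf"))
        if budget > 0 and radius > 0.05:
            budget -= 1
            counts[arm] += 1
            sums[arm] += reward
    return [sums[a] / max(counts[a], 1) for a in range(n)]

print(budgeted_ucb([0.3, 0.6, 0.5]))
```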
- An empirical evaluation of active inference in multi-armed bandits [0.0]
The active inference framework is distinguished by its sophisticated strategy for resolving the exploration-exploitation trade-off.
We derive an efficient and scalable approximate active inference algorithm and compare it to two state-of-the-art bandit algorithms.
arXiv Detail & Related papers (2021-01-21T16:20:06Z)
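In active inference, arms are typically scored by negative expected free energy, which combines extrinsic value (expected reward) with epistemic value (expected information gain). The sketch below uses the standard mutual-information gain of a Beta-Bernoulli arm as the epistemic term; the closed form and the additive weighting are common textbook choices, not necessarily those of the paper.

```python
import math
from scipy.special import digamma

def binary_entropy(p):
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def info_gain(alpha, beta):
    """Mutual information between the next reward and theta ~ Beta(alpha, beta)."""
    p = alpha / (alpha + beta)
    # E[H(Bernoulli(theta))] under the Beta posterior, via digamma identities.
    expected_h = -(p * (digamma(alpha + 1) - digamma(alpha + beta + 1))
                   + (1 - p) * (digamma(beta + 1) - digamma(alpha + beta + 1)))
    return binary_entropy(p) - expected_h

def neg_expected_free_energy(alpha, beta, epistemic_weight=1.0):
    extrinsic = alpha / (alpha + beta)              # expected reward
    return extrinsic + epistemic_weight * info_gain(alpha, beta)

# A well-explored arm versus an unexplored one with the same posterior mean:
# the unexplored arm scores higher through its epistemic term alone.
print(neg_expected_free_energy(50, 50), neg_expected_free_energy(1, 1))
```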
- Latent Bandits Revisited [55.88616813182679]
A latent bandit problem is one in which the learning agent knows the arm reward distributions conditioned on an unknown discrete latent state.
We propose general algorithms for this setting, based on both upper confidence bounds (UCBs) and Thompson sampling.
We provide a unified theoretical analysis of our algorithms, which have lower regret than classic bandit policies when the number of latent states is smaller than the number of actions.
arXiv Detail & Related papers (2020-06-15T19:24:02Z)
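A compact sketch of Thompson sampling in the latent setting above: the agent knows the Bernoulli reward means mu[s][a] for each latent state s, keeps a posterior over s, samples a state, and pulls that state's best arm. This minimal version is an illustration; the paper's algorithms and analysis are more general.

```python
import random

def latent_thompson(mu, true_state, horizon=2000):
    """mu[s][a]: known Bernoulli mean of arm a under latent state s."""
    n_states, n_arms = len(mu), len(mu[0])
    posterior = [1.0 / n_states] * n_states
    for t in range(horizon):
        s = random.choices(range(n_states), weights=posterior)[0]
        arm = max(range(n_arms), key=lambda a: mu[s][a])
        reward = random.random() < mu[true_state][arm]
        # Exact Bayesian update: P(s) proportional to P(reward | arm, s).
        lik = [mu[s2][arm] if reward else 1 - mu[s2][arm]
               for s2 in range(n_states)]
        posterior = [p * l for p, l in zip(posterior, lik)]
        z = sum(posterior)
        posterior = [p / z for p in posterior]
    return posterior

mu = [[0.8, 0.2, 0.3], [0.2, 0.7, 0.4]]
print(latent_thompson(mu, true_state=1))
```

Because the posterior lives over states rather than arms, regret scales with the number of latent states, which is the advantage the abstract claims over classic per-arm policies.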
- Robustness Guarantees for Mode Estimation with an Application to Bandits [131.21717367564963]
We introduce a theory for multi-armed bandits where the values are the modes of the reward distributions instead of the means.
We show in simulations that our algorithms are robust to perturbation of the arms by adversarial noise sequences.
arXiv Detail & Related papers (2020-03-05T21:29:27Z)
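For discrete rewards, a mode-based bandit can keep a histogram per arm and treat the most frequent reward value as the arm's value. The UCB-style exploration bonus in the sketch below is a hypothetical illustration of such an agent, not the paper's algorithm.

```python
import math, random
from collections import Counter

def mode_ucb(reward_dists, horizon=3000):
    """reward_dists[a]: (values, probs) of a discrete reward distribution."""
    n = len(reward_dists)
    hist = [Counter() for _ in range(n)]
    counts = [0] * n
    for t in range(1, horizon + 1):
        def score(a):
            if counts[a] == 0:
                return float("inf")
            mode_value = hist[a].most_common(1)[0][0]  # empirical mode
            return mode_value + math.sqrt(2 * math.log(t) / counts[a])
        a = max(range(n), key=score)
        values, probs = reward_dists[a]
        r = random.choices(values, weights=probs)[0]
        hist[a][r] += 1
        counts[a] += 1
    return [h.most_common(1)[0][0] for h in hist]

# Arm 0 has the higher mode (1.0) while arm 1 has the higher mean (0.72):
# a mode-based agent should prefer arm 0.
dists = [([0.0, 1.0], [0.45, 0.55]), ([0.0, 0.9], [0.2, 0.8])]
print(mode_ucb(dists))
```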