A Survey of Risk-Aware Multi-Armed Bandits
- URL: http://arxiv.org/abs/2205.05843v1
- Date: Thu, 12 May 2022 02:20:34 GMT
- Title: A Survey of Risk-Aware Multi-Armed Bandits
- Authors: Vincent Y. F. Tan and Prashanth L.A. and Krishna Jagannathan
- Abstract summary: We review various risk measures of interest, and comment on their properties.
We consider algorithms for the regret minimization setting, where the exploration-exploitation trade-off manifests.
We conclude by commenting on persisting challenges and fertile areas for future research.
- Score: 84.67376599822569
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In several applications such as clinical trials and financial portfolio
optimization, the expected value (or the average reward) does not
satisfactorily capture the merits of a drug or a portfolio. In such
applications, risk plays a crucial role, and a risk-aware performance measure
is preferable, so as to capture losses in the case of adverse events. This
survey aims to consolidate and summarise the existing research on risk
measures, specifically in the context of multi-armed bandits. We review various
risk measures of interest, and comment on their properties. Next, we review
existing concentration inequalities for various risk measures. Then, we proceed
to define risk-aware bandit problems. We consider algorithms for the regret
minimization setting, where the exploration-exploitation trade-off manifests,
as well as the best-arm identification setting, which is a pure exploration
problem -- both in the context of risk-sensitive measures. We conclude by
commenting on persisting challenges and fertile areas for future research.
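To make the notion of a risk-aware performance measure concrete, here is a minimal sketch of selecting an arm by its empirical Conditional Value at Risk (CVaR) rather than its mean. The two arm distributions, the sample sizes, and the greedy selection rule are illustrative assumptions, not an algorithm from the survey; the survey covers principled methods with regret guarantees.

```python
import numpy as np

def empirical_cvar(samples, alpha=0.05):
    """Empirical CVaR at level alpha: the mean of the worst
    alpha-fraction of observed rewards (lower tail)."""
    sorted_s = np.sort(np.asarray(samples))
    k = max(1, int(np.ceil(alpha * len(sorted_s))))
    return sorted_s[:k].mean()

rng = np.random.default_rng(0)
# Hypothetical arms: one safe, one with higher mean but heavy spread.
arms = [lambda n: rng.normal(1.0, 0.1, n),   # low mean, low risk
        lambda n: rng.normal(1.2, 2.0, n)]   # higher mean, high risk
histories = [arm(200) for arm in arms]
cvars = [empirical_cvar(h, alpha=0.1) for h in histories]
best = int(np.argmax(cvars))  # risk-aware choice: the safe arm
```

A mean-maximizing learner would prefer the second arm; the CVaR criterion prefers the first because its worst-case outcomes are far milder, which is exactly the distinction motivating risk-aware bandits.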
Related papers
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z) - SafeAR: Safe Algorithmic Recourse by Risk-Aware Policies [2.291948092032746]
We present a method to compute recourse policies that consider variability in cost.
We show how existing recourse desiderata can fail to capture the risk of higher costs.
arXiv Detail & Related papers (2023-08-23T18:12:11Z) - Eliciting Risk Aversion with Inverse Reinforcement Learning via
Interactive Questioning [0.0]
This paper proposes a novel framework for identifying an agent's risk aversion using interactive questioning.
We prove that the agent's risk aversion can be identified as the number of questions tends to infinity, provided the questions are randomly designed.
Our framework has important applications in robo-advising and provides a new approach for identifying an agent's risk preferences.
arXiv Detail & Related papers (2023-08-16T15:17:57Z) - Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz
Dynamic Risk Measures [23.46659319363579]
We present two model-based algorithms applied to Lipschitz dynamic risk measures.
Notably, our upper bounds demonstrate optimal dependencies on the number of actions and episodes.
arXiv Detail & Related papers (2023-06-04T16:24:19Z) - Risk-aware linear bandits with convex loss [0.0]
We propose an optimistic UCB algorithm to learn optimal risk-aware actions, with regret guarantees similar to those of generalized linear bandits.
This approach requires solving a convex problem at each round of the algorithm, which can be relaxed by accepting an approximate solution obtained via online gradient descent.
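The online gradient descent relaxation mentioned above can be sketched as a single projected update: step against the gradient of the round's convex loss, then project back onto the feasible set. The L2-ball constraint, step size, and function names here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def ogd_step(x, grad, eta, radius=1.0):
    """One projected online gradient descent step: move against the
    gradient, then project onto an L2 ball of the given radius (a
    common feasible set; the actual constraint set may differ)."""
    x = x - eta * grad
    norm = np.linalg.norm(x)
    if norm > radius:
        x = x * (radius / norm)
    return x
```

Running one such step per round replaces an exact convex solve with a single cheap update, at the cost of only approximating the round's minimizer.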
arXiv Detail & Related papers (2022-09-15T09:09:53Z) - Risk Perspective Exploration in Distributional Reinforcement Learning [10.441880303257468]
We present risk scheduling approaches that explore risk levels and optimistic behaviors from a risk perspective.
We demonstrate the performance enhancement of the DMIX algorithm using risk scheduling in a multi-agent setting.
arXiv Detail & Related papers (2022-06-28T17:37:34Z) - Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
arXiv Detail & Related papers (2022-05-10T19:40:52Z) - Off-Policy Evaluation of Slate Policies under Bayes Risk [70.10677881866047]
We study the problem of off-policy evaluation for slate bandits, for the typical case in which the logging policy factorizes over the slots of the slate.
We show that the risk improvement over PI grows linearly with the number of slots, and linearly with the gap between the arithmetic and the harmonic mean of a set of slot-level divergences.
arXiv Detail & Related papers (2021-01-05T20:07:56Z) - Risk-Constrained Thompson Sampling for CVaR Bandits [82.47796318548306]
We consider a popular risk measure in quantitative finance known as the Conditional Value at Risk (CVaR).
We explore the performance of a Thompson Sampling-based algorithm CVaR-TS under this risk measure.
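For orientation, the core Thompson Sampling step looks as follows: draw one sample from each arm's posterior and pull the arm with the largest draw. This is a generic Bernoulli-arm sketch with hypothetical counts, not the CVaR-TS algorithm itself, which instead maintains and samples from posteriors over each arm's CVaR.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical observed outcomes for two Bernoulli arms.
successes = np.array([8, 3])
failures  = np.array([2, 7])

# One TS round: a single draw from each Beta posterior, then a
# greedy choice on the sampled values.
theta = rng.beta(1 + successes, 1 + failures)
arm = int(np.argmax(theta))
```

Because exploration arises from posterior randomness rather than explicit bonuses, arms with uncertain risk estimates still get pulled occasionally, which is what a CVaR-aware variant exploits.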
arXiv Detail & Related papers (2020-11-16T15:53:22Z) - Learning Bounds for Risk-sensitive Learning [86.50262971918276]
In risk-sensitive learning, one aims to find a hypothesis that minimizes a risk-averse (or risk-seeking) measure of loss.
We study the generalization properties of risk-sensitive learning schemes whose optimand is described via optimized certainty equivalents.
arXiv Detail & Related papers (2020-06-15T05:25:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.