Distinguishing Risk Preferences using Repeated Gambles
- URL: http://arxiv.org/abs/2308.07054v1
- Date: Mon, 14 Aug 2023 10:27:58 GMT
- Title: Distinguishing Risk Preferences using Repeated Gambles
- Authors: James Price, Colm Connaughton
- Abstract summary: Sequences of repeated gambles provide an experimental tool to characterize risk preferences of humans or artificial decision-making agents.
We show that it becomes increasingly difficult to distinguish the risk preferences of agents as their wealth increases.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sequences of repeated gambles provide an experimental tool to characterize
the risk preferences of humans or artificial decision-making agents. The
difficulty of this inference depends on factors including the details of the
gambles offered and the number of iterations of the game played. In this paper
we explore in detail the practical challenges of inferring risk preferences
from the observed choices of artificial agents who are presented with finite
sequences of repeated gambles. We are motivated by the fact that the strategy
to maximize long-run wealth for sequences of repeated additive gambles (where
gains and losses are independent of current wealth) differs from the strategy
for repeated multiplicative gambles (where gains and losses are proportional
to current wealth). Accurate measurement of risk preferences would
be needed to tell whether an agent is employing the optimal strategy or not. To
generalize the types of gambles our agents face we use the Yeo-Johnson
transformation, a tool borrowed from feature engineering for time series
analysis, to construct a family of gambles that interpolates smoothly between
the additive and multiplicative cases. We then analyze the optimal strategy for
this family, both analytically and numerically. We find that it becomes
increasingly difficult to distinguish the risk preferences of agents as their
wealth increases. This is because agents with different risk preferences
eventually make the same decisions for sufficiently high wealth. We believe
that these findings are informative for the effective design of experiments to
measure risk preferences in humans.
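To make the interpolation concrete, the sketch below (not the authors' code) implements a Yeo-Johnson gamble family under the assumption that each round adds a fixed payoff on the transformed wealth scale; the function names and the restriction to non-negative wealth are illustrative choices. Setting $\lambda = 1$ recovers the additive case, while $\lambda = 0$ recovers the multiplicative case, where maximizing long-run wealth amounts to maximizing expected log-wealth (the Kelly-type strategy).

```python
import numpy as np

def yeo_johnson(w, lam):
    """Yeo-Johnson transform, restricted to non-negative wealth w >= 0."""
    return np.log1p(w) if lam == 0 else ((1.0 + w) ** lam - 1.0) / lam

def yeo_johnson_inv(y, lam):
    """Inverse of the transform on the same (non-negative) branch."""
    return np.expm1(y) if lam == 0 else (lam * y + 1.0) ** (1.0 / lam) - 1.0

def play_gamble(wealth, payoff, lam):
    """One round of a gamble that is additive on the transformed scale.

    lam = 1: additive gamble, wealth -> wealth + payoff.
    lam = 0: multiplicative gamble, (1 + wealth) grows by the factor exp(payoff).
    Intermediate lam interpolates smoothly between the two regimes.
    """
    return yeo_johnson_inv(yeo_johnson(wealth, lam) + payoff, lam)

print(play_gamble(100.0, 5.00, lam=1.0))  # 105.0   (additive)
print(play_gamble(100.0, 0.05, lam=0.0))  # ~105.18 (multiplicative)
print(play_gamble(100.0, 1.00, lam=0.5))  # ~110.3  (in between)
```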
Related papers
- Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction [55.77015419028725]
We develop methods that permit valid control of risk when threshold and tradeoff parameters are chosen adaptively.
Our methodology supports monotone and nearly-monotone risks, but otherwise makes no distributional assumptions.
arXiv Detail & Related papers (2024-03-28T17:28:06Z)
- Eliciting Risk Aversion with Inverse Reinforcement Learning via Interactive Questioning [0.0]
This paper proposes a novel framework for identifying an agent's risk aversion using interactive questioning.
We prove that the agent's risk aversion can be identified as the number of questions tends to infinity, provided the questions are randomly designed.
Our framework has important applications in robo-advising and provides a new approach for identifying an agent's risk preferences.
arXiv Detail & Related papers (2023-08-16T15:17:57Z)
- A Survey of Risk-Aware Multi-Armed Bandits [84.67376599822569]
We review various risk measures of interest, and comment on their properties.
We consider algorithms for the regret minimization setting, where the exploration-exploitation trade-off manifests.
We conclude by commenting on persisting challenges and fertile areas for future research.
arXiv Detail & Related papers (2022-05-12T02:20:34Z)
- Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
arXiv Detail & Related papers (2022-05-10T19:40:52Z)
- Risk Preferences of Learning Algorithms [0.0]
We show that a widely used learning algorithm, $\varepsilon$-Greedy, exhibits emergent risk aversion.
We discuss two methods to correct this bias.
arXiv Detail & Related papers (2022-05-10T01:30:24Z)
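The emergent risk aversion in the entry above is easy to reproduce. The following minimal simulation (my construction with illustrative arm parameters, not the paper's code) pits a deterministic safe arm against a noisy arm with the same mean: unlucky early draws depress the noisy arm's sample mean, greedy play then rarely revisits it, and the noisy arm ends up chosen well under half the time.

```python
import numpy as np

rng = np.random.default_rng(0)

def risky_share(risky_sd=1.0, n_rounds=2000, eps=0.1, n_runs=200):
    """Average fraction of pulls given to a noisy arm whose mean equals the
    safe arm's mean. A risk-neutral learner would approach 0.5;
    epsilon-Greedy systematically falls below it."""
    shares = []
    for _ in range(n_runs):
        counts = np.zeros(2)                 # arm 0: safe, arm 1: risky
        means = np.zeros(2)                  # running sample means
        sds = np.array([0.0, risky_sd])
        for _ in range(n_rounds):
            if counts.min() == 0 or rng.random() < eps:
                a = int(rng.integers(2))     # explore uniformly
            else:
                a = int(np.argmax(means))    # exploit the current best estimate
            reward = 1.0 + sds[a] * rng.standard_normal()  # both means are 1.0
            counts[a] += 1
            means[a] += (reward - means[a]) / counts[a]    # incremental mean
        shares.append(counts[1] / n_rounds)
    return float(np.mean(shares))

print(risky_share())  # noticeably below 0.5: emergent risk aversion
```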
- Two steps to risk sensitivity [4.974890682815778]
Conditional value-at-risk (CVaR) is a risk measure for modeling human and animal planning.
We adopt a conventional distributional approach to CVaR in a sequential setting and reanalyze the choices of human decision-makers.
We then consider a further critical property of risk sensitivity, namely time consistency, showing alternatives to this form of CVaR.
arXiv Detail & Related papers (2021-11-12T16:27:47Z)
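For reference, with $Z$ a loss variable and $\alpha \in (0, 1)$ the tail probability, CVaR has the standard Rockafellar-Uryasev form below (sign and level conventions vary across papers, so the one used in this work may differ):

```latex
\mathrm{CVaR}_{\alpha}(Z)
  = \inf_{b \in \mathbb{R}}
    \Big\{\, b + \tfrac{1}{\alpha}\,\mathbb{E}\big[(Z - b)^{+}\big] \,\Big\}
  = \mathbb{E}\big[\, Z \mid Z \ge q_{1-\alpha}(Z) \,\big],
```

where $q_{1-\alpha}$ is the $(1-\alpha)$-quantile of $Z$ and the second equality holds for continuous $Z$.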
- Addressing the Long-term Impact of ML Decisions via Policy Regret [49.92903850297013]
We study a setting in which the reward from each arm evolves every time the decision-maker pulls that arm.
We argue that an acceptable sequential allocation of opportunities must take an arm's potential for growth into account.
We present an algorithm with provably sub-linear policy regret for sufficiently long time horizons.
arXiv Detail & Related papers (2021-06-02T17:38:10Z)
- Continuous Mean-Covariance Bandits [39.820490484375156]
We propose a novel Continuous Mean-Covariance Bandit (CMCB) model that takes option correlation into account.
In CMCB, a learner sequentially chooses weight vectors over the given options and observes random feedback that depends on those decisions.
We propose novel algorithms with optimal regrets (within logarithmic factors) and provide matching lower bounds to validate their optimality.
arXiv Detail & Related papers (2021-02-24T06:37:05Z)
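Concretely, in my notation rather than necessarily the paper's: if the options' random returns $X_t$ have mean $\theta$ and covariance $\Sigma$, a weight vector $w$ earns $w^{\top} X_t$, and a mean-covariance objective trades off the two moments as, for example,

```latex
\max_{w} \;\; w^{\top}\theta \;-\; \rho\, w^{\top}\Sigma\, w,
```

with $\rho \ge 0$ an assumed risk-aversion parameter; regret is then measured against the best fixed $w$ for this objective.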
- Option Hedging with Risk Averse Reinforcement Learning [34.85783251852863]
We show how risk-averse reinforcement learning can be used to hedge options.
We apply a state-of-the-art risk-averse algorithm to a vanilla option hedging environment.
arXiv Detail & Related papers (2020-10-23T09:08:24Z)
- Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret [115.85354306623368]
We study risk-sensitive reinforcement learning in episodic Markov decision processes with unknown transition kernels.
We propose two provably efficient model-free algorithms, Risk-Sensitive Value Iteration (RSVI) and Risk-Sensitive Q-learning (RSQ).
We prove that RSVI attains an $\tilde{O}\big(\lambda(|\beta| H^2) \cdot \sqrt{H^3 S^2 A T}\big)$ regret, while RSQ attains an $\tilde{O}\big(\lambda(|\beta| H^2) \cdot \sqrt{H^4 S A T}\big)$ regret, where $\lambda(u) := (e^{3u} - 1)/u$.
arXiv Detail & Related papers (2020-06-22T19:28:26Z)
- Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning [59.62721526353915]
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities.
Our method aims to leverage these commonalities by asking the question: "What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?"
arXiv Detail & Related papers (2020-06-07T18:28:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.