A Distribution Optimization Framework for Confidence Bounds of Risk
Measures
- URL: http://arxiv.org/abs/2306.07059v1
- Date: Mon, 12 Jun 2023 12:13:06 GMT
- Title: A Distribution Optimization Framework for Confidence Bounds of Risk
Measures
- Authors: Hao Liang, Zhi-quan Luo
- Abstract summary: We present a distribution optimization framework that significantly improves confidence bounds for various risk measures compared to previous methods.
Our framework encompasses popular risk measures such as the entropic risk measure, conditional value at risk (CVaR), spectral risk measure, distortion risk measure, certainty equivalent, and rank-dependent expected utility.
- Score: 23.46659319363579
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a distribution optimization framework that significantly improves
confidence bounds for various risk measures compared to previous methods. Our
framework encompasses popular risk measures such as the entropic risk measure,
conditional value at risk (CVaR), spectral risk measure, distortion risk
measure, certainty equivalent, and rank-dependent expected utility, which are
well established in risk-sensitive decision-making literature. To achieve this,
we introduce two estimation schemes based on concentration bounds derived from
the empirical distribution, specifically using either the Wasserstein distance
or the supremum distance. Unlike traditional approaches that add or subtract a
confidence radius from the empirical risk measures, our proposed schemes
evaluate a specific transformation of the empirical distribution based on the
distance. Consequently, our confidence bounds consistently yield tighter
results compared to previous methods. We further verify the efficacy of the
proposed framework by providing a tighter problem-dependent regret bound for the
CVaR bandit.
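As a rough illustration of the supremum-distance scheme, the sketch below forms a lower confidence bound on CVaR by evaluating it on a DKW-shifted empirical distribution instead of subtracting a radius from the empirical CVaR. The bounded support [support_min, 1], the function names, and the specific mass-shifting rule are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def cvar_discrete(values, weights, alpha):
    """CVaR at level alpha (mean of the worst alpha-fraction) of a discrete
    distribution given by (values, weights); weights must sum to 1."""
    order = np.argsort(values)
    v, w = np.asarray(values, float)[order], np.asarray(weights, float)[order]
    remaining, total = alpha, 0.0
    for vi, wi in zip(v, w):
        take = min(wi, remaining)
        total += take * vi
        remaining -= take
        if remaining <= 1e-12:
            break
    return total / alpha

def cvar_lower_bound(samples, alpha, delta, support_min=0.0):
    """Lower confidence bound on CVaR_alpha for rewards assumed to lie above
    support_min: shift the empirical CDF up by the DKW radius, i.e. move
    probability mass eps from the largest samples down to support_min, then
    evaluate CVaR on that transformed distribution."""
    x = np.sort(np.asarray(samples, float))
    n = len(x)
    eps = np.sqrt(np.log(2.0 / delta) / (2.0 * n))  # DKW confidence radius
    w = np.full(n, 1.0 / n)
    moved = 0.0
    for i in range(n - 1, -1, -1):      # strip mass eps from the top samples
        take = min(w[i], eps - moved)
        w[i] -= take
        moved += take
        if moved >= eps:
            break
    values = np.concatenate(([support_min], x))
    weights = np.concatenate(([moved], w))
    return cvar_discrete(values, weights, alpha)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rewards = rng.beta(2.0, 5.0, size=500)
    print(cvar_discrete(np.sort(rewards), np.full(500, 1 / 500), alpha=0.1))  # empirical CVaR
    print(cvar_lower_bound(rewards, alpha=0.1, delta=0.05))                   # DKW-based lower bound
```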
Related papers
- Data-driven decision-making under uncertainty with entropic risk measure [5.407319151576265]
The entropic risk measure is widely used in high-stakes decision making to account for tail risks associated with an uncertain loss.
To debias the empirical entropic risk estimator, we propose a strongly consistent bootstrapping procedure.
We show that cross validation methods can result in significantly higher out-of-sample risk for the insurer if the bias in validation performance is not corrected for.
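A generic bootstrap bias correction for the plug-in entropic risk estimator is sketched below, using the standard definition rho_theta(X) = (1/theta) * log E[exp(theta * X)]; the paper's strongly consistent procedure may differ in how resamples are drawn and combined.

```python
import numpy as np

def entropic_risk(losses, theta):
    """Plug-in estimator (1/theta) * log( mean exp(theta * X) ), computed stably."""
    m = theta * np.asarray(losses, dtype=float)
    return (np.logaddexp.reduce(m) - np.log(len(m))) / theta

def debiased_entropic_risk(losses, theta, n_boot=2000, seed=0):
    """Generic bootstrap bias correction: estimate the plug-in estimator's bias
    from resamples and subtract it (illustrative, not the paper's procedure)."""
    rng = np.random.default_rng(seed)
    losses = np.asarray(losses, dtype=float)
    plug_in = entropic_risk(losses, theta)
    boot = [entropic_risk(rng.choice(losses, size=len(losses), replace=True), theta)
            for _ in range(n_boot)]
    return plug_in - (np.mean(boot) - plug_in)   # = 2 * plug_in - mean(boot)
```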
arXiv Detail & Related papers (2024-09-30T04:02:52Z)
- Predictive Uncertainty Quantification via Risk Decompositions for Strictly Proper Scoring Rules [7.0549244915538765]
Uncertainty quantification in predictive modeling often relies on ad hoc methods.
This paper introduces a theoretical approach to understanding uncertainty through statistical risks.
We show how to split pointwise risk into Bayes risk and excess risk.
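In standard notation (ours, not necessarily the paper's), the pointwise decomposition reads:

```latex
\[
\underbrace{\mathbb{E}_{Y \mid X = x}\!\left[\ell\big(f(x), Y\big)\right]}_{\text{pointwise risk of } f}
=
\underbrace{\inf_{g}\ \mathbb{E}_{Y \mid X = x}\!\left[\ell\big(g, Y\big)\right]}_{\text{Bayes risk (irreducible)}}
+
\underbrace{\mathbb{E}_{Y \mid X = x}\!\left[\ell\big(f(x), Y\big)\right] - \inf_{g}\ \mathbb{E}_{Y \mid X = x}\!\left[\ell\big(g, Y\big)\right]}_{\text{excess risk}} .
\]
```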
arXiv Detail & Related papers (2024-02-16T14:40:22Z)
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
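For context, uncertainty Bellman equations in this line of work propagate a local uncertainty term u(s, a) through a Bellman-style recursion of roughly the following form (a generic sketch; the paper's exact equation, whose solution converges to the posterior variance of values, differs):

```latex
\[
U(s, a) \;=\; u(s, a) \;+\; \gamma^{2}\, \mathbb{E}_{s' \sim P(\cdot \mid s, a),\, a' \sim \pi(\cdot \mid s')}\!\left[ U(s', a') \right].
\]
```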
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
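A minimal sketch of a likelihood-ratio confidence sequence for a Gaussian mean with known variance follows; the running-mean plug-in, the grid, and sigma are illustrative choices rather than the paper's method.

```python
import numpy as np
from scipy.stats import norm

def lr_confidence_set(samples, alpha=0.05, sigma=1.0, grid=None):
    """Anytime-valid confidence set {theta : LR_t(theta) < 1/alpha}. The
    likelihood-ratio numerator uses a predictable plug-in estimate (the running
    mean of strictly past samples), so LR_t evaluated at the true mean is a
    nonnegative martingale and Ville's inequality gives coverage at every t."""
    x = np.asarray(samples, dtype=float)
    if grid is None:
        grid = np.linspace(x.mean() - 3.0, x.mean() + 3.0, 601)
    log_lr = np.zeros_like(grid)
    running_mean = 0.0                       # prior guess before any data
    for i, xi in enumerate(x):
        log_lr += norm.logpdf(xi, running_mean, sigma) - norm.logpdf(xi, grid, sigma)
        running_mean = x[: i + 1].mean()     # update only after observing xi
    return grid[log_lr < np.log(1.0 / alpha)]
```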
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion [9.35556128467037]
We present a novel distributional reinforcement learning algorithm that selects actions by randomizing the risk criterion, which avoids a one-sided tendency toward risk.
Our theoretical results show that the proposed method does not suffer from biased exploration and is guaranteed to converge to an optimal return.
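A toy version of the action-selection idea, assuming quantile-based return estimates (as in QR-DQN); the sampling distribution over risk levels and the CVaR computation are illustrative, not the paper's exact scheme.

```python
import numpy as np

def select_action(return_quantiles, rng):
    """return_quantiles: array [n_actions, n_quantiles] of return-quantile
    estimates, sorted ascending per action. Draw a random risk level and act
    greedily with respect to CVaR at that level."""
    alpha = rng.uniform(0.0, 1.0)                       # randomized risk criterion
    k = max(1, int(np.ceil(alpha * return_quantiles.shape[1])))
    cvar_per_action = return_quantiles[:, :k].mean(axis=1)
    return int(np.argmax(cvar_per_action))
```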
arXiv Detail & Related papers (2023-10-25T10:53:04Z)
- Domain Generalization without Excess Empirical Risk [83.26052467843725]
A common approach is designing a data-driven surrogate penalty to capture generalization and minimize the empirical risk jointly with the penalty.
We argue that a significant failure mode of this recipe is an excess risk due to an erroneous penalty or hardness in joint optimization.
We present an approach that eliminates this problem. Instead of jointly minimizing empirical risk with the penalty, we minimize the penalty under the constraint of optimality of the empirical risk.
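In symbols (a schematic reading, with the empirical risk written as R-hat and the generalization penalty as Omega), the shift is from the joint objective to the constrained one:

```latex
\[
\min_{f}\ \hat{R}(f) + \lambda\, \Omega(f)
\qquad \longrightarrow \qquad
\min_{f}\ \Omega(f) \quad \text{s.t.} \quad \hat{R}(f) \le \min_{f'} \hat{R}(f').
\]
```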
arXiv Detail & Related papers (2023-08-30T08:46:46Z)
- A Risk-Sensitive Approach to Policy Optimization [21.684251937825234]
Standard deep reinforcement learning (DRL) aims to maximize expected reward, considering collected experiences equally in formulating a policy.
We propose a more direct approach whereby risk-sensitive objectives, specified in terms of the cumulative distribution function (CDF) of the distribution of full-episode rewards, are optimized.
We demonstrate that the use of moderately "pessimistic" risk profiles, which emphasize scenarios where the agent performs poorly, leads to enhanced exploration and a continual focus on addressing deficiencies.
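A rough sketch of reweighting full-episode returns by a pessimistic distortion of their empirical CDF ranks; the distortion family and the pessimism parameter are illustrative assumptions, not the paper's CDF-based objective.

```python
import numpy as np

def pessimistic_episode_weights(returns, pessimism=3.0):
    """Give larger weight to episodes with low returns. ranks lie in [0, 1]:
    0 = worst episode, 1 = best; the exponent controls how sharply the
    weighting emphasizes poor outcomes."""
    returns = np.asarray(returns, dtype=float)
    n = len(returns)
    ranks = np.argsort(np.argsort(returns)) / max(n - 1, 1)
    weights = (1.0 - ranks) ** pessimism        # illustrative distortion of the CDF
    return weights / weights.sum()

# Each episode's policy-gradient contribution would then be scaled by its weight.
```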
arXiv Detail & Related papers (2022-08-19T00:55:05Z)
- Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
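One way to read the soft risk idea is as a schedule that starts risk-neutral and anneals the CVaR level toward its target so that early training is not blind to successful trajectories; the schedule below is a hedged sketch, not the paper's mechanism.

```python
def soft_risk_level(step, total_steps, target_alpha=0.05, anneal_frac=0.5):
    """Anneal the CVaR level from 1.0 (risk-neutral) to target_alpha over the
    first anneal_frac of training, then hold it fixed."""
    progress = min(step / (anneal_frac * total_steps), 1.0)
    return 1.0 - progress * (1.0 - target_alpha)
```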
arXiv Detail & Related papers (2022-05-10T19:40:52Z)
- Reliable Off-policy Evaluation for Reinforcement Learning [53.486680020852724]
In a sequential decision-making problem, off-policy evaluation estimates the expected cumulative reward of a target policy.
We propose a novel framework that provides robust and optimistic cumulative reward estimates using one or more logged datasets.
arXiv Detail & Related papers (2020-11-08T23:16:19Z)
- Learning Bounds for Risk-sensitive Learning [86.50262971918276]
In risk-sensitive learning, one aims to find a hypothesis that minimizes a risk-averse (or risk-seeking) measure of loss.
We study the generalization properties of risk-sensitive learning schemes whose optimand is described via optimized certainty equivalents.
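For reference, the optimized certainty equivalent of a loss X with a normalized convex disutility phi is

```latex
\[
\mathrm{OCE}_{\phi}(X) \;=\; \inf_{\lambda \in \mathbb{R}} \Big\{ \lambda + \mathbb{E}\big[\phi(X - \lambda)\big] \Big\},
\]
```

with phi(t) = max(t, 0)/alpha recovering CVaR at level alpha and phi(t) = (e^{theta t} - 1)/theta recovering the entropic risk (sign conventions vary across papers).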
arXiv Detail & Related papers (2020-06-15T05:25:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.