Extreme Risk Mitigation in Reinforcement Learning using Extreme Value
Theory
- URL: http://arxiv.org/abs/2308.13011v1
- Date: Thu, 24 Aug 2023 18:23:59 GMT
- Title: Extreme Risk Mitigation in Reinforcement Learning using Extreme Value
Theory
- Authors: Karthik Somayaji NS, Yu Wang, Malachi Schram, Jan Drgona, Mahantesh
Halappanavar, Frank Liu, Peng Li
- Abstract summary: A critical aspect of risk awareness involves modeling highly rare risk events (rewards) that could potentially lead to catastrophic outcomes.
While risk-aware RL techniques do exist, their level of risk aversion heavily relies on the precision of the state-action value function estimation.
Our work proposes to enhance the resilience of RL agents when faced with very rare and risky events by focusing on refining the predictions of the extreme values predicted by the state-action value function distribution.
- Score: 10.288413564829579
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Risk-sensitive reinforcement learning (RL) has garnered significant attention
in recent years due to the growing interest in deploying RL agents in
real-world scenarios. A critical aspect of risk awareness involves modeling
highly rare risk events (rewards) that could potentially lead to catastrophic
outcomes. These infrequent occurrences present a formidable challenge for
data-driven methods aiming to capture such risky events accurately. While
risk-aware RL techniques do exist, their level of risk aversion heavily relies
on the precision of the state-action value function estimation when modeling
these rare occurrences. Our work proposes to enhance the resilience of RL
agents when faced with very rare and risky events by refining the predicted
extreme values of the state-action value function distribution. To achieve
this, we formulate the extreme values of the
state-action value function distribution as parameterized distributions,
drawing inspiration from the principles of extreme value theory (EVT). This
approach effectively addresses the issue of infrequent occurrence by leveraging
EVT-based parameterization. Importantly, we theoretically demonstrate the
advantages of employing these parameterized distributions in contrast to other
risk-averse algorithms. Our evaluations show that the proposed method
outperforms other risk-averse RL algorithms on a diverse range of benchmark
tasks, each encompassing distinct risk scenarios.
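The abstract does not spell out the parameterization, but the standard EVT recipe (the Pickands-Balkema-de Haan theorem) models exceedances over a high threshold with a Generalized Pareto Distribution (GPD). The sketch below illustrates that general mechanism on synthetic return samples using SciPy's genpareto; the threshold, quantile level, and return distribution are placeholder assumptions, not the authors' implementation.

```python
# Minimal EVT sketch (not the paper's code): fit a GPD to the lower tail of a
# return distribution and estimate an extreme loss quantile from it.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
returns = rng.normal(loc=1.0, scale=0.5, size=10_000)  # stand-in for sampled returns

# Risk concerns the worst outcomes, so work with losses = -returns and model
# exceedances above a high loss threshold u (peaks-over-threshold).
losses = -returns
u = np.quantile(losses, 0.95)            # tail threshold (hyperparameter)
exceedances = losses[losses > u] - u

# Fit the GPD shape (xi) and scale (sigma); the location is fixed at 0.
xi, _, sigma = genpareto.fit(exceedances, floc=0.0)

# Extreme loss quantile via the fitted tail (valid for xi != 0; the xi -> 0
# limit would use an exponential-tail formula instead).
p = 0.999
n, n_u = len(losses), len(exceedances)
q_loss = u + (sigma / xi) * ((n / n_u * (1 - p)) ** (-xi) - 1)
print(f"GPD tail estimate of the {p:.1%} loss quantile: {q_loss:.3f}")
print(f"Empirical estimate for comparison:             {np.quantile(losses, p):.3f}")
```

The appeal of such a parameterization is exactly what the abstract argues: the fitted tail extrapolates to quantiles rarer than the data directly supports, whereas an empirical estimate is limited to the handful of extreme samples actually observed.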
Related papers
- Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty [5.710971447109951]
This paper studies continuous-time risk-sensitive reinforcement learning (RL)
I highlight that the conventional policy gradient representation is inadequate for risk-sensitive problems due to the nonlinear nature of quadratic variation.
I prove the convergence of the proposed algorithm for Merton's investment problem and quantify the impact of temperature parameter on the behavior of the learning procedure.
arXiv Detail & Related papers (2024-04-19T03:05:41Z) - Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction [55.77015419028725]
We develop methods that permit valid control of risk when threshold and tradeoff parameters are chosen adaptively.
Our methodology supports monotone and nearly-monotone risks, but otherwise makes no distributional assumptions.
arXiv Detail & Related papers (2024-03-28T17:28:06Z) - Provable Risk-Sensitive Distributional Reinforcement Learning with
General Function Approximation [54.61816424792866]
We introduce a general framework on Risk-Sensitive Distributional Reinforcement Learning (RS-DisRL), with static Lipschitz Risk Measures (LRM) and general function approximation.
We design two innovative meta-algorithms: RS-DisRL-M, a model-based strategy for model-based function approximation, and RS-DisRL-V, a model-free approach for general value function approximation.
arXiv Detail & Related papers (2024-02-28T08:43:18Z) - Risk-Aware Reinforcement Learning through Optimal Transport Theory [4.8951183832371]
This paper pioneers the integration of Optimal Transport theory with reinforcement learning (RL) to create a risk-aware framework.
Our approach modifies the objective function, ensuring that the resulting policy not only maximizes expected rewards but also respects risk constraints dictated by OT distances.
Our contributions are substantiated with a series of theorems, mapping the relationships between risk distributions, optimal value functions, and policy behaviors.
arXiv Detail & Related papers (2023-09-12T13:55:01Z) - Provably Efficient Iterated CVaR Reinforcement Learning with Function
Approximation and Human Feedback [57.6775169085215]
Risk-sensitive reinforcement learning aims to optimize policies that balance the expected reward and risk.
We present a novel framework that employs an Iterated Conditional Value-at-Risk (CVaR) objective under both linear and general function approximations.
We propose provably sample-efficient algorithms for this Iterated CVaR RL and provide rigorous theoretical analysis.
arXiv Detail & Related papers (2023-07-06T08:14:54Z) - One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based
Offline Reinforcement Learning [25.218430053391884]
We propose risk-sensitivity as a mechanism to jointly address both of these issues.
Risk-aversion to aleatoric uncertainty discourages actions that may result in poor outcomes due to environment stochasticity.
Our experiments show that our algorithm achieves competitive performance on deterministic benchmarks.
arXiv Detail & Related papers (2022-11-30T21:24:11Z) - RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk [28.811725782388688]
We propose and analyze a new framework to jointly model the risk associated with uncertainties in finite-horizon and discounted infinite-horizon MDPs.
We show that when the risk-aversion is defined using either EVaR or the entropic risk, the optimal policy in RASR can be computed efficiently using a new dynamic program formulation with a time-dependent risk level (CVaR, EVaR, and entropic risk are sketched numerically after this list).
arXiv Detail & Related papers (2022-09-09T00:34:58Z) - Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
arXiv Detail & Related papers (2022-05-10T19:40:52Z) - Automatic Risk Adaptation in Distributional Reinforcement Learning [26.113528145137497]
The use of Reinforcement Learning (RL) agents in practical applications requires the consideration of suboptimal outcomes.
This is especially important in safety-critical environments, where errors can lead to high costs or damage.
We show reduced failure rates by up to a factor of 7 and improved generalization performance by up to 14% compared to both risk-aware and risk-agnostic agents.
arXiv Detail & Related papers (2021-06-11T11:31:04Z) - Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds
Globally Optimal Policy [95.98698822755227]
We make the first attempt to study risk-sensitive deep reinforcement learning under the average reward setting with the variance risk criteria.
We propose an actor-critic algorithm that iteratively and efficiently updates the policy, the Lagrange multiplier, and the Fenchel dual variable.
arXiv Detail & Related papers (2020-12-28T05:02:26Z) - Learning Bounds for Risk-sensitive Learning [86.50262971918276]
In risk-sensitive learning, one aims to find a hypothesis that minimizes a risk-averse (or risk-seeking) measure of loss.
We study the generalization properties of risk-sensitive learning schemes whose optimand is described via optimized certainty equivalents.
arXiv Detail & Related papers (2020-06-15T05:25:02Z)
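Several entries above optimize or constrain standard risk measures, namely CVaR, entropic risk, and EVaR. The following is a minimal, paper-agnostic sketch of how each can be estimated from a sample of returns; the distribution, risk levels, and the beta grid used to approximate EVaR's supremum are placeholder choices rather than values taken from any of the papers.

```python
# Paper-agnostic sketch: three common risk measures evaluated on return samples,
# each reported as a risk-adjusted (lower is more conservative) return.
import numpy as np

def cvar(returns, alpha=0.05):
    """Conditional Value-at-Risk: mean of the worst alpha-fraction of returns."""
    cutoff = np.quantile(returns, alpha)        # the alpha-level Value-at-Risk
    return returns[returns <= cutoff].mean()

def entropic_risk(returns, beta=1.0):
    """Entropic risk (certainty equivalent): -(1/beta) * log E[exp(-beta * R)]."""
    return -np.log(np.mean(np.exp(-beta * returns))) / beta

def evar(returns, alpha=0.05, betas=np.logspace(-3, 2, 200)):
    """Entropic Value-at-Risk: tightest Chernoff-style bound at level alpha,
    obtained as a supremum over beta (approximated here on a fixed grid)."""
    return max(entropic_risk(returns, b) + np.log(alpha) / b for b in betas)

rng = np.random.default_rng(1)
returns = rng.normal(loc=1.0, scale=0.5, size=100_000)
print("mean return    :", returns.mean())
print("CVaR (5%)      :", cvar(returns))
print("entropic (b=2) :", entropic_risk(returns, beta=2.0))
print("EVaR (5%)      :", evar(returns))
```

On the return side these summaries become increasingly conservative, so at the same level one expects EVaR <= CVaR <= mean, which is the ordering that makes EVaR and the entropic family attractive for strongly risk-averse control.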