Distributional Model Equivalence for Risk-Sensitive Reinforcement
Learning
- URL: http://arxiv.org/abs/2307.01708v2
- Date: Sun, 3 Dec 2023 20:39:10 GMT
- Title: Distributional Model Equivalence for Risk-Sensitive Reinforcement
Learning
- Authors: Tyler Kastner, Murat A. Erdogdu, Amir-massoud Farahmand
- Abstract summary: We leverage distributional reinforcement learning to introduce two new notions of model equivalence.
We demonstrate how our framework can be used to augment any model-free risk-sensitive algorithm.
- Score: 20.449497882324785
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of learning models for risk-sensitive reinforcement
learning. We theoretically demonstrate that proper value equivalence, a method
of learning models which can be used to plan optimally in the risk-neutral
setting, is not sufficient to plan optimally in the risk-sensitive setting. We
leverage distributional reinforcement learning to introduce two new notions of
model equivalence, one which is general and can be used to plan for any risk
measure, but is intractable; and a practical variation which allows one to
choose which risk measures they may plan optimally for. We demonstrate how our
framework can be used to augment any model-free risk-sensitive algorithm, and
provide both tabular and large-scale experiments to demonstrate its ability.
Related papers
- Robust Reinforcement Learning with Dynamic Distortion Risk Measures [0.0]
We devise a framework to solve robust risk-aware reinforcement learning problems.
We simultaneously account for environmental uncertainty and risk with a class of dynamic robust distortion risk measures.
We construct an actor-critic algorithm to solve this class of robust risk-aware RL problems.
arXiv Detail & Related papers (2024-09-16T08:54:59Z) - Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction [55.77015419028725]
We develop methods that permit valid control of risk when threshold and tradeoff parameters are chosen adaptively.
Our methodology supports monotone and nearly-monotone risks, but otherwise makes no distributional assumptions.
arXiv Detail & Related papers (2024-03-28T17:28:06Z) - Mind the Uncertainty: Risk-Aware and Actively Exploring Model-Based
Reinforcement Learning [26.497229327357935]
We introduce a simple but effective method for managing risk in model-based reinforcement learning with trajectory sampling.
Experiments indicate that the separation of uncertainties is essential to performing well with data-driven approaches in uncertain and safety-critical control environments.
arXiv Detail & Related papers (2023-09-11T16:10:58Z) - Minimal Value-Equivalent Partial Models for Scalable and Robust Planning
in Lifelong Reinforcement Learning [56.50123642237106]
Common practice in model-based reinforcement learning is to learn models that model every aspect of the agent's environment.
We argue that such models are not particularly well-suited for performing scalable and robust planning in lifelong reinforcement learning scenarios.
We propose new kinds of models that only model the relevant aspects of the environment, which we call "minimal value-minimal partial models"
arXiv Detail & Related papers (2023-01-24T16:40:01Z) - Deep Learning for Systemic Risk Measures [3.274367403737527]
The aim of this paper is to study a new methodological framework for systemic risk measures.
Under this new framework, systemic risk measures can be interpreted as the minimal amount of cash that secures the aggregated system.
Deep learning is increasingly receiving attention in financial modelings and risk management.
arXiv Detail & Related papers (2022-07-02T05:01:19Z) - Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement
Learning [0.0]
We develop an efficient approach to estimate a class of dynamic spectral risk measures with deep neural networks.
We also develop a risk-sensitive actor-critic algorithm that uses full episodes and does not require any additional nested transitions.
arXiv Detail & Related papers (2022-06-29T14:11:15Z) - Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
arXiv Detail & Related papers (2022-05-10T19:40:52Z) - Trust but Verify: Assigning Prediction Credibility by Counterfactual
Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z) - Entropic Risk Constrained Soft-Robust Policy Optimization [12.362670630646805]
It is important in high-stakes domains to quantify and manage risk induced by model uncertainties.
We propose an entropic risk constrained policy gradient and actor-critic algorithms that are risk-averse to the model uncertainty.
arXiv Detail & Related papers (2020-06-20T23:48:28Z) - Learning Bounds for Risk-sensitive Learning [86.50262971918276]
In risk-sensitive learning, one aims to find a hypothesis that minimizes a risk-averse (or risk-seeking) measure of loss.
We study the generalization properties of risk-sensitive learning schemes whose optimand is described via optimized certainty equivalents.
arXiv Detail & Related papers (2020-06-15T05:25:02Z) - SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.