One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning
- URL: http://arxiv.org/abs/2212.00124v3
- Date: Mon, 30 Oct 2023 15:17:20 GMT
- Title: One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning
- Authors: Marc Rigter, Bruno Lacerda, Nick Hawes
- Abstract summary: We propose risk-sensitivity as a mechanism to jointly address distributional shift and environment stochasticity.
Risk-aversion to epistemic uncertainty prevents distributional shift; risk-aversion to aleatoric uncertainty discourages actions that may result in poor outcomes due to environment stochasticity.
Our experiments show that our algorithm achieves competitive performance on deterministic benchmarks and outperforms existing approaches for risk-sensitive objectives in stochastic domains.
- Score: 25.218430053391884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Offline reinforcement learning (RL) is suitable for safety-critical domains
where online exploration is too costly or dangerous. In such safety-critical
settings, decision-making should take into consideration the risk of
catastrophic outcomes. In other words, decision-making should be
risk-sensitive. Previous works on risk in offline RL combine offline RL
techniques (to avoid distributional shift) with risk-sensitive RL
algorithms (to achieve risk-sensitivity). In this work, we propose
risk-sensitivity as a mechanism to jointly address both of these issues. Our
model-based approach is risk-averse to both epistemic and aleatoric
uncertainty. Risk-aversion to epistemic uncertainty prevents distributional
shift, as areas not covered by the dataset have high epistemic uncertainty.
Risk-aversion to aleatoric uncertainty discourages actions that may result in
poor outcomes due to environment stochasticity. Our experiments show that our
algorithm achieves competitive performance on deterministic benchmarks, and
outperforms existing approaches for risk-sensitive objectives in stochastic
domains.
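To make the mechanism concrete, here is a minimal sketch (illustrative only, not the authors' implementation) of how both kinds of uncertainty can feed one risk-averse value estimate: rollouts from an ensemble of learned dynamics models capture epistemic uncertainty through inter-model disagreement and aleatoric uncertainty through within-model spread, and a CVaR over the pooled returns penalizes both at once.

    import numpy as np

    def risk_averse_value(ensemble_returns, alpha=0.1):
        # ensemble_returns: (n_models, n_samples) returns from imagined
        # rollouts; across-model spread ~ epistemic uncertainty,
        # within-model spread ~ aleatoric (environment) stochasticity.
        # alpha: CVaR level; smaller = more risk-averse.
        returns = ensemble_returns.reshape(-1)    # pool both uncertainty sources
        cutoff = np.quantile(returns, alpha)      # the alpha-quantile (VaR)
        return returns[returns <= cutoff].mean()  # CVaR: mean of the worst tail

    # 5 hypothetical dynamics models, 100 imagined rollouts each
    rng = np.random.default_rng(0)
    print(risk_averse_value(rng.normal(1.0, 0.5, size=(5, 100))))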
Related papers
- Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning [19.292214425524303]
We study risk-sensitive reinforcement learning (RL), a field crucial for decision-making in scenarios where uncertainty must be managed and potential adverse outcomes minimized.
Our work focuses on applying the entropic risk measure to RL problems.
We center on the linear Markov Decision Process (MDP) setting, a well-regarded theoretical framework that has yet to be examined from a risk-sensitive standpoint.
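For reference, the entropic risk measure of a return R with parameter beta > 0 is rho_beta(R) = -(1/beta) log E[exp(-beta R)]; a minimal empirical estimator (a sketch, not the paper's code):

    import numpy as np

    def entropic_risk(returns, beta):
        # rho_beta(R) = -(1/beta) * log E[exp(-beta * R)]; beta > 0 is
        # risk-averse, and beta -> 0 recovers the plain expectation.
        z = -beta * np.asarray(returns, dtype=float)
        m = z.max()  # logsumexp trick for numerical stability
        return -(m + np.log(np.mean(np.exp(z - m)))) / beta

    print(entropic_risk([1.0, 1.0, -5.0], beta=0.5))  # ~ -2.99, well below the mean -1.0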
arXiv Detail & Related papers (2024-07-10T13:09:52Z)
- Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction [55.77015419028725]
We develop methods that permit valid control of risk when threshold and tradeoff parameters are chosen adaptively.
Our methodology supports monotone and nearly-monotone risks, but otherwise makes no distributional assumptions.
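A common recipe for this kind of threshold-based risk control (a generic sketch of the idea, not this paper's exact adaptive procedure): when the risk is monotone in the threshold, scan thresholds and keep the last one whose empirical risk stays within budget.

    def calibrate_threshold(risk_at, thresholds, budget=0.1):
        # risk_at(lam): empirical risk estimate at threshold lam, assumed
        # (nearly) monotone nondecreasing in lam.
        chosen = None
        for lam in sorted(thresholds):
            if risk_at(lam) <= budget:
                chosen = lam  # still within the risk budget; keep relaxing
            else:
                break         # monotonicity: larger thresholds only do worse
        return chosen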
arXiv Detail & Related papers (2024-03-28T17:28:06Z)
- Uncertainty-aware Distributional Offline Reinforcement Learning [26.34178581703107]
Offline reinforcement learning (RL) presents distinct challenges as it relies solely on observational data.
We propose an uncertainty-aware distributional offline RL method to simultaneously address both uncertainty and environmental stochasticity.
Our method is rigorously evaluated through comprehensive experiments in both risk-sensitive and risk-neutral benchmarks, demonstrating its superior performance.
arXiv Detail & Related papers (2024-03-26T12:28:04Z)
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC), that can be applied to either risk-seeking or risk-averse policy optimization.
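The risk-seeking/risk-averse switch can be pictured as a sign choice on an uncertainty term (an illustrative one-liner, not QU-SAC's actual actor update):

    def adjusted_value(q_mean, q_variance, beta):
        # beta > 0: optimistic / risk-seeking (uncertainty as a bonus);
        # beta < 0: pessimistic / risk-averse (uncertainty as a penalty).
        # q_variance: epistemic variance of values, e.g. from the UBE solution.
        return q_mean + beta * q_variance ** 0.5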
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
- RiskQ: Risk-sensitive Multi-Agent Reinforcement Learning Value Factorization [49.26510528455664]
We introduce the Risk-sensitive Individual-Global-Max (RIGM) principle as a generalization of the Individual-Global-Max (IGM) and Distributional IGM (DIGM) principles.
Through extensive experiments, we show that RiskQ can obtain promising performance.
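Informally, RIGM replaces IGM's argmax-consistency over expected values with argmax-consistency under a risk metric psi: the jointly greedy action should coincide with each agent's individually greedy action. A toy two-agent check (names and setup hypothetical):

    def rigm_consistent(joint_samples, agent_samples, psi):
        # joint_samples[a1][a2]: return samples for joint action (a1, a2)
        # agent_samples[i][a]:   utility samples for agent i's action a
        # psi: risk metric mapping samples -> scalar (CVaR, VaR, mean, ...)
        n1, n2 = len(agent_samples[0]), len(agent_samples[1])
        joint_greedy = max(((a1, a2) for a1 in range(n1) for a2 in range(n2)),
                           key=lambda a: psi(joint_samples[a[0]][a[1]]))
        local_greedy = tuple(max(range(len(agent_samples[i])),
                                 key=lambda a: psi(agent_samples[i][a]))
                             for i in range(2))
        return joint_greedy == local_greedy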
arXiv Detail & Related papers (2023-11-03T07:18:36Z)
- Distributional Reinforcement Learning with Online Risk-awareness Adaption [5.363478475460403]
We introduce a novel framework, Distributional RL with Online Risk Adaption (DRL-ORA).
DRL-ORA dynamically selects the epistemic risk levels by solving a total variation minimization problem online.
We show multiple classes of tasks where DRL-ORA outperforms existing methods that rely on a fixed or manually predetermined risk level.
arXiv Detail & Related papers (2023-10-08T14:32:23Z)
- Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback [57.6775169085215]
Risk-sensitive reinforcement learning aims to optimize policies that balance the expected reward and risk.
We present a novel framework that employs an Iterated Conditional Value-at-Risk (CVaR) objective under both linear and general function approximations.
We propose provably sample-efficient algorithms for this Iterated CVaR RL and provide rigorous theoretical analysis.
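The distinguishing feature of the Iterated CVaR objective is that the risk operator is applied at every Bellman backup rather than once over the whole return; a hedged sketch of one tabular backup at a fixed state-action pair:

    import numpy as np

    def iterated_cvar_backup(rewards, next_values, alpha, gamma=0.99):
        # rewards, next_values: samples of one-step reward and next-state
        # value drawn from the transition at a fixed (s, a).
        targets = np.asarray(rewards) + gamma * np.asarray(next_values)
        cutoff = np.quantile(targets, alpha)
        return targets[targets <= cutoff].mean()  # CVaR of the one-step
                                                  # targets, re-applied at
                                                  # every horizon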
arXiv Detail & Related papers (2023-07-06T08:14:54Z)
- RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk [28.811725782388688]
We propose and analyze a new framework to jointly model the risk associated with uncertainties in finite-horizon and discounted infinite-horizon MDPs.
We show that when the risk-aversion is defined using either EVaR or the entropic risk, the optimal policy in RASR can be computed efficiently using a new dynamic program formulation with a time-dependent risk level.
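Computationally, the resulting dynamic program looks like ordinary value iteration with an entropic-risk backup whose parameter depends on the time step (a sketch under those assumptions; the specific risk-level schedule is the paper's contribution):

    import numpy as np

    def entropic_value_backup(P, R, V_next, beta_t, gamma=0.99):
        # P: transition probabilities, shape (A, S); R: rewards, shape (A, S)
        # V_next: next-step values, shape (S,)
        # beta_t: time-dependent risk-aversion level (beta_t > 0; larger = more averse)
        targets = R + gamma * V_next  # one-step targets, shape (A, S)
        # per-action entropic risk: -(1/beta) log E[exp(-beta * target)]
        risk = -np.log((P * np.exp(-beta_t * targets)).sum(axis=1)) / beta_t
        return risk.max()             # greedy over actions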
arXiv Detail & Related papers (2022-09-09T00:34:58Z)
- Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
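The soft risk mechanism can be sketched as a schedule on the risk level: start (nearly) risk-neutral so that low-return but informative trajectories still produce learning signal, then anneal toward the target level (the linear schedule below is illustrative):

    def soft_risk_level(step, total_steps, target_alpha=0.05):
        # Anneal a CVaR level from alpha = 1 (risk-neutral) down to
        # target_alpha; starting risk-neutral sidesteps the local-optimum
        # barrier where a fully risk-averse learner discards exactly the
        # trajectories it still needs to learn from.
        frac = min(step / total_steps, 1.0)
        return 1.0 + frac * (target_alpha - 1.0)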
arXiv Detail & Related papers (2022-05-10T19:40:52Z)
- Addressing Inherent Uncertainty: Risk-Sensitive Behavior Generation for Automated Driving using Distributional Reinforcement Learning [0.0]
We propose a two-step approach for risk-sensitive behavior generation for self-driving vehicles.
First, we learn an optimal policy in an uncertain environment with Deep Distributional Reinforcement Learning.
During execution, the optimal risk-sensitive action is selected by applying established risk criteria.
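Because risk enters only at execution time, the deployed policy just re-scores the learned return quantiles with a risk criterion before acting (a sketch with hypothetical quantile estimates, using CVaR as the criterion):

    import numpy as np

    def select_action(quantiles, alpha=0.25):
        # quantiles: (n_actions, n_quantiles) learned return quantiles per
        # action; score each action by the mean of its worst alpha-fraction
        # of quantiles (a discrete CVaR).
        n_tail = max(1, int(alpha * quantiles.shape[1]))
        tail_mean = np.sort(quantiles, axis=1)[:, :n_tail].mean(axis=1)
        return int(np.argmax(tail_mean))

    # action 0 has the higher mean but a heavy left tail; action 1 is safer
    q = np.array([[-8.0, 1.0, 2.0, 3.0], [0.5, 0.8, 1.0, 1.2]])
    print(select_action(q, alpha=0.25))  # -> 1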
arXiv Detail & Related papers (2021-02-05T11:45:12Z)
- Learning Bounds for Risk-sensitive Learning [86.50262971918276]
In risk-sensitive learning, one aims to find a hypothesis that minimizes a risk-averse (or risk-seeking) measure of loss.
We study the generalization properties of risk-sensitive learning schemes whose optimand is described via optimized certainty equivalents.
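For reference, the optimized certainty equivalent (OCE) of a loss X under a nondecreasing convex disutility u, in the loss convention, is the following, with CVaR as the best-known special case (the Rockafellar-Uryasev formula):

    % OCE of a loss X with disutility u
    \mathrm{OCE}_u(X) = \inf_{\lambda \in \mathbb{R}}
        \left\{ \lambda + \mathbb{E}\left[ u(X - \lambda) \right] \right\}

    % CVaR at level alpha is the case u(t) = \max(t, 0) / \alpha:
    \mathrm{CVaR}_\alpha(X) = \inf_{\lambda \in \mathbb{R}}
        \left\{ \lambda + \tfrac{1}{\alpha}\, \mathbb{E}\left[ (X - \lambda)_+ \right] \right\}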
arXiv Detail & Related papers (2020-06-15T05:25:02Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.