Risk-sensitive Actor-Critic with Static Spectral Risk Measures for Online and Offline Reinforcement Learning
- URL: http://arxiv.org/abs/2507.03900v1
- Date: Sat, 05 Jul 2025 04:41:54 GMT
- Title: Risk-sensitive Actor-Critic with Static Spectral Risk Measures for Online and Offline Reinforcement Learning
- Authors: Mehrdad Moghimi, Hyejin Ku
- Abstract summary: We propose a novel framework for optimizing static Spectral Risk Measures (SRM). Our algorithms consistently outperform existing risk-sensitive methods in both online and offline environments across diverse domains.
- Score: 4.8342038441006805
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The development of Distributional Reinforcement Learning (DRL) has introduced a natural way to incorporate risk sensitivity into value-based and actor-critic methods by employing risk measures other than expectation in the value function. While this approach is widely adopted in many online and offline RL algorithms due to its simplicity, the naive integration of risk measures often results in suboptimal policies. This limitation can be particularly harmful in scenarios where the need for effective risk-sensitive policies is critical and worst-case outcomes carry severe consequences. To address this challenge, we propose a novel framework for optimizing static Spectral Risk Measures (SRM), a flexible family of risk measures that generalizes objectives such as CVaR and Mean-CVaR, and enables the tailoring of risk preferences. Our method is applicable to both online and offline RL algorithms. We establish theoretical guarantees by proving convergence in the finite state-action setting. Moreover, through extensive empirical evaluations, we demonstrate that our algorithms consistently outperform existing risk-sensitive methods in both online and offline environments across diverse domains.
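For orientation, the textbook definition of a spectral risk measure in the return convention reads as follows, where $F_Z^{-1}$ is the quantile function of the return $Z$; the spectrum notation ($\sigma$, $\alpha$, $\lambda$) is standard usage and not necessarily the paper's own:

```latex
% Spectral risk measure of a return Z with spectrum sigma, where sigma >= 0
% is non-increasing on [0,1] and integrates to 1 (risk aversion = more weight
% on low quantiles of the return distribution).
\[
  \mathrm{SRM}_\sigma(Z) \;=\; \int_0^1 \sigma(u)\, F_Z^{-1}(u)\,\mathrm{d}u .
\]
% CVaR and Mean-CVaR arise as special cases of the spectrum:
\[
  \sigma_{\mathrm{CVaR}_\alpha}(u) = \tfrac{1}{\alpha}\,\mathbf{1}\{u \le \alpha\},
  \qquad
  \sigma_{\text{Mean-CVaR}}(u) = \lambda\,\tfrac{1}{\alpha}\,\mathbf{1}\{u \le \alpha\} + (1-\lambda),
  \quad \lambda \in [0,1].
\]
```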
Related papers
- Beyond CVaR: Leveraging Static Spectral Risk Measures for Enhanced Decision-Making in Distributional Reinforcement Learning [4.8342038441006805]
In domains such as finance, healthcare, and robotics, managing worst-case scenarios is critical. Distributional Reinforcement Learning (DRL) provides a natural framework to incorporate risk sensitivity into decision-making processes. We present a novel DRL algorithm with convergence guarantees that optimizes for a broader class of static Spectral Risk Measures (SRM).
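A minimal sketch of how a static SRM objective can be estimated from sampled returns, using an Acerbi-style order-statistics estimator; the function names and example spectra are illustrative and not taken from the paper's code:

```python
import numpy as np

def srm_estimate(returns, spectrum):
    """Spectral risk estimate: weight sorted return samples by the spectrum
    evaluated at quantile-level midpoints (Acerbi-style estimator)."""
    z = np.sort(np.asarray(returns, dtype=float))  # order statistics, ascending
    u = (np.arange(len(z)) + 0.5) / len(z)         # quantile-level midpoints
    w = spectrum(u)
    return float(np.dot(w / w.sum(), z))           # normalize weights to sum to 1

# CVaR_0.1: uniform weight on the lowest 10% of outcomes.
cvar_spectrum = lambda u, alpha=0.1: (u <= alpha) / alpha
# Mean-CVaR mixture: weight lam on the CVaR component, (1 - lam) on the mean.
mix_spectrum = lambda u, alpha=0.1, lam=0.5: lam * (u <= alpha) / alpha + (1.0 - lam)

samples = np.random.default_rng(0).normal(size=100_000)
print(srm_estimate(samples, cvar_spectrum))  # approx. CVaR_0.1 of N(0,1), about -1.75
print(srm_estimate(samples, mix_spectrum))   # approx. 0.5 * (-1.75) + 0.5 * 0 = -0.88
```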
arXiv Detail & Related papers (2025-01-03T20:25:41Z) - Risk-Averse Certification of Bayesian Neural Networks [70.44969603471903]
We propose a Risk-Averse Certification framework for Bayesian neural networks called RAC-BNN. Our method leverages sampling and optimisation to compute a sound approximation of the output set of a BNN. We validate RAC-BNN on a range of regression and classification benchmarks and compare its performance with a state-of-the-art method.
arXiv Detail & Related papers (2024-11-29T14:22:51Z) - Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning [19.292214425524303]
We study risk-sensitive reinforcement learning (RL), a crucial field due to its ability to enhance decision-making in scenarios where it is essential to manage uncertainty and minimize potential adverse outcomes.
Our work focuses on applying the entropic risk measure to RL problems.
We center on the linear Markov Decision Process (MDP) setting, a well-regarded theoretical framework that has yet to be examined from a risk-sensitive standpoint.
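For reference, the entropic risk measure of a return $Z$ with risk-aversion parameter $\beta > 0$ is usually written as (convention assumed):

```latex
\[
  \rho_\beta(Z) \;=\; -\frac{1}{\beta}\,\log \mathbb{E}\!\left[e^{-\beta Z}\right]
  \;\approx\; \mathbb{E}[Z] \;-\; \frac{\beta}{2}\,\mathrm{Var}[Z]
  \quad \text{for small } \beta .
\]
```

Larger $\beta$ penalizes return variability more heavily, and the risk-neutral objective is recovered in the limit $\beta \to 0$.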
arXiv Detail & Related papers (2024-07-10T13:09:52Z) - A Reductions Approach to Risk-Sensitive Reinforcement Learning with Optimized Certainty Equivalents [44.09686403685058]
We study risk-sensitive RL where the goal is to learn a history-dependent policy that optimizes some risk measure of cumulative rewards. We propose two meta-algorithms: one grounded in optimism and another based on policy gradients. We empirically show that our algorithms learn the optimal history-dependent policy in a proof-of-concept MDP.
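For context, the optimized certainty equivalent of a return $Z$ under a concave utility $u$ with $u(0) = 0$ is typically defined as (standard definition, notation ours):

```latex
\[
  \mathrm{OCE}_u(Z) \;=\; \sup_{\lambda \in \mathbb{R}}
  \big\{ \lambda + \mathbb{E}\!\left[u(Z - \lambda)\right] \big\}.
\]
% u(t) = t recovers the mean; u(t) = -(1/alpha) max(-t, 0) recovers CVaR_alpha
% via the Rockafellar-Uryasev representation.
```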
arXiv Detail & Related papers (2024-03-10T21:45:12Z) - Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation [54.61816424792866]
We introduce a general framework for Risk-Sensitive Distributional Reinforcement Learning (RS-DisRL), with static Lipschitz Risk Measures (LRM) and general function approximation.
We design two innovative meta-algorithms: RS-DisRL-M, a model-based strategy for model-based function approximation, and RS-DisRL-V, a model-free approach for general value function approximation.
arXiv Detail & Related papers (2024-02-28T08:43:18Z) - Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback [57.6775169085215]
Risk-sensitive reinforcement learning aims to optimize policies that balance the expected reward and risk.
We present a novel framework that employs an Iterated Conditional Value-at-Risk (CVaR) objective under both linear and general function approximations.
We propose provably sample-efficient algorithms for this Iterated CVaR RL and provide rigorous theoretical analysis.
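Roughly, the iterated (nested) CVaR objective applies CVaR at every transition rather than once to the whole return, giving a time-consistent dynamic program of the form below (our notation, finite horizon):

```latex
\[
  Q_h(s,a) \;=\; r(s,a) \;+\;
  \mathrm{CVaR}_\alpha^{\,s' \sim P(\cdot \mid s,a)}\!\big( V_{h+1}(s') \big),
  \qquad
  V_h(s) \;=\; \max_{a}\, Q_h(s,a).
\]
```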
arXiv Detail & Related papers (2023-07-06T08:14:54Z) - On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures [17.668631383216233]
Risk-sensitive reinforcement learning (RL) has become a popular tool for controlling the risk of uncertain outcomes. It remains unclear if Policy Gradient (PG) methods enjoy the same global convergence guarantees as in the risk-neutral case.
arXiv Detail & Related papers (2023-01-26T04:35:28Z) - One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning [25.218430053391884]
We propose risk-sensitivity as a mechanism to jointly address both distributional shift and aleatoric risk in model-based offline RL.
Risk-aversion to aleatoric uncertainty discourages actions that may result in poor outcomes due to environment stochasticity.
Our experiments show that our algorithm achieves competitive performance on deterministic benchmarks.
arXiv Detail & Related papers (2022-11-30T21:24:11Z) - RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk [28.811725782388688]
We propose and analyze a new framework to jointly model the risk associated with uncertainties in finite-horizon and discounted infinite-horizon MDPs.
We show that when the risk-aversion is defined using either EVaR or the entropic risk, the optimal policy in RASR can be computed efficiently using a new dynamic program formulation with a time-dependent risk level.
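For reference, the entropic value-at-risk of a return $Z$ at level $\alpha \in (0, 1]$ can be stated in the return convention as follows (one common convention; loss-based statements flip the signs):

```latex
\[
  \mathrm{EVaR}_\alpha(Z) \;=\; \sup_{\beta > 0}
  \frac{1}{\beta}\,\log\frac{\alpha}{\mathbb{E}\!\left[e^{-\beta Z}\right]}.
\]
% alpha = 1 recovers the expectation; alpha -> 0 approaches the worst case;
% each fixed beta gives an entropic risk shifted by (log alpha) / beta.
```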
arXiv Detail & Related papers (2022-09-09T00:34:58Z) - Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
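As an illustration only (not the paper's implementation), the two ingredients the summary names, tail-weighted policy-gradient updates and a "soft" risk schedule, might be sketched for a CVaR instance of the setup as follows; the schedule shape and all names are assumptions:

```python
import numpy as np

def soft_alpha(step, total_steps, alpha_target=0.05):
    """Illustrative soft-risk schedule: start risk-neutral (alpha = 1) so early
    updates see all returns, then anneal toward the target tail level."""
    frac = min(1.0, step / (0.5 * total_steps))  # reach the target halfway in
    return 1.0 + frac * (alpha_target - 1.0)

def cvar_pg_weights(returns, alpha):
    """CVaR-PG trajectory weights: only episodes in the lower alpha-tail of the
    empirical return distribution receive gradient signal."""
    returns = np.asarray(returns, dtype=float)
    var = np.quantile(returns, alpha)            # empirical alpha-quantile (VaR)
    tail = returns <= var
    return tail / max(tail.sum(), 1)             # normalized indicator weights

# Example: early on every episode contributes; later only the worst tail does.
rets = np.random.default_rng(1).normal(size=8)
print(cvar_pg_weights(rets, soft_alpha(0, 1000)))     # uniform weights (alpha = 1)
print(cvar_pg_weights(rets, soft_alpha(1000, 1000)))  # mass on the worst episode
```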
arXiv Detail & Related papers (2022-05-10T19:40:52Z) - Risk-Averse Offline Reinforcement Learning [46.383648750385575]
Training Reinforcement Learning (RL) agents in high-stakes applications might be prohibitive due to the risk associated with exploration.
We present the Offline Risk-Averse Actor-Critic (O-RAAC), a model-free RL algorithm that is able to learn risk-averse policies in a fully offline setting.
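A hedged sketch of the general recipe behind such methods, an actor maximizing the CVaR of a distributional critic's quantile estimates; the uniform-quantile parameterization and names below are assumptions, not O-RAAC's exact architecture:

```python
import numpy as np

def cvar_from_quantiles(quantiles, alpha=0.1):
    """Estimate CVaR_alpha from a distributional critic's quantile estimates at
    uniform levels tau_i = (i + 0.5)/N: average the quantiles with tau_i <= alpha."""
    q = np.sort(np.asarray(quantiles, dtype=float))
    taus = (np.arange(len(q)) + 0.5) / len(q)
    tail = q[taus <= alpha]
    return float(tail.mean()) if tail.size else float(q[0])

# Example with 50 evenly spaced quantile estimates of a return distribution.
print(cvar_from_quantiles(np.linspace(-3.0, 3.0, 50), alpha=0.1))
```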
arXiv Detail & Related papers (2021-02-10T10:27:49Z)