Risk-Averse Learning with Varying Risk Levels
- URL: http://arxiv.org/abs/2512.22986v1
- Date: Sun, 28 Dec 2025 16:09:29 GMT
- Title: Risk-Averse Learning with Varying Risk Levels
- Authors: Siyi Wang, Zifan Wang, Karl H. Johansson
- Abstract summary: This work investigates risk-averse online optimization in dynamic environments with varying risk levels. To capture the dynamics of the environment and risk levels, we employ the function variation metric and introduce a novel risk-level variation metric. We develop risk-averse learning algorithms with a limited sampling budget and analyze their dynamic regret bounds in terms of function variation, risk-level variation, and the total number of samples.
- Score: 8.646001948552264
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In safety-critical decision-making, the environment may evolve over time, and the learner adjusts its risk level accordingly. This work investigates risk-averse online optimization in dynamic environments with varying risk levels, employing Conditional Value-at-Risk (CVaR) as the risk measure. To capture the dynamics of the environment and risk levels, we employ the function variation metric and introduce a novel risk-level variation metric. Two information settings are considered: a first-order scenario, where the learner observes both function values and their gradients; and a zeroth-order scenario, where only function evaluations are available. For both cases, we develop risk-averse learning algorithms with a limited sampling budget and analyze their dynamic regret bounds in terms of function variation, risk-level variation, and the total number of samples. The regret analysis demonstrates the adaptability of the algorithms in non-stationary and risk-sensitive settings. Finally, numerical experiments are presented to demonstrate the efficacy of the methods.
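As background for the abstract above, the sketch below illustrates two ingredients it names: empirical CVaR estimation and a one-point zeroth-order gradient estimate, the kind of estimator the zeroth-order setting relies on when only function evaluations are available. This is a minimal sketch under standard assumptions, not the paper's algorithm; the names `cvar_estimate` and `one_point_gradient` are hypothetical.

```python
import numpy as np

def cvar_estimate(losses: np.ndarray, alpha: float) -> float:
    """Empirical CVaR_alpha of a loss sample: the mean of the worst
    (1 - alpha) fraction of losses, i.e. the expected loss beyond the
    alpha-quantile (the Value-at-Risk)."""
    var = np.quantile(losses, alpha)   # Value-at-Risk threshold
    return float(losses[losses >= var].mean())

def one_point_gradient(f, x: np.ndarray, delta: float, rng) -> np.ndarray:
    """One-point zeroth-order gradient estimate of f at x: a single
    evaluation at a randomly perturbed point, usable when gradients
    cannot be observed."""
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)             # random direction on the unit sphere
    return (x.size / delta) * f(x + delta * u) * u
```

In the first-order setting the learner uses observed gradients directly; the zeroth-order setting must rely on estimators like the one above, which is why the regret bounds also depend on the total sampling budget.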
Related papers
- Constrained Language Model Policy Optimization via Risk-aware Stepwise Alignment [49.2305683068875]
We propose Risk-aware Stepwise Alignment (RSA), a novel alignment method that incorporates risk awareness into the policy optimization process. RSA mitigates risks induced by excessive model shift away from a reference policy, and it explicitly suppresses low-probability yet high-impact harmful behaviors. Experimental results demonstrate that our method achieves high levels of helpfulness while ensuring strong safety.
arXiv Detail & Related papers (2025-12-30T14:38:02Z) - RADAR: A Risk-Aware Dynamic Multi-Agent Framework for LLM Safety Evaluation via Role-Specialized Collaboration [81.38705556267917]
Existing safety evaluation methods for large language models (LLMs) suffer from inherent limitations. We introduce a theoretical framework that reconstructs the underlying risk concept space. We propose RADAR, a multi-agent collaborative evaluation framework.
arXiv Detail & Related papers (2025-09-28T09:35:32Z) - Risk-Averse Reinforcement Learning with Itakura-Saito Loss [63.620958078179356]
Risk-averse agents choose policies that minimize risk, occasionally sacrificing expected value. We introduce a numerically stable and mathematically sound loss function based on the Itakura-Saito divergence for learning state-value and action-value functions. In the experimental section, we explore multiple scenarios, some with known analytical solutions, and show that the considered loss function outperforms the alternatives.
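The snippet does not state the loss in closed form; as background, here is a minimal sketch of the Itakura-Saito divergence the loss is built on. The divergence itself is standard, while its exact role in the paper's loss is not reproduced here.

```python
import numpy as np

def itakura_saito(p: np.ndarray, q: np.ndarray) -> np.ndarray:
    """Itakura-Saito divergence d_IS(p || q) = p/q - log(p/q) - 1.
    Nonnegative and zero iff p == q; defined only for strictly
    positive inputs."""
    r = p / q
    return r - np.log(r) - 1.0
```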
arXiv Detail & Related papers (2025-05-22T17:18:07Z) - Risk-averse learning with delayed feedback [17.626195546400247]
Delayed feedback makes it challenging to assess and manage risk effectively. We develop two risk-averse learning algorithms that rely on one-point and two-point zeroth-order optimization approaches.
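The snippet names one-point and two-point zeroth-order approaches; a generic sketch of the two-point estimator (complementing the one-point sketch earlier) is shown below. This is the textbook form, not necessarily the paper's exact update.

```python
import numpy as np

def two_point_gradient(f, x: np.ndarray, delta: float, rng) -> np.ndarray:
    """Two-point zeroth-order gradient estimate: a symmetric difference
    of two evaluations along a random unit direction. Its variance is
    typically much lower than the one-point variant, at the cost of a
    second query per step, which matters when feedback arrives late."""
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)
    return (x.size / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u
```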
arXiv Detail & Related papers (2024-09-25T12:32:22Z) - Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty [5.710971447109951]
This paper studies continuous-time risk-sensitive reinforcement learning (RL).
I highlight that the conventional policy gradient representation is inadequate for risk-sensitive problems due to the nonlinear nature of quadratic variation.
I prove the convergence of the proposed algorithm for Merton's investment problem and quantify the impact of temperature parameter on the behavior of the learning procedure.
arXiv Detail & Related papers (2024-04-19T03:05:41Z) - Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC), that can be applied to either risk-seeking or risk-averse policy optimization.
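The snippet describes the UBE only in words; the tabular backup below sketches the generic shape of an uncertainty Bellman recursion. It is an illustrative assumption: the paper's exact equation may differ, and `ube_backup` is a hypothetical name.

```python
import numpy as np

def ube_backup(local_unc, P, policy, U, gamma):
    """One backup of a generic uncertainty Bellman equation:
    U(s,a) <- u(s,a) + gamma^2 * E_{s'~P, a'~pi}[U(s',a')],
    where u(s,a) is the local epistemic uncertainty. The gamma^2
    factor appears because variances, not values, are propagated.
    Shapes: local_unc, U: (S, A); P: (S, A, S); policy: (S, A)."""
    next_u = np.einsum("sap,pb,pb->sa", P, policy, U)
    return local_unc + gamma ** 2 * next_u
```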
arXiv Detail & Related papers (2023-12-07T15:55:58Z) - DRL-ORA: Distributional Reinforcement Learning with Online Risk Adaption [9.191326295161725]
We propose a new framework, Distributional RL with Online Risk Adaptation (DRL-ORA). The framework unifies the existing variants of risk adaptation approaches and offers better explainability and flexibility. We show that DRL-ORA outperforms existing methods that rely on fixed risk levels or manually designed risk-level adaptation in multiple classes of tasks.
arXiv Detail & Related papers (2023-10-08T14:32:23Z) - Is Risk-Sensitive Reinforcement Learning Properly Resolved? [54.00107408956307]
We propose a novel algorithm, namely Trajectory Q-Learning (TQL), for RSRL problems with provable policy improvement. Based on our new learning architecture, we can introduce a general and practical implementation for different risk measures to learn disparate risk-sensitive policies.
arXiv Detail & Related papers (2023-07-02T11:47:21Z) - Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning [0.0]
We develop an efficient approach to estimate a class of dynamic spectral risk measures with deep neural networks.
We also develop a risk-sensitive actor-critic algorithm that uses full episodes and does not require any additional nested transitions.
arXiv Detail & Related papers (2022-06-29T14:11:15Z) - Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
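The abstract does not spell the soft risk mechanism out; one plausible reading is a schedule that trains risk-neutrally at first and anneals the CVaR tail fraction toward its target, roughly as sketched below. This schedule is an assumption for illustration, not the paper's exact rule.

```python
def soft_risk_level(step: int, total_steps: int, target_tail: float) -> float:
    """Anneal the fraction of worst-case returns the CVaR objective
    averages over, from 1.0 (risk-neutral: all returns) down to a small
    target fraction (risk-averse: only the worst returns). Starting
    risk-neutral lets the agent find rewarding behavior before the risk
    objective prunes it, bypassing the local-optimum barrier above."""
    frac = min(1.0, step / (0.5 * total_steps))  # warm-up over half of training
    return 1.0 + frac * (target_tail - 1.0)      # linear: 1.0 -> target_tail
```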
arXiv Detail & Related papers (2022-05-10T19:40:52Z) - Adaptive Risk Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning [17.940958199767234]
We present a distributional reinforcement learning framework to learn adaptive risk tendency policies.
We show our algorithm can adjust its risk sensitivity on the fly in both simulation and real-world experiments.
arXiv Detail & Related papers (2022-03-28T13:39:58Z) - Automatic Risk Adaptation in Distributional Reinforcement Learning [26.113528145137497]
The use of Reinforcement Learning (RL) agents in practical applications requires the consideration of suboptimal outcomes.
This is especially important in safety-critical environments, where errors can lead to high costs or damage.
We show reduced failure rates by up to a factor of 7 and improved generalization performance by up to 14% compared to both risk-aware and risk-agnostic agents.
arXiv Detail & Related papers (2021-06-11T11:31:04Z) - Learning Bounds for Risk-sensitive Learning [86.50262971918276]
In risk-sensitive learning, one aims to find a hypothesis that minimizes a risk-averse (or risk-seeking) measure of loss.
We study the generalization properties of risk-sensitive learning schemes whose optimand is described via optimized certainty equivalents.
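Optimized certainty equivalents have a simple variational form, OCE(X) = inf over lambda of {lambda + E[phi(X - lambda)]}, with CVaR recovered by the choice phi(t) = max(t, 0) / (1 - alpha). The sketch below evaluates an OCE numerically; the function name and the use of scipy are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def oce(losses: np.ndarray, phi) -> float:
    """Optimized certainty equivalent of a loss sample:
    OCE(X) = inf_lambda { lambda + E[phi(X - lambda)] }."""
    objective = lambda lam: lam + np.mean(phi(losses - lam))
    return minimize_scalar(objective).fun

# CVaR at level 0.9 via its OCE representation: phi(t) = max(t, 0) / (1 - 0.9)
rng = np.random.default_rng(0)
losses = rng.standard_normal(10_000)
cvar_90 = oce(losses, lambda t: np.maximum(t, 0.0) / (1 - 0.9))
```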
arXiv Detail & Related papers (2020-06-15T05:25:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and accepts no responsibility for any consequences of its use.