Risk-Averse Learning by Temporal Difference Methods
- URL: http://arxiv.org/abs/2003.00780v1
- Date: Mon, 2 Mar 2020 11:48:09 GMT
- Title: Risk-Averse Learning by Temporal Difference Methods
- Authors: Umit Kose and Andrzej Ruszczynski
- Abstract summary: We consider reinforcement learning with performance evaluated by a dynamic risk measure.
We propose risk-averse counterparts of the methods of temporal differences and we prove their convergence with probability one.
- Score: 5.33024001730262
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider reinforcement learning with performance evaluated by a dynamic risk measure. We construct a projected risk-averse dynamic programming equation and study its properties. Then we propose risk-averse counterparts of the methods of temporal differences and we prove their convergence with probability one. We also perform an empirical study on a complex transportation problem.
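To make the idea concrete, here is a minimal tabular sketch of a risk-averse TD-style update in the spirit of the abstract: the expectation inside the usual TD target is replaced by a one-step coherent risk mapping, composed recursively through the value function. The cost-minimization framing, the mean-upper-semideviation mapping, and the `sample_transition` helper are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def mean_semideviation(z, kappa=0.2):
    """Empirical mean-upper-semideviation of costs:
    rho(Z) = E[Z] + kappa * E[(Z - E[Z])_+], a coherent risk measure."""
    m = z.mean()
    return m + kappa * np.maximum(z - m, 0.0).mean()

def risk_averse_td_update(V, s, sample_transition, gamma=0.95, alpha=0.1,
                          batch=32, kappa=0.2):
    """One TD-style update of V[s] toward the risk-adjusted one-step target
    rho(c + gamma * V[s']).  `sample_transition(s)` is a hypothetical helper
    that draws a (cost, next_state) pair from the MDP at state s."""
    costs, nexts = zip(*(sample_transition(s) for _ in range(batch)))
    targets = np.asarray(costs, dtype=float) + gamma * V[np.asarray(nexts)]
    V[s] += alpha * (mean_semideviation(targets, kappa) - V[s])
    return V
```

With kappa = 0 this reduces to an ordinary TD(0) update on expected costs; kappa > 0 penalizes above-average one-step outcomes, which is what makes the learned values risk-averse.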
Related papers
- HACSurv: A Hierarchical Copula-based Approach for Survival Analysis with Dependent Competing Risks [51.95824566163554]
HACSurv is a survival analysis method that learns structures and cause-specific survival functions from data with competing risks.
By capturing the dependencies between risks and censoring, HACSurv achieves better survival predictions.
arXiv Detail & Related papers (2024-10-19T18:52:18Z)
- Robust Reinforcement Learning with Dynamic Distortion Risk Measures [0.0]
We devise a framework to solve robust risk-aware reinforcement learning problems.
We simultaneously account for environmental uncertainty and risk with a class of dynamic robust distortion risk measures.
We construct an actor-critic algorithm to solve this class of robust risk-aware RL problems.
arXiv Detail & Related papers (2024-09-16T08:54:59Z)
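As a hedged illustration of the static building block behind the distortion risk measures above, the sketch below uses the standard empirical estimator: sort the samples and weight them by increments of the distortion function g. The dynamic and robust aspects of the paper are not captured here.

```python
import numpy as np

def distortion_risk(samples, g):
    """Empirical distortion risk measure rho(X) = integral of F^{-1}(u) dg(u),
    estimated by weighting the sorted samples with increments of g."""
    xs = np.sort(np.asarray(samples, dtype=float))
    n = len(xs)
    u = np.arange(n + 1) / n
    w = np.diff(g(u))           # weights g((i+1)/n) - g(i/n); they sum to 1
    return float(np.dot(w, xs))

# Example: mean of the worst 10% of returns (lower-tail CVaR)
# via the distortion g(u) = min(u / alpha, 1)
alpha = 0.1
returns = np.random.default_rng(0).standard_normal(10_000)
print(distortion_risk(returns, lambda u: np.minimum(u / alpha, 1.0)))
```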
- Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz Dynamic Risk Measures [23.46659319363579]
We present two model-based algorithms applied to Lipschitz dynamic risk measures.
Notably, our upper bounds demonstrate optimal dependencies on the number of actions and episodes.
arXiv Detail & Related papers (2023-06-04T16:24:19Z)
- Multivariate Systemic Risk Measures and Computation by Deep Learning Algorithms [63.03966552670014]
We discuss the key related theoretical aspects, with a particular focus on the fairness properties of primal optima and associated risk allocations.
The algorithms we provide allow for learning primals, optima for the dual representation and corresponding fair risk allocations.
arXiv Detail & Related papers (2023-02-02T22:16:49Z)
- Guaranteed Conservation of Momentum for Learning Particle-based Fluid Dynamics [96.9177297872723]
We present a novel method for guaranteeing linear momentum in learned physics simulations.
We enforce conservation of momentum with a hard constraint, which we realize via antisymmetrical continuous convolutional layers.
In combination, the proposed method allows us to increase the physical accuracy of the learned simulator substantially.
arXiv Detail & Related papers (2022-10-12T09:12:59Z)
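The momentum guarantee above has a simple core that can be sketched without the paper's continuous convolutional layers: if every pairwise interaction satisfies m[i, j] = -m[j, i], the per-particle updates sum to zero, so total linear momentum cannot change. The pairwise `kernel` below is a hypothetical stand-in for a learned layer.

```python
import numpy as np

def momentum_conserving_update(x, kernel):
    """Toy sketch: antisymmetrize a pairwise interaction matrix so that
    m[i, j] = -m[j, i]; the summed updates then carry zero net momentum."""
    n = len(x)
    m = np.array([[kernel(x[i], x[j]) for j in range(n)] for i in range(n)])
    m = 0.5 * (m - m.T)            # keep only the antisymmetric part
    dv = m.sum(axis=1)             # per-particle velocity change
    assert abs(dv.sum()) < 1e-9    # sum over all m[i, j] vanishes by antisymmetry
    return dv

x = np.linspace(0.0, 1.0, 5)
print(momentum_conserving_update(x, lambda a, b: a - b))
```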
- RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk [28.811725782388688]
We propose and analyze a new framework to jointly model the risk associated with uncertainties in finite-horizon and discounted infinite-horizon MDPs.
We show that when the risk-aversion is defined using either EVaR or the entropic risk, the optimal policy in RASR can be computed efficiently using a new dynamic program formulation with a time-dependent risk level.
arXiv Detail & Related papers (2022-09-09T00:34:58Z)
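For the entropic-risk case above, a hedged sketch of the corresponding dynamic program is easy to state: replace the expectation in the Bellman backup with the entropic risk of the one-step return. The sketch fixes the risk level beta and assumes a known tabular model, so it omits the paper's time-dependent risk level and its soft-robust treatment of model uncertainty.

```python
import numpy as np

def entropic_risk(values, probs, beta):
    """Entropic risk of a discrete reward distribution:
    (1 / beta) * log E[exp(beta * Z)].  For rewards, beta < 0 is risk-averse;
    beta -> 0 recovers the plain expectation (beta = 0 itself is excluded)."""
    m = np.max(beta * values)      # log-sum-exp shift for numerical stability
    return (m + np.log(np.dot(probs, np.exp(beta * values - m)))) / beta

def entropic_backup(P, R, V, gamma, beta):
    """One risk-averse Bellman backup V(s) = max_a rho_beta(R + gamma * V(s'))
    on a tabular MDP with transitions P[s, a, s'] and rewards R[s, a, s']."""
    S, A, _ = P.shape
    Q = np.array([[entropic_risk(R[s, a] + gamma * V, P[s, a], beta)
                   for a in range(A)] for s in range(S)])
    return Q.max(axis=1)

# Example: one backup on a random 3-state, 2-action MDP, beta = -1 (risk-averse)
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(3), size=(3, 2))
R = rng.normal(size=(3, 2, 3))
print(entropic_backup(P, R, np.zeros(3), gamma=0.9, beta=-1.0))
```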
- Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning [0.0]
We develop an efficient approach to estimate a class of dynamic spectral risk measures with deep neural networks.
We also develop a risk-sensitive actor-critic algorithm that uses full episodes and does not require any additional nested transitions.
arXiv Detail & Related papers (2022-06-29T14:11:15Z)
- Risk Perspective Exploration in Distributional Reinforcement Learning [10.441880303257468]
We present risk scheduling approaches that explore risk levels and optimistic behaviors from a risk perspective.
We demonstrate the performance enhancement of the DMIX algorithm using risk scheduling in a multi-agent setting.
arXiv Detail & Related papers (2022-06-28T17:37:34Z)
- Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
arXiv Detail & Related papers (2022-05-10T19:40:52Z)
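The soft risk mechanism above invites a short sketch: estimate CVaR from sampled returns, but anneal the risk level from risk-neutral (alpha = 1) toward the target so that early training is not dominated by a handful of worst-case episodes. The linear schedule below is an illustrative guess at the mechanism, not the paper's exact recipe.

```python
import numpy as np

def cvar(returns, alpha):
    """Empirical CVaR: mean of the worst alpha-fraction of sampled returns."""
    xs = np.sort(np.asarray(returns, dtype=float))
    k = max(1, int(np.ceil(alpha * len(xs))))
    return xs[:k].mean()

def soft_risk_level(step, anneal_steps, alpha_target):
    """Illustrative soft risk schedule: interpolate the CVaR level from 1.0
    (risk-neutral) down to alpha_target over the first anneal_steps updates."""
    frac = min(step / anneal_steps, 1.0)
    return 1.0 + frac * (alpha_target - 1.0)

# The objective tightens from the mean toward CVaR_0.05 as training proceeds
returns = np.random.default_rng(2).normal(size=1_000)
for step in (0, 500, 1_000):
    a = soft_risk_level(step, anneal_steps=1_000, alpha_target=0.05)
    print(step, round(a, 3), round(float(cvar(returns, a)), 3))
```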
- SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data.
We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z)
- SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content above (including all information) and is not responsible for any consequences of its use.