Deep Reinforcement Learning for Equal Risk Pricing and Hedging under
Dynamic Expectile Risk Measures
- URL: http://arxiv.org/abs/2109.04001v1
- Date: Thu, 9 Sep 2021 02:52:06 GMT
- Title: Deep Reinforcement Learning for Equal Risk Pricing and Hedging under
Dynamic Expectile Risk Measures
- Authors: Saeed Marzban, Erick Delage, Jonathan Yumeng Li
- Abstract summary: We show that a new off-policy deterministic actor-critic deep reinforcement learning algorithm can identify high quality time consistent hedging policies for options.
Our numerical experiments, which involve both a simple vanilla option and a more exotic basket option, confirm that in simple environments the new algorithm can produce nearly optimal hedging policies and highly accurate prices, simultaneously for a range of maturities.
Overall, the resulting hedging strategies outperform those produced using static risk measures when the risk is evaluated at later points in time.
- Score: 1.2891210250935146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, equal risk pricing, a framework for fair derivative pricing, was
extended to consider dynamic risk measures. However, all current
implementations either employ a static risk measure that violates time
consistency, or are based on traditional dynamic programming solution schemes
that are impracticable in problems with a large number of underlying assets
(due to the curse of dimensionality) or with incomplete asset dynamics
information. In this paper, we extend for the first time a well-known off-policy
deterministic actor-critic deep reinforcement learning (ACRL) algorithm to the
problem of solving a risk averse Markov decision process that models risk using
a time consistent recursive expectile risk measure. This new ACRL algorithm
allows us to identify high quality time consistent hedging policies (and equal
risk prices) for options, such as basket options, that cannot be handled using
traditional methods, or in contexts where only historical trajectories of the
underlying assets are available. Our numerical experiments, which involve both
a simple vanilla option and a more exotic basket option, confirm that the new
ACRL algorithm can produce: 1) in simple environments, nearly optimal hedging
policies and highly accurate prices, simultaneously for a range of maturities;
2) in complex environments, good-quality policies and prices using a reasonable
amount of computing resources; and 3) overall, hedging strategies that
outperform the strategies produced using static risk measures when the risk is
evaluated at later points in time.
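To make the paper's risk measure concrete, the following minimal sketch (our own illustration, not the authors' implementation; the binary-tree layout and function names are assumptions) computes a sample tau-expectile from its first-order condition and evaluates the time-consistent recursion backwards on a toy scenario tree. For tau > 0.5 the expectile weights large losses more heavily, which is what makes the recursive measure risk averse.

```python
import numpy as np

def expectile(x, tau, iters=60):
    """tau-expectile of samples x: the unique m solving
    tau * E[(x - m)_+] = (1 - tau) * E[(m - x)_+].
    Bisection works because the left-minus-right side is decreasing in m."""
    lo, hi = float(np.min(x)), float(np.max(x))
    for _ in range(iters):
        m = 0.5 * (lo + hi)
        g = (tau * np.maximum(x - m, 0.0).mean()
             - (1.0 - tau) * np.maximum(m - x, 0.0).mean())
        if g > 0.0:
            lo = m  # m is still below the expectile
        else:
            hi = m
    return 0.5 * (lo + hi)

def recursive_expectile(loss_tree, tau):
    """Time-consistent recursive evaluation on a binary scenario tree:
    loss_tree[t] holds the 2**t per-node stage-t losses, with node i at
    stage t branching to nodes 2i and 2i+1 at stage t+1. Returns
    rho_0 = c_0 + rho(c_1 + rho(c_2 + ... + rho(c_T)))."""
    value = np.asarray(loss_tree[-1], dtype=float)
    for t in range(len(loss_tree) - 2, -1, -1):
        one_step = np.array([expectile(value[2 * i:2 * i + 2], tau)
                             for i in range(2 ** t)])
        value = np.asarray(loss_tree[t], dtype=float) + one_step
    return value[0]

rng = np.random.default_rng(0)
tree = [rng.standard_normal(2 ** t) for t in range(4)]  # 3-period toy tree
print(recursive_expectile(tree, tau=0.9))  # tau > 0.5: risk-averse valuation
```

Note that the recursion evaluates risk stage by stage rather than applying one static risk measure to the terminal loss; that stage-wise structure is exactly what time consistency requires.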
Related papers
- Robust Reinforcement Learning with Dynamic Distortion Risk Measures [0.0]
We devise a framework to solve robust risk-aware reinforcement learning problems.
We simultaneously account for environmental uncertainty and risk with a class of dynamic robust distortion risk measures.
We construct an actor-critic algorithm to solve this class of robust risk-aware RL problems.
arXiv Detail & Related papers (2024-09-16T08:54:59Z)
- Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning [62.81324245896717]
We introduce an exploration-agnostic algorithm, called C-PG, which exhibits global last-iterate convergence guarantees under (weak) gradient domination assumptions.
We numerically validate our algorithms on constrained control problems, and compare them with state-of-the-art baselines.
arXiv Detail & Related papers (2024-07-15T14:54:57Z)
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
- RiskQ: Risk-sensitive Multi-Agent Reinforcement Learning Value Factorization [49.26510528455664]
We introduce the Risk-sensitive Individual-Global-Max (RIGM) principle as a generalization of the Individual-Global-Max (IGM) and Distributional IGM (DIGM) principles.
We show that RiskQ can obtain promising performance through extensive experiments.
arXiv Detail & Related papers (2023-11-03T07:18:36Z)
- SafeAR: Safe Algorithmic Recourse by Risk-Aware Policies [2.291948092032746]
We present a method to compute recourse policies that consider variability in cost.
We show how existing recourse desiderata can fail to capture the risk of higher costs.
arXiv Detail & Related papers (2023-08-23T18:12:11Z)
- Adaptive Risk-Aware Bidding with Budget Constraint in Display Advertising [47.14651340748015]
We propose a novel adaptive risk-aware bidding algorithm with budget constraint via reinforcement learning.
We theoretically unveil the intrinsic relation between the uncertainty and the risk tendency based on value at risk (VaR).
arXiv Detail & Related papers (2022-12-06T18:50:09Z)
- Deep Learning for Systemic Risk Measures [3.274367403737527]
The aim of this paper is to study a new methodological framework for systemic risk measures.
Under this new framework, systemic risk measures can be interpreted as the minimal amount of cash that secures the aggregated system.
Deep learning is increasingly receiving attention in financial modelling and risk management.
arXiv Detail & Related papers (2022-07-02T05:01:19Z)
- Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
arXiv Detail & Related papers (2022-05-10T19:40:52Z)
- Reinforcement Learning with Dynamic Convex Risk Measures [0.0]
We develop an approach for solving time-consistent risk-sensitive optimization problems using model-free reinforcement learning (RL).
We employ a time-consistent dynamic programming principle to determine the value of a particular policy, and develop policy gradient update rules.
arXiv Detail & Related papers (2021-12-26T16:41:05Z)
- Risk Conditioned Neural Motion Planning [14.018786843419862]
Risk-bounded motion planning is an important yet difficult problem for safety-critical tasks.
We propose an extension of the soft actor-critic model to estimate the execution risk of a plan through a risk critic.
We show the advantage of our model in terms of both computational time and plan quality, compared to a state-of-the-art mathematical programming baseline.
arXiv Detail & Related papers (2021-08-04T05:33:52Z)
- Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy [95.98698822755227]
We make the first attempt to study risk-sensitive deep reinforcement learning under the average reward setting with the variance risk criteria.
We propose an actor-critic algorithm that iteratively and efficiently updates the policy, the Lagrange multiplier, and the Fenchel dual variable.
arXiv Detail & Related papers (2020-12-28T05:02:26Z)
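As a rough illustration of the primal-dual structure described in this last entry, here is a hedged toy sketch (our own construction, not the cited paper's algorithm, which additionally maintains a critic and a Fenchel dual variable): a policy parameter ascends a Lagrangian that trades expected return against a variance constraint, while the multiplier performs projected dual ascent on the constraint violation.

```python
import numpy as np

# Toy variance-constrained problem: returns r ~ N(theta, 0.5 + theta**2),
# maximize E[r] subject to Var[r] <= c. In a real actor-critic both
# gradients would be estimated from trajectories; here the toy model's
# gradients are known analytically (dE[r]/dtheta = 1, dVar[r]/dtheta = 2*theta).
rng = np.random.default_rng(1)
theta, lam = 0.0, 0.0            # policy parameter and Lagrange multiplier
c, lr_theta, lr_lam = 1.0, 0.05, 0.02

for step in range(2000):
    r = rng.normal(theta, np.sqrt(0.5 + theta ** 2), size=512)  # sampled returns
    var_hat = r.var()
    # Primal ascent on L(theta, lam) = E[r] - lam * (Var[r] - c).
    theta += lr_theta * (1.0 - lam * 2.0 * theta)
    # Dual ascent on the constraint, projected to lam >= 0.
    lam = max(0.0, lam + lr_lam * (var_hat - c))

print(f"theta={theta:.3f}, lam={lam:.3f}")  # theta -> sqrt(c - 0.5) ~ 0.707
```

The alternating updates settle where the variance constraint is active and the Lagrangian is stationary, which is the saddle-point behaviour the cited paper establishes globally for its actor-critic scheme.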