Optimizing Deep Reinforcement Learning for American Put Option Hedging
- URL: http://arxiv.org/abs/2405.08602v1
- Date: Tue, 14 May 2024 13:41:44 GMT
- Title: Optimizing Deep Reinforcement Learning for American Put Option Hedging
- Authors: Reilly Pickard, F. Wredenhagen, Y. Lawryshyn
- Abstract summary: This paper contributes to the existing literature on hedging American options with Deep Reinforcement Learning (DRL).
Results highlight the importance of avoiding certain combinations, such as high learning rates with a high number of training episodes or low learning rates with few training episodes.
This paper demonstrates that both single-train and weekly-train DRL agents outperform the Black-Scholes Delta method at transaction costs of 1% and 3%.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper contributes to the existing literature on hedging American options with Deep Reinforcement Learning (DRL). The study first investigates hyperparameter impact on hedging performance, considering learning rates, training episodes, neural network architectures, training steps, and transaction cost penalty functions. Results highlight the importance of avoiding certain combinations, such as high learning rates with a high number of training episodes or low learning rates with few training episodes, and emphasize the significance of utilizing moderate values for optimal outcomes. Additionally, the paper warns against excessive training steps to prevent instability and demonstrates the superiority of a quadratic transaction cost penalty function over a linear version. This study then expands upon the work of Pickard et al. (2024), who utilize a Chebyshev interpolation option pricing method to train DRL agents with market-calibrated stochastic volatility models. While the results of Pickard et al. (2024) showed that these DRL agents achieve satisfactory performance on empirical asset paths, this study introduces a novel approach in which new agents are trained at weekly intervals on newly calibrated stochastic volatility models. Results show that DRL agents re-trained using weekly market data surpass the performance of those trained solely on the sale date. Furthermore, the paper demonstrates that both single-train and weekly-train DRL agents outperform the Black-Scholes Delta method at transaction costs of 1% and 3%. This practical relevance suggests that practitioners can leverage readily available market data to train DRL agents for effective hedging of options in their portfolios.
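The penalty-function comparison in the abstract is easy to make concrete. Below is a minimal sketch of per-step hedging rewards with linear versus quadratic transaction cost penalties; the cost rate kappa, the penalty weight lam, and all function names are illustrative assumptions, not the paper's actual formulation.

```python
def transaction_cost(trade_size: float, price: float, kappa: float = 0.01) -> float:
    """Proportional rebalancing cost; kappa is the cost rate (e.g. 1% or 3%)."""
    return kappa * abs(trade_size) * price

def reward_linear(pnl: float, trade_size: float, price: float,
                  kappa: float = 0.01, lam: float = 1.0) -> float:
    """Per-step reward with a linear penalty on the incurred cost."""
    return pnl - lam * transaction_cost(trade_size, price, kappa)

def reward_quadratic(pnl: float, trade_size: float, price: float,
                     kappa: float = 0.01, lam: float = 1.0) -> float:
    """Per-step reward with a quadratic penalty: large rebalances are
    penalized disproportionately relative to many small ones."""
    return pnl - lam * transaction_cost(trade_size, price, kappa) ** 2
```

Squaring the cost term punishes large position changes much harder than a stream of small ones, which is one plausible reading of why the quadratic version trains more stable hedgers.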
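For the benchmark, the Black-Scholes Delta hedge referenced in the abstract uses the closed-form put delta. The sketch below uses the European formula; since the paper hedges American puts, treat this only as an approximate stand-in for the baseline.

```python
from math import log, sqrt
from statistics import NormalDist

def bs_put_delta(S: float, K: float, T: float, r: float, sigma: float) -> float:
    """Black-Scholes delta of a European put: N(d1) - 1."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return NormalDist().cdf(d1) - 1.0

# Example: at-the-money put, 3 months to expiry, 20% vol, 5% rate
print(bs_put_delta(S=100.0, K=100.0, T=0.25, r=0.05, sigma=0.20))  # ~ -0.43
```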
Related papers
- An Extremely Data-efficient and Generative LLM-based Reinforcement Learning Agent for Recommenders [1.0154385852423122]
Reinforcement learning (RL) algorithms have been instrumental in maximizing long-term customer satisfaction and avoiding short-term, myopic goals in industrial recommender systems.
The goal is to train an RL agent to maximize the purchase reward given a detailed human instruction describing a desired product.
This report also evaluates the RL agents trained using generative trajectories.
arXiv Detail & Related papers (2024-08-28T10:31:50Z) - Hedging American Put Options with Deep Reinforcement Learning [0.0]
This article leverages deep reinforcement learning (DRL) to hedge American put options, utilizing the deep deterministic policy gradient (DDPG) method.
The agents are first trained and tested with Geometric Brownian Motion (GBM) asset paths (a simulation sketch follows below).
Results on both simulated and empirical market testing data also suggest the optimality of DRL agents over the BS Delta method in real-world scenarios.
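A minimal sketch of GBM path simulation of the kind used for such training and testing; the drift mu, volatility sigma, and grid sizes are assumed illustrative values, not parameters from the article.

```python
import numpy as np

def simulate_gbm_paths(S0: float, mu: float, sigma: float,
                       T: float, n_steps: int, n_paths: int,
                       seed: int = 0) -> np.ndarray:
    """Simulate GBM, dS = mu*S*dt + sigma*S*dW, via the exact log scheme.
    Returns an array of shape (n_paths, n_steps + 1)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    z = rng.standard_normal((n_paths, n_steps))
    increments = (mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z
    log_paths = np.cumsum(increments, axis=1)
    return S0 * np.exp(np.concatenate(
        [np.zeros((n_paths, 1)), log_paths], axis=1))

# Example: 10,000 one-year paths with daily rebalancing steps
paths = simulate_gbm_paths(S0=100.0, mu=0.05, sigma=0.20,
                           T=1.0, n_steps=252, n_paths=10_000)
```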
arXiv Detail & Related papers (2024-05-10T18:59:12Z) - PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation [61.57833648734164]
We propose a novel Parallel Yielding Re-Activation (PYRA) method for training-inference efficient task adaptation.
PYRA outperforms all competing methods at both low and high compression rates.
arXiv Detail & Related papers (2024-03-14T09:06:49Z) - Augmenting Unsupervised Reinforcement Learning with Self-Reference [63.68018737038331]
Humans possess the ability to draw on past experiences explicitly when learning new tasks.
We propose the Self-Reference (SR) approach, an add-on module explicitly designed to leverage historical information.
Our approach achieves state-of-the-art results in terms of Interquartile Mean (IQM) performance and Optimality Gap reduction on the Unsupervised Reinforcement Learning Benchmark.
arXiv Detail & Related papers (2023-11-16T09:07:34Z) - Reusing Pretrained Models by Multi-linear Operators for Efficient Training [65.64075958382034]
Training large models from scratch usually costs a substantial amount of resources.
Recent studies such as bert2BERT and LiGO have reused small pretrained models to initialize a large model.
We propose a method that linearly correlates each weight of the target model to all the weights of the pretrained model.
arXiv Detail & Related papers (2023-10-16T06:16:47Z) - Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning [14.702446153750497]
Worst-case-aware Robust RL (WocaR-RL) is a robust training framework for deep reinforcement learning.
We show that WocaR-RL achieves state-of-the-art performance under various strong attacks.
arXiv Detail & Related papers (2022-10-12T05:24:46Z) - Online Convolutional Re-parameterization [51.97831675242173]
We present online convolutional re-parameterization (OREPA), a two-stage pipeline, aiming to reduce the huge training overhead by squeezing the complex training-time block into a single convolution.
Compared with the state-of-the-art re-param models, OREPA is able to save the training-time memory cost by about 70% and accelerate the training speed by around 2x.
We also conduct experiments on object detection and semantic segmentation and show consistent improvements on the downstream tasks.
arXiv Detail & Related papers (2022-04-02T09:50:19Z) - Enhancing the Generalization Performance and Speed Up Training for DRL-based Mapless Navigation [18.13884934663477]
DRL agents performing well in training scenarios are found to perform poorly in some unseen real-world scenarios.
In this paper, we discuss why the DRL agent fails in such unseen scenarios and find that the representation of LiDAR readings is the key factor behind the agent's performance degradation.
We propose a simple but efficient input pre-processing (IP) approach to accelerate training and enhance the performance of the DRL agent in such scenarios.
arXiv Detail & Related papers (2021-03-22T09:36:51Z) - Low-Precision Reinforcement Learning [63.930246183244705]
Low-precision training has become a popular approach to reduce computation time, memory footprint, and energy consumption in supervised learning.
In this paper we consider continuous control with the state-of-the-art SAC agent and demonstrate that a naïve adaptation of low-precision methods from supervised learning fails.
arXiv Detail & Related papers (2021-02-26T16:16:28Z) - Deep Reinforcement Learning for Active High Frequency Trading [1.6874375111244329]
We introduce the first end-to-end Deep Reinforcement Learning (DRL) based framework for active high frequency trading in the stock market.
We train DRL agents to trade one unit of Intel Corporation stock by employing the Proximal Policy Optimization algorithm.
arXiv Detail & Related papers (2021-01-18T15:09:28Z) - Precise Tradeoffs in Adversarial Training for Linear Regression [55.764306209771405]
We provide a precise and comprehensive understanding of the role of adversarial training in the context of linear regression with Gaussian features.
We precisely characterize the standard/robust accuracy and the corresponding tradeoff achieved by a contemporary mini-max adversarial training approach.
Our theory for adversarial training algorithms also facilitates the rigorous study of how a variety of factors (size and quality of training data, model overparametrization etc.) affect the tradeoff between these two competing accuracies.
arXiv Detail & Related papers (2020-02-24T19:01:47Z)