Hedging using reinforcement learning: Contextual $k$-Armed Bandit versus
$Q$-learning
- URL: http://arxiv.org/abs/2007.01623v2
- Date: Sun, 6 Feb 2022 18:49:39 GMT
- Title: Hedging using reinforcement learning: Contextual $k$-Armed Bandit versus
$Q$-learning
- Authors: Loris Cannelli, Giuseppe Nuti, Marzio Sala, Oleg Szehr
- Abstract summary: We study the construction of replication strategies for contingent claims in the presence of risk and market friction.
In this article, the hedging problem is viewed as an instance of a risk-averse contextual $k$-armed bandit problem.
We find that the $k$-armed bandit model naturally fits the Profit-and-Loss formulation of hedging.
- Score: 0.22940141855172028
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The construction of replication strategies for contingent claims in the
presence of risk and market friction is a key problem of financial engineering.
In real markets, continuous replication, such as in the model of Black, Scholes
and Merton (BSM), is not only unrealistic but also undesirable due to
high transaction costs. A variety of methods have been proposed to balance
between effective replication and losses in the incomplete market setting. With
the rise of Artificial Intelligence (AI), AI-based hedgers have attracted
considerable interest, where particular attention was given to Recurrent Neural
Network systems and variations of the $Q$-learning algorithm. From a practical
point of view, sufficient samples for training such an AI can only be obtained
from a simulator of the market environment. Yet if an agent is trained solely
on simulated data, its run-time performance will primarily reflect the accuracy
of the simulation, which leads to the classical problem of model choice and
calibration. In this article, the hedging problem is viewed as an instance of a
risk-averse contextual $k$-armed bandit problem, which is motivated by the
simplicity and sample-efficiency of the architecture. This allows for realistic
online model updates from real-world data. We find that the $k$-armed bandit
model naturally fits the Profit-and-Loss formulation of hedging, providing
a more accurate and sample-efficient approach than $Q$-learning and
reducing to the Black-Scholes model in the absence of transaction costs and
risks.
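To make the bandit formulation concrete, the following is a minimal sketch of a risk-averse contextual $k$-armed bandit hedger in Python. It assumes a discretized hedge-ratio action set, a mean-variance risk adjustment, $\epsilon$-greedy exploration, and Welford-style online updates of the P&L statistics; the class and parameter names (`BanditHedger`, `RISK_AVERSION`, `EPSILON`) are illustrative, and the paper's exact estimator, action set, and risk measure may differ.

```python
import numpy as np

# Minimal sketch of a risk-averse contextual k-armed bandit hedger.
# Assumptions (not from the paper): a mean-variance risk criterion,
# a hedge-ratio action grid, and epsilon-greedy exploration.

K = 11                              # arms: hedge ratios 0.0, 0.1, ..., 1.0
ACTIONS = np.linspace(0.0, 1.0, K)
RISK_AVERSION = 1.0                 # penalty weight on P&L dispersion
EPSILON = 0.1                       # exploration rate

class BanditHedger:
    def __init__(self, n_contexts):
        # Running P&L statistics per (context, arm), updated online with
        # Welford's algorithm: count, mean, sum of squared deviations.
        self.count = np.zeros((n_contexts, K))
        self.mean = np.zeros((n_contexts, K))
        self.m2 = np.zeros((n_contexts, K))

    def act(self, ctx, rng):
        """Pick an arm (index into ACTIONS) for the given context."""
        if rng.random() < EPSILON or self.count[ctx].min() == 0:
            return int(rng.integers(K))          # explore
        std = np.sqrt(self.m2[ctx] / np.maximum(self.count[ctx] - 1, 1))
        return int(np.argmax(self.mean[ctx] - RISK_AVERSION * std))

    def update(self, ctx, arm, pnl):
        """Record the realized P&L of the arm played in this context."""
        self.count[ctx, arm] += 1
        delta = pnl - self.mean[ctx, arm]
        self.mean[ctx, arm] += delta / self.count[ctx, arm]
        self.m2[ctx, arm] += delta * (pnl - self.mean[ctx, arm])
```

In use, the context index could encode a bucket of (moneyness, time to maturity), the chosen arm is the hedge ratio held over the next period, and `pnl` is the realized hedging P&L net of transaction costs, so the model updates online from real-world data. Consistent with the abstract, with zero transaction costs and `RISK_AVERSION = 0` the risk-adjusted score collapses to expected P&L, and the preferred hedge ratio should approach the Black-Scholes delta $\Delta = \Phi(d_1)$ with $d_1 = \big(\ln(S/K) + (r + \sigma^2/2)\tau\big)/(\sigma\sqrt{\tau})$.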
Related papers
- MetaTrading: An Immersion-Aware Model Trading Framework for Vehicular Metaverse Services [94.61039892220037]
We present a novel immersion-aware model trading framework that incentivizes metaverse users (MUs) to contribute learning models for augmented reality (AR) services in the vehicular metaverse.
Considering dynamic network conditions and privacy concerns, we formulate the reward decisions of metaverse service providers (MSPs) as a multi-agent Markov decision process.
Experimental results demonstrate that the proposed framework can effectively provide higher-value models for object detection and classification in AR services on real AR-related vehicle datasets.
arXiv Detail & Related papers (2024-10-25T16:20:46Z) - Online Resource Allocation for Edge Intelligence with Colocated Model Retraining and Inference [5.6679198251041765]
We introduce an online approximation algorithm, named ORRIC, designed to optimize resource allocation for adaptively balancing the accuracy of model retraining and inference.
The competitive ratio of ORRIC outperforms that of the traditional Inference-Only paradigm, especially when data persists for a sufficiently long time.
arXiv Detail & Related papers (2024-05-25T03:05:19Z) - Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z) - Designing an attack-defense game: how to increase robustness of
financial transaction models via a competition [69.08339915577206]
Given the escalating risks of malicious attacks in the finance sector, understanding adversarial strategies and robust defense mechanisms for machine learning models is critical.
We aim to investigate the current state and dynamics of adversarial attacks and defenses for neural network models that use sequential financial data as input.
We have designed a competition that allows realistic and detailed investigation of problems in modern financial transaction data.
The participants compete directly against each other, so possible attacks and defenses are examined in close-to-real-life conditions.
arXiv Detail & Related papers (2023-08-22T12:53:09Z) - Adversarial Deep Hedging: Learning to Hedge without Price Process
Modeling [4.656182369206814]
We propose a new framework called adversarial deep hedging, inspired by adversarial learning.
In this framework, a hedger and a generator, which respectively model the hedging strategy and the underlying asset price process, are trained in an adversarial manner (a minimal sketch of this adversarial setup appears after this list).
arXiv Detail & Related papers (2023-07-25T03:09:32Z) - Anytime Model Selection in Linear Bandits [61.97047189786905]
We develop ALEXP, which has an exponentially improved dependence on the number of models $M$ for its regret.
Our approach utilizes a novel time-uniform analysis of the Lasso, establishing a new connection between online learning and high-dimensional statistics.
arXiv Detail & Related papers (2023-07-24T15:44:30Z) - Neural Stochastic Agent-Based Limit Order Book Simulation: A Hybrid
Methodology [6.09170287691728]
Modern financial exchanges use an electronic limit order book (LOB) to store bid and ask orders for a specific financial asset.
We propose a novel hybrid LOB simulation paradigm characterised by: (1) representing the aggregation of market events' logic by a neural background trader that is pre-trained on historical LOB data through a neural point model; and (2) embedding the background trader in a multi-agent simulation with other trading agents.
We show that the stylised facts remain and we demonstrate order flow impact and financial herding behaviours that are in accordance with empirical observations of real markets.
arXiv Detail & Related papers (2023-02-28T20:53:39Z) - Learning to simulate realistic limit order book markets from data as a
World Agent [1.1470070927586016]
Multi-agent market simulators usually require careful calibration to emulate real markets.
Poorly calibrated simulators can lead to misleading conclusions.
We propose a world model simulator that accurately emulates a limit order book market.
arXiv Detail & Related papers (2022-09-26T09:17:11Z) - Self-Damaging Contrastive Learning [92.34124578823977]
Real-world unlabeled data is commonly imbalanced and follows a long-tail distribution.
This paper proposes a principled framework called Self-Damaging Contrastive Learning (SDCLR) to automatically balance representation learning without knowing the classes.
Our experiments show that SDCLR significantly improves not only overall accuracies but also balancedness.
arXiv Detail & Related papers (2021-06-06T00:04:49Z) - Model-Augmented Q-learning [112.86795579978802]
We propose a MFRL framework that is augmented with the components of model-based RL.
Specifically, we propose to estimate not only the $Q$-values but also both the transition and the reward with a shared network.
We show that the proposed scheme, called Model-augmented $Q$-learning (MQL), obtains a policy-invariant solution identical to the solution obtained by learning with the true reward.
arXiv Detail & Related papers (2021-02-07T17:56:50Z) - Robust pricing and hedging via neural SDEs [0.0]
We develop and analyse novel algorithms needed for efficient use of neural SDEs.
We find robust bounds for prices of derivatives and the corresponding hedging strategies while incorporating relevant market data.
Neural SDEs allow consistent calibration under both the risk-neutral and the real-world measures.
arXiv Detail & Related papers (2020-07-08T14:33:17Z)
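As referenced in the Adversarial Deep Hedging entry above, here is a minimal PyTorch sketch of that adversarial setup: a generator network produces asset price paths and a hedger network trades against them, with the two trained in a minimax fashion. The network sizes, the short-call payoff, and the squared-error objective are illustrative assumptions rather than the paper's exact construction.

```python
import torch
import torch.nn as nn

# Minimal sketch of adversarial deep hedging: the hedger minimizes a
# hedging-error loss while the generator, which produces the price
# paths, is updated to maximize it. Payoff, loss, and architectures
# are illustrative assumptions, not the paper's exact setup.

N_STEPS, BATCH, STRIKE = 10, 256, 1.0

generator = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, N_STEPS))
hedger = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
h_opt = torch.optim.Adam(hedger.parameters(), lr=1e-3)

def hedging_loss():
    # Generator maps noise to bounded returns, yielding price paths.
    noise = torch.randn(BATCH, 1)
    returns = 0.1 * torch.tanh(generator(noise))
    prices = torch.cumprod(1.0 + returns, dim=1)      # S_1..S_N, S_0 = 1
    pnl = torch.zeros(BATCH)
    prev = torch.ones(BATCH)
    for t in range(N_STEPS):
        # Features: current spot and remaining time fraction.
        feats = torch.stack(
            [prev, torch.full((BATCH,), 1 - t / N_STEPS)], dim=1)
        position = hedger(feats).squeeze(-1)          # hedge held over (t, t+1)
        pnl = pnl + position * (prices[:, t] - prev)  # gain on the hedge
        prev = prices[:, t]
    payoff = torch.clamp(prices[:, -1] - STRIKE, min=0)  # short-call liability
    return ((payoff - pnl) ** 2).mean()               # squared hedging error

for step in range(1000):
    # Hedger step: minimize the hedging error on generated paths ...
    h_opt.zero_grad(); hedging_loss().backward(); h_opt.step()
    # ... generator step: make the paths harder to hedge (maximize).
    g_opt.zero_grad(); (-hedging_loss()).backward(); g_opt.step()
```

In this zero-sum setup the generator is pushed toward price dynamics that are hardest to hedge, so the equilibrium hedger is robust to model misspecification rather than tuned to a single calibrated simulator, which is exactly the concern raised in the main abstract above.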
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.