Solving the optimal stopping problem with reinforcement learning: an
application in financial option exercise
- URL: http://arxiv.org/abs/2208.00765v1
- Date: Thu, 21 Jul 2022 22:52:05 GMT
- Title: Solving the optimal stopping problem with reinforcement learning: an
application in financial option exercise
- Authors: Leonardo Kanashiro Felizardo and Elia Matsumoto and Emilio
Del-Moral-Hernandez
- Abstract summary: We employ a data-driven method that uses Monte Carlo simulation to train and test artificial neural networks.
We propose a different architecture that uses convolutional neural networks (CNN) to deal with the dimensionality problem that arises when we transform the whole history of prices into a Markovian state.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The optimal stopping problem is a category of decision problems with a
specific constrained configuration. It is relevant to various real-world
applications such as finance and management. To solve the optimal stopping
problem, state-of-the-art algorithms in dynamic programming, such as the
least-squares Monte Carlo (LSMC), are employed. This type of algorithm relies
on path simulations using only the last price of the underlying asset as a
state representation. Also, the LSMC was designed for option valuation, where
risk-neutral probabilities can be employed to account for uncertainty. However,
the goals of the general optimal stopping problem may not fit the requirements
of the LSMC, for instance when prices are auto-correlated. We employ a data-driven method that uses
Monte Carlo simulation to train and test artificial neural networks (ANN) to
solve the optimal stopping problem. Using ANN to solve decision problems is not
entirely new. We propose a different architecture that uses convolutional
neural networks (CNN) to deal with the dimensionality problem that arises when
we transform the whole history of prices into a Markovian state. We present
experiments that indicate that our proposed architecture improves results over
the previous implementations under specific simulated time series function
sets. Lastly, we employ our proposed method to compare the optimal exercise of
the financial options problem with the LSMC algorithm. Our experiments show
that our method can capture more accurate exercise opportunities when compared
to the LSMC. These exercise policies yield a substantially higher expected
payoff (above 974% improvement) across the many Monte Carlo simulations that
used the real-world return database on the out-of-sample (test) data.
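The LSMC baseline discussed in the abstract can be sketched as a generic Longstaff-Schwartz implementation for an American put; the parameter values and the degree-2 polynomial regression basis below are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def lsmc_put_price(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0,
                   n_steps=50, n_paths=10_000, seed=0):
    """Price an American put via Longstaff-Schwartz least-squares Monte Carlo."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Simulate geometric Brownian motion paths under the risk-neutral measure.
    z = rng.standard_normal((n_paths, n_steps))
    log_paths = np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1)
    S = S0 * np.exp(np.hstack([np.zeros((n_paths, 1)), log_paths]))

    cashflow = np.maximum(K - S[:, -1], 0.0)  # payoff at maturity
    for t in range(n_steps - 1, 0, -1):
        cashflow *= np.exp(-r * dt)           # discount one step back
        itm = (K - S[:, t]) > 0               # regress only in-the-money paths
        if itm.sum() < 3:
            continue
        # Regress discounted continuation value on a polynomial of the spot
        # price; note the state is just the last price, as the abstract says.
        coeffs = np.polyfit(S[itm, t], cashflow[itm], deg=2)
        continuation = np.polyval(coeffs, S[itm, t])
        exercise_now = np.maximum(K - S[itm, t], 0.0) > continuation
        idx = np.where(itm)[0][exercise_now]
        cashflow[idx] = np.maximum(K - S[idx, t], 0.0)
    return float(np.exp(-r * dt) * cashflow.mean())
```

The paper's proposed CNN-based method differs precisely in replacing this last-price state with the whole price history transformed into a Markovian state.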
Related papers
- Can Large Language Models Play Games? A Case Study of A Self-Play Approach [61.15761840203145]
Large Language Models (LLMs) harness extensive data from the Internet, storing a broad spectrum of prior knowledge.
Monte-Carlo Tree Search (MCTS) is a search algorithm that provides reliable decision-making solutions.
This work introduces an innovative approach that bolsters LLMs with MCTS self-play to efficiently resolve turn-based zero-sum games.
arXiv Detail & Related papers (2024-03-08T19:16:29Z)
- Optimal simulation-based Bayesian decisions [0.0]
We present a framework for the efficient computation of optimal Bayesian decisions under intractable likelihoods.
We develop active learning schemes to choose where in parameter and action spaces to simulate.
The resulting framework is extremely simulation efficient, typically requiring fewer model calls than the associated posterior inference task alone.
arXiv Detail & Related papers (2023-11-09T20:59:52Z)
- Multi-Resolution Active Learning of Fourier Neural Operators [33.63483360957646]
We propose Multi-Resolution Active learning of FNO (MRA-FNO), which can dynamically select the input functions and resolutions to lower the data cost as much as possible.
Specifically, we propose a probabilistic multi-resolution FNO and use ensemble Monte-Carlo to develop an effective posterior inference algorithm.
We have shown the advantage of our method in several benchmark operator learning tasks.
arXiv Detail & Related papers (2023-09-29T04:41:27Z)
- High-dimensional Contextual Bandit Problem without Sparsity [8.782204980889077]
We propose an explore-then-commit (EtC) algorithm to address this problem and examine its performance.
We derive the optimal rate of the EtC algorithm in terms of $T$ and show that this rate can be achieved by balancing exploration and exploitation.
We introduce an adaptive explore-then-commit (AEtC) algorithm that adaptively finds the optimal balance.
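The explore-then-commit idea summarized above can be illustrated with a generic multi-armed bandit sketch; the paper's algorithm targets the high-dimensional contextual setting, so this simplified Bernoulli-bandit version and all names in it are illustrative assumptions:

```python
import random

def explore_then_commit(arm_means, horizon, explore_per_arm, seed=0):
    """Generic explore-then-commit: pull each arm a fixed number of times,
    then commit to the empirically best arm for the rest of the horizon.
    Rewards are Bernoulli with the given per-arm means."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k
    sums = [0.0] * k
    total = 0.0
    t = 0
    # Exploration phase: round-robin over all arms.
    for _ in range(explore_per_arm):
        for a in range(k):
            r = 1.0 if rng.random() < arm_means[a] else 0.0
            counts[a] += 1
            sums[a] += r
            total += r
            t += 1
    # Commit phase: exploit the arm with the highest empirical mean.
    best = max(range(k), key=lambda a: sums[a] / counts[a])
    for _ in range(horizon - t):
        total += 1.0 if rng.random() < arm_means[best] else 0.0
    return best, total
```

The adaptive variant (AEtC) mentioned in the summary would, roughly, choose `explore_per_arm` online instead of fixing it in advance.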
arXiv Detail & Related papers (2023-06-19T15:29:32Z)
- Bayesian Learning of Optimal Policies in Markov Decision Processes with Countably Infinite State-Space [0.0]
We study the problem of optimal control of a family of discrete-time countable state-space Markov Decision Processes.
We propose an algorithm based on Thompson sampling with dynamically-sized episodes.
We show that our algorithm can be applied to develop approximately optimal control algorithms.
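Thompson sampling, the core mechanism of the algorithm above, can be sketched in the simplest Beta-Bernoulli bandit setting; the paper applies it to countable state-space MDPs with dynamically-sized episodes, so this reduced sketch is an illustrative assumption rather than the paper's algorithm:

```python
import random

def thompson_sampling(arm_means, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling: sample a mean estimate for each arm
    from its Beta posterior, play the argmax, update with the observed reward."""
    rng = random.Random(seed)
    k = len(arm_means)
    alpha = [1.0] * k  # successes + 1 (uniform prior)
    beta = [1.0] * k   # failures + 1
    total = 0.0
    for _ in range(horizon):
        # Posterior sampling: draw one plausible mean per arm.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        a = max(range(k), key=lambda i: samples[i])
        r = 1.0 if rng.random() < arm_means[a] else 0.0
        total += r
        alpha[a] += r
        beta[a] += 1.0 - r
    return total
```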
arXiv Detail & Related papers (2023-06-05T03:57:16Z)
- Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees [56.848265937921354]
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy.
Many algorithms for IRL have an inherently nested structure.
We develop a novel single-loop algorithm for IRL that does not compromise reward estimation accuracy.
arXiv Detail & Related papers (2022-10-04T17:13:45Z)
- An Experimental Design Perspective on Model-Based Reinforcement Learning [73.37942845983417]
In practical applications of RL, it is expensive to observe state transitions from the environment.
We propose an acquisition function that quantifies how much information a state-action pair would provide about the optimal solution to a Markov decision process.
arXiv Detail & Related papers (2021-12-09T23:13:57Z)
- Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player against smooth algorithms deployed by an adversary.
Our algorithm is guaranteed to make monotonic progress with no limit cycles, and to find an appropriate solution in a polynomial number of gradient ascent steps.
arXiv Detail & Related papers (2021-06-02T22:03:36Z)
- Bayesian Optimisation for Constrained Problems [0.0]
We propose a novel variant of the well-known Knowledge Gradient acquisition function that allows it to handle constraints.
We empirically compare the new algorithm with four other state-of-the-art constrained Bayesian optimisation algorithms and demonstrate its superior performance.
arXiv Detail & Related papers (2021-05-27T15:43:09Z)
- Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation [101.22379613810881]
We consider data-driven optimization problems where one must maximize a function given only queries at a fixed set of points.
This problem setting emerges in many domains where function evaluation is a complex and expensive process.
We propose a tractable approximation that allows us to scale our method to high-capacity neural network models.
arXiv Detail & Related papers (2021-02-16T06:04:27Z)
- Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.