A Modular Framework for Reinforcement Learning Optimal Execution
- URL: http://arxiv.org/abs/2208.06244v1
- Date: Thu, 11 Aug 2022 09:40:42 GMT
- Title: A Modular Framework for Reinforcement Learning Optimal Execution
- Authors: Fernando de Meer Pardo, Christoph Auth and Florin Dascalu
- Abstract summary: We develop a modular framework for the application of Reinforcement Learning to the problem of Optimal Trade Execution.
The framework is designed with flexibility in mind, in order to ease the implementation of different simulation setups.
- Score: 68.8204255655161
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this article, we develop a modular framework for the application of
Reinforcement Learning to the problem of Optimal Trade Execution. The framework
is designed with flexibility in mind, in order to ease the implementation of
different simulation setups. Rather than focusing on agents and optimization
methods, we focus on the environment and break down the necessary requirements
to simulate an Optimal Trade Execution under a Reinforcement Learning framework
such as data pre-processing, construction of observations, action processing,
child order execution, simulation of benchmarks, reward calculations etc. We
give examples of each component, explore the difficulties their individual
implementations and the interactions between them entail, and discuss the
different phenomena that each component induces in the simulation, highlighting
the divergences between the simulation and the behavior of a real market. We
showcase our modular implementation through a setup that, following a
Time-Weighted Average Price (TWAP) order submission schedule, allows the agent
to exclusively place limit orders, simulates their execution via iterating over
snapshots of the Limit Order Book (LOB), and calculates rewards as the $
improvement over the price achieved by a TWAP benchmark algorithm following the
same schedule. We also develop evaluation procedures that incorporate iterative
re-training and evaluation of a given agent over intervals of a training
horizon, mimicking how an agent may behave when being continuously retrained as
new market data becomes available and emulating the monitoring practices that
algorithm providers are bound to perform under current regulatory frameworks.
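The setup described above (TWAP child-order schedule, limit-order-only actions, execution against LOB snapshots, reward as the $ improvement over a TWAP benchmark) can be made concrete with a minimal sketch. The class names, observation features, and fill rule below are illustrative assumptions chosen for exposition, not the authors' implementation; the real framework decomposes these steps into separate modules (data pre-processing, observation construction, action processing, child order execution, benchmark simulation, reward calculation).

```python
# Hypothetical sketch of a modular optimal-execution environment with a
# gym-style reset/step interface. All names and the naive fill rule are
# assumptions made for this illustration.
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LOBSnapshot:
    """One snapshot of the Limit Order Book: best bid/ask and their sizes."""
    best_bid: float
    bid_size: float
    best_ask: float
    ask_size: float


class LimitOrderExecutionEnv:
    """Sell parent_qty over n_slices TWAP slices by placing limit orders.

    Each step the agent picks a price offset (in ticks) above the best bid
    for a child limit order of size parent_qty / n_slices. Execution is
    simulated by iterating over LOB snapshots, and the reward is the dollar
    improvement over a TWAP benchmark that sells the same child quantity at
    the prevailing best bid. Requires at least n_slices + 1 snapshots.
    """

    def __init__(self, snapshots: List[LOBSnapshot], parent_qty: float,
                 n_slices: int, tick: float = 0.01):
        self.snapshots = snapshots
        self.parent_qty = parent_qty
        self.n_slices = n_slices
        self.tick = tick
        self.t = 0

    def reset(self) -> Tuple[float, float, int]:
        self.t = 0
        return self._observe()

    def _observe(self) -> Tuple[float, float, int]:
        # Toy observation: spread, book imbalance, and remaining slices.
        s = self.snapshots[self.t]
        spread = s.best_ask - s.best_bid
        imbalance = s.bid_size / (s.bid_size + s.ask_size)
        return (spread, imbalance, self.n_slices - self.t)

    def step(self, offset_ticks: int):
        s = self.snapshots[self.t]
        child_qty = self.parent_qty / self.n_slices
        limit_price = s.best_bid + offset_ticks * self.tick

        # Naive fill rule: the child order fills at its limit price if the
        # next snapshot's best bid reaches it; otherwise the remainder is
        # crossed at that snapshot's best bid (a marketable order).
        nxt = self.snapshots[self.t + 1]
        exec_price = limit_price if nxt.best_bid >= limit_price else nxt.best_bid

        # TWAP benchmark sells the same child quantity at the current bid,
        # so the reward is the agent's $ improvement over TWAP.
        reward = (exec_price - s.best_bid) * child_qty

        self.t += 1
        done = self.t >= self.n_slices
        obs: Optional[Tuple[float, float, int]] = None if done else self._observe()
        return obs, reward, done, {}
```

A toy run on a synthetic upward-drifting book illustrates the reward accounting:

```python
snaps = [LOBSnapshot(100.00 + 0.01 * i, 500, 100.02 + 0.01 * i, 400)
         for i in range(11)]
env = LimitOrderExecutionEnv(snaps, parent_qty=1_000, n_slices=10)

obs, total, done = env.reset(), 0.0, False
while not done:
    obs, r, done, _ = env.step(offset_ticks=1)   # post one tick above the bid
    total += r
print(f"$ improvement over TWAP: {total:.2f}")   # 10.00 on this synthetic book
```

The evaluation procedure described in the abstract would wrap such an environment in a walk-forward loop, re-training the agent on one interval of the training horizon and evaluating it on the next as new market data becomes available.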
Related papers
- Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification [76.14641982122696]
We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control.
We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task.
arXiv Detail & Related papers (2024-10-07T23:38:58Z)
- Limit Order Book Simulation and Trade Evaluation with $K$-Nearest-Neighbor Resampling [0.6144680854063939]
We show how $K$-NN resampling can be used to simulate limit order book (LOB) markets.
We also show how our algorithm can calibrate the size of limit orders for a liquidation strategy.
arXiv Detail & Related papers (2024-09-10T13:50:53Z)
- Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement [50.481380478458945]
Iterative step-level Process Refinement (IPR) framework provides detailed step-by-step guidance to enhance agent training.
Our experiments on three complex agent tasks demonstrate that our framework outperforms a variety of strong baselines.
arXiv Detail & Related papers (2024-06-17T03:29:13Z)
- Optimal simulation-based Bayesian decisions [0.0]
We present a framework for the efficient computation of optimal Bayesian decisions under intractable likelihoods.
We develop active learning schemes to choose where in parameter and action spaces to simulate.
The resulting framework is extremely simulation efficient, typically requiring fewer model calls than the associated posterior inference task alone.
arXiv Detail & Related papers (2023-11-09T20:59:52Z)
- Towards Generalizable Reinforcement Learning for Trade Execution [25.199192981742744]
Reinforcement learning (RL) has been applied to optimized trade execution to learn smarter policies from market data.
We find that many existing RL methods exhibit considerable overfitting which prevents them from real deployment.
We propose to learn compact representations for context to address the overfitting problem, either by leveraging prior knowledge or in an end-to-end manner.
arXiv Detail & Related papers (2023-05-12T02:41:11Z)
- Streamlined Framework for Agile Forecasting Model Development towards Efficient Inventory Management [2.0625936401496237]
This paper proposes a framework for developing forecasting models by streamlining the connections between core components of the developmental process.
The proposed framework enables swift and robust integration of new datasets, experimentation on different algorithms, and selection of the best models.
arXiv Detail & Related papers (2023-04-13T08:52:32Z)
- When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL)
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Optimization-Derived Learning with Essential Convergence Analysis of Training and Hyper-training [52.39882976848064]
We design a Generalized Krasnoselskii-Mann (GKM) scheme based on fixed-point iterations as our fundamental ODL module.
Under the GKM scheme, a Bilevel Meta Optimization (BMO) algorithmic framework is constructed to solve the optimal training and hyper-training variables together.
arXiv Detail & Related papers (2022-06-16T01:50:25Z)
- Adaptive Batching for Gaussian Process Surrogates with Application in Noisy Level Set Estimation [0.0]
We develop adaptive replicated designs for Gaussian process metamodels of noisy experiments.
We use four novel schemes: Multi-Level Batching (MLB), Adaptive Batched Stepwise Uncertainty Reduction (ABSUR), Adaptive Design with Stepwise Allocation (ADSA) and Deterministic Design with Stepwise Allocation (DDSA).
arXiv Detail & Related papers (2020-03-19T05:30:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.