Towards Generalizable Reinforcement Learning for Trade Execution
- URL: http://arxiv.org/abs/2307.11685v1
- Date: Fri, 12 May 2023 02:41:11 GMT
- Title: Towards Generalizable Reinforcement Learning for Trade Execution
- Authors: Chuheng Zhang, Yitong Duan, Xiaoyu Chen, Jianyu Chen, Jian Li, Li Zhao
- Abstract summary: Reinforcement learning (RL) has been applied to optimized trade execution to learn smarter policies from market data.
We find that many existing RL methods exhibit considerable overfitting which prevents them from real deployment.
We propose to learn compact representations for context to address the overfitting problem, either by leveraging prior knowledge or in an end-to-end manner.
- Score: 25.199192981742744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimized trade execution aims to sell (or buy) a given amount of assets
within a given time at the lowest possible trading cost. Recently, reinforcement
learning (RL) has been applied to optimized trade execution to learn smarter
policies from market data. However, we find that many existing RL methods
exhibit considerable overfitting which prevents them from real deployment. In
this paper, we provide an extensive study on the overfitting problem in
optimized trade execution. First, we model the optimized trade execution as
offline RL with dynamic context (ORDC), where the context represents market
variables that cannot be influenced by the trading policy and are collected in
an offline manner. Under this framework, we derive the generalization bound and
find that the overfitting issue is caused by large context space and limited
context samples in the offline setting. Accordingly, we propose to learn
compact representations for context to address the overfitting problem, either
by leveraging prior knowledge or in an end-to-end manner. To evaluate our
algorithms, we also implement a carefully designed simulator based on
historical limit order book (LOB) data to provide a high-fidelity benchmark for
different algorithms. Our experiments on the high-fidelity simulator
demonstrate that our algorithms can effectively alleviate overfitting and
achieve better performance.
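The compact-context idea from the abstract can be made concrete with a short sketch. Everything below (module names, feature dimensions, the use of PyTorch) is an illustrative assumption rather than the authors' implementation: a small encoder compresses high-dimensional market context (e.g., recent LOB snapshots) into a low-dimensional embedding, which is concatenated with the trader's private state (remaining inventory and time) and fed to the execution policy.
```python
# Minimal sketch (PyTorch, hypothetical names): compress market context into a
# compact representation before it reaches the execution policy, shrinking the
# effective context space the policy must generalize over.
import torch
import torch.nn as nn


class ContextEncoder(nn.Module):
    """Maps raw LOB context features to a low-dimensional embedding."""

    def __init__(self, context_dim: int, embed_dim: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(context_dim, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        return self.net(context)


class ExecutionPolicy(nn.Module):
    """Outputs a distribution over order-size bins from private state + context embedding."""

    def __init__(self, private_dim: int, embed_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(private_dim + embed_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, private_state, context_embedding):
        logits = self.net(torch.cat([private_state, context_embedding], dim=-1))
        return torch.distributions.Categorical(logits=logits)


# Usage with made-up dimensions: 40 market-context features, 2 private features
# (remaining inventory fraction, remaining time fraction), 11 order-size bins.
encoder = ContextEncoder(context_dim=40)
policy = ExecutionPolicy(private_dim=2, embed_dim=8, n_actions=11)
ctx = torch.randn(32, 40)   # batch of market-context features
priv = torch.rand(32, 2)    # batch of private states
action = policy(priv, encoder(ctx)).sample()  # one order-size bin per sample
```
In the end-to-end variant described in the abstract, such an encoder would be trained jointly with the policy under the offline RL objective; in the prior-knowledge variant, it could instead be replaced by hand-crafted compact market features.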
Related papers
- Limit Order Book Simulation and Trade Evaluation with $K$-Nearest-Neighbor Resampling [0.6144680854063939]
We show how $K$-NN resampling can be used to simulate limit order book (LOB) markets.
We also show how our algorithm can calibrate the size of limit orders for a liquidation strategy.
arXiv Detail & Related papers (2024-09-10T13:50:53Z)
- SAIL: Self-Improving Efficient Online Alignment of Large Language Models [56.59644677997827]
Reinforcement Learning from Human Feedback is a key method for aligning large language models with human preferences.
Recent literature has focused on designing online RLHF methods but still lacks a unified conceptual formulation.
Our approach significantly improves alignment performance on open-sourced datasets with minimal computational overhead.
arXiv Detail & Related papers (2024-06-21T18:05:35Z)
- MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading [6.305870529904885]
We propose MOT, which designs multiple actors with disentangled representation learning to model the different patterns of the market.
Experimental results on real futures market data demonstrate that MOT exhibits excellent profit capabilities while balancing risks.
arXiv Detail & Related papers (2024-06-03T01:42:52Z)
- Efficient Online Reinforcement Learning with Offline Data [78.92501185886569]
We show that we can simply apply existing off-policy methods to leverage offline data when learning online.
We extensively ablate these design choices, demonstrating the key factors that most affect performance.
We see that correct application of these simple recommendations can provide a $\mathbf{2.5\times}$ improvement over existing approaches.
arXiv Detail & Related papers (2023-02-06T17:30:22Z)
- A Modular Framework for Reinforcement Learning Optimal Execution [68.8204255655161]
We develop a modular framework for the application of Reinforcement Learning to the problem of Optimal Trade Execution.
The framework is designed with flexibility in mind, in order to ease the implementation of different simulation setups.
arXiv Detail & Related papers (2022-08-11T09:40:42Z)
- OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation [59.469401906712555]
We present an offline reinforcement learning algorithm that prevents overestimation in a more principled way.
Our algorithm, OptiDICE, directly estimates the stationary distribution corrections of the optimal policy.
We show that OptiDICE performs competitively with the state-of-the-art methods.
arXiv Detail & Related papers (2021-06-21T00:43:30Z)
- Interpretable performance analysis towards offline reinforcement learning: A dataset perspective [6.526790418943535]
We propose a two-fold taxonomy for existing offline RL algorithms.
We explore the correlation between the performance of different types of algorithms and the distribution of actions under states.
We create a benchmark platform on the Atari domain, entitled easy go (RLEG), at an estimated cost of more than 0.3 million dollars.
arXiv Detail & Related papers (2021-05-12T07:17:06Z)
- Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution.
We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z)
- MOPO: Model-based Offline Policy Optimization [183.6449600580806]
Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a large batch of previously collected data.
We show that an existing model-based RL algorithm already produces significant gains in the offline setting.
We propose to modify existing model-based RL methods by penalizing their rewards with the uncertainty of the learned dynamics (a minimal sketch of this penalty appears after this list).
arXiv Detail & Related papers (2020-05-27T08:46:41Z)
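As a companion to the MOPO entry above, here is a minimal sketch of an uncertainty-penalized reward. The ensemble-disagreement estimate and the function names are assumptions for illustration; MOPO itself derives the penalty as an upper bound on the model error.
```python
# Sketch of an uncertainty-penalized reward for model-based offline RL:
# r_tilde(s, a) = r_hat(s, a) - lam * u(s, a), with u(s, a) approximated here
# by disagreement across an ensemble of learned dynamics models.
import numpy as np


def penalized_reward(reward_pred: float,
                     next_state_preds: np.ndarray,
                     lam: float = 1.0) -> float:
    """reward_pred: reward predicted by the learned model for (s, a).
    next_state_preds: (n_models, state_dim) next-state predictions from an
    ensemble; their spread serves as a proxy for model uncertainty."""
    uncertainty = float(np.linalg.norm(next_state_preds.std(axis=0)))
    return reward_pred - lam * uncertainty


# Example: three ensemble members that largely agree -> small penalty.
preds = np.array([[0.10, 1.00], [0.12, 0.98], [0.11, 1.02]])
print(penalized_reward(0.5, preds, lam=1.0))
```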
This list is automatically generated from the titles and abstracts of the papers on this site.