Towards Generalizable Reinforcement Learning for Trade Execution
- URL: http://arxiv.org/abs/2307.11685v1
- Date: Fri, 12 May 2023 02:41:11 GMT
- Title: Towards Generalizable Reinforcement Learning for Trade Execution
- Authors: Chuheng Zhang, Yitong Duan, Xiaoyu Chen, Jianyu Chen, Jian Li, Li Zhao
- Abstract summary: Reinforcement learning (RL) has been applied to optimized trade execution to learn smarter policies from market data.
We find that many existing RL methods exhibit considerable overfitting which prevents them from real deployment.
We propose to learn compact representations for context to address the overfitting problem, either by leveraging prior knowledge or in an end-to-end manner.
- Score: 25.199192981742744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimized trade execution aims to sell (or buy) a given amount of assets
within a given time at the lowest possible trading cost. Recently, reinforcement
learning (RL) has been applied to optimized trade execution to learn smarter
policies from market data. However, we find that many existing RL methods
exhibit considerable overfitting which prevents them from real deployment. In
this paper, we provide an extensive study on the overfitting problem in
optimized trade execution. First, we model the optimized trade execution as
offline RL with dynamic context (ORDC), where the context represents market
variables that cannot be influenced by the trading policy and are collected in
an offline manner. Under this framework, we derive the generalization bound and
find that the overfitting issue is caused by large context space and limited
context samples in the offline setting. Accordingly, we propose to learn
compact representations for context to address the overfitting problem, either
by leveraging prior knowledge or in an end-to-end manner. To evaluate our
algorithms, we also implement a carefully designed simulator based on
historical limit order book (LOB) data to provide a high-fidelity benchmark for
different algorithms. Our experiments on the high-fidelity simulator
demonstrate that our algorithms can effectively alleviate overfitting and
achieve better performance.
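The compact-context idea from the abstract can be made concrete with a short sketch. Everything below (module names, feature dimensions, the use of PyTorch) is an illustrative assumption rather than the authors' implementation: a small encoder compresses high-dimensional market context (e.g., recent LOB snapshots) into a low-dimensional embedding, which is concatenated with the trader's private state (remaining inventory and time) and fed to the execution policy.
```python
# Minimal sketch (PyTorch, hypothetical names): compress market context into a
# compact representation before it reaches the execution policy, shrinking the
# effective context space the policy must generalize over.
import torch
import torch.nn as nn


class ContextEncoder(nn.Module):
    """Maps raw LOB context features to a low-dimensional embedding."""

    def __init__(self, context_dim: int, embed_dim: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(context_dim, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        return self.net(context)


class ExecutionPolicy(nn.Module):
    """Outputs a distribution over order-size bins from private state + context embedding."""

    def __init__(self, private_dim: int, embed_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(private_dim + embed_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, private_state, context_embedding):
        logits = self.net(torch.cat([private_state, context_embedding], dim=-1))
        return torch.distributions.Categorical(logits=logits)


# Usage with made-up dimensions: 40 market-context features, 2 private features
# (remaining inventory fraction, remaining time fraction), 11 order-size bins.
encoder = ContextEncoder(context_dim=40)
policy = ExecutionPolicy(private_dim=2, embed_dim=8, n_actions=11)
ctx = torch.randn(32, 40)   # batch of market-context features
priv = torch.rand(32, 2)    # batch of private states
action = policy(priv, encoder(ctx)).sample()  # one order-size bin per sample
```
In the end-to-end variant described in the abstract, such an encoder would be trained jointly with the policy under the offline RL objective; in the prior-knowledge variant, it could instead be replaced by hand-crafted compact market features.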
Related papers
- Limit Order Book Simulation and Trade Evaluation with $K$-Nearest-Neighbor Resampling [0.6144680854063939]
We show how $K$-NN resampling can be used to simulate limit order book (LOB) markets.
We also show how our algorithm can calibrate the size of limit orders for a liquidation strategy.
arXiv Detail & Related papers (2024-09-10T13:50:53Z)
- SAIL: Self-Improving Efficient Online Alignment of Large Language Models [56.59644677997827]
Reinforcement Learning from Human Feedback is a key method for aligning large language models with human preferences.
Recent literature has focused on designing online RLHF methods but still lacks a unified conceptual formulation.
Our approach significantly improves alignment performance on open-sourced datasets with minimal computational overhead.
arXiv Detail & Related papers (2024-06-21T18:05:35Z)
- MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading [6.305870529904885]
We propose MOT, which designs multiple actors with disentangled representation learning to model the different patterns of the market.
Experimental results on real futures market data demonstrate that MOT exhibits excellent profit capabilities while balancing risks.
arXiv Detail & Related papers (2024-06-03T01:42:52Z)
- Efficient Online Reinforcement Learning with Offline Data [78.92501185886569]
We show that we can simply apply existing off-policy methods to leverage offline data when learning online.
We extensively ablate these design choices, demonstrating the key factors that most affect performance.
We see that correct application of these simple recommendations can provide a $\mathbf{2.5\times}$ improvement over existing approaches.
arXiv Detail & Related papers (2023-02-06T17:30:22Z)
- A Modular Framework for Reinforcement Learning Optimal Execution [68.8204255655161]
We develop a modular framework for the application of Reinforcement Learning to the problem of Optimal Trade Execution.
The framework is designed with flexibility in mind, in order to ease the implementation of different simulation setups.
arXiv Detail & Related papers (2022-08-11T09:40:42Z)
- OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation [59.469401906712555]
We present an offline reinforcement learning algorithm that prevents overestimation in a more principled way.
Our algorithm, OptiDICE, directly estimates the stationary distribution corrections of the optimal policy.
We show that OptiDICE performs competitively with the state-of-the-art methods.
arXiv Detail & Related papers (2021-06-21T00:43:30Z)
- Interpretable performance analysis towards offline reinforcement learning: A dataset perspective [6.526790418943535]
We propose a two-fold taxonomy for existing offline RL algorithms.
We explore the correlation between the performance of different types of algorithms and the distribution of actions under states.
We create a benchmark platform on the Atari domain, entitled easy go (RLEG), at an estimated cost of more than 0.3 million dollars.
arXiv Detail & Related papers (2021-05-12T07:17:06Z)
- Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution.
We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z)
- MOPO: Model-based Offline Policy Optimization [183.6449600580806]
Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a large batch of previously collected data.
We show that an existing model-based RL algorithm already produces significant gains in the offline setting.
We propose to modify existing model-based RL methods by penalizing their rewards with the uncertainty of the learned dynamics (a minimal sketch of this penalty appears after this list).
arXiv Detail & Related papers (2020-05-27T08:46:41Z)
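As a companion to the MOPO entry above, here is a minimal sketch of an uncertainty-penalized reward. The ensemble-disagreement estimate and the function names are assumptions for illustration; MOPO itself derives the penalty as an upper bound on the model error.
```python
# Sketch of an uncertainty-penalized reward for model-based offline RL:
# r_tilde(s, a) = r_hat(s, a) - lam * u(s, a), with u(s, a) approximated here
# by disagreement across an ensemble of learned dynamics models.
import numpy as np


def penalized_reward(reward_pred: float,
                     next_state_preds: np.ndarray,
                     lam: float = 1.0) -> float:
    """reward_pred: reward predicted by the learned model for (s, a).
    next_state_preds: (n_models, state_dim) next-state predictions from an
    ensemble; their spread serves as a proxy for model uncertainty."""
    uncertainty = float(np.linalg.norm(next_state_preds.std(axis=0)))
    return reward_pred - lam * uncertainty


# Example: three ensemble members that largely agree -> small penalty.
preds = np.array([[0.10, 1.00], [0.12, 0.98], [0.11, 1.02]])
print(penalized_reward(0.5, preds, lam=1.0))
```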
This list is automatically generated from the titles and abstracts of the papers on this site.