Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization
- URL: http://arxiv.org/abs/2212.14670v1
- Date: Sun, 11 Dec 2022 07:35:26 GMT
- Title: Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization
- Authors: Xiaodong Li, Pangjing Wu, Chenxin Zou, Qing Li
- Abstract summary: We propose a deep learning and hierarchical reinforcement learning architecture to capture market patterns and execute orders from different temporal scales.
Our approach outperforms baselines in terms of VWAP slippage, with an average cost saving of 1.16 basis points compared to the best baseline.
- Score: 9.430129571478629
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Designing an intelligent volume-weighted average price (VWAP) strategy is a
critical concern for brokers, since traditional rule-based strategies are
relatively static and cannot achieve lower transaction costs in a dynamic
market. Many studies have tried to minimize the cost via reinforcement
learning, but there are bottlenecks in improvement, especially for
long-duration strategies such as the VWAP strategy. To address this issue, we
propose a joint deep learning and hierarchical reinforcement learning
architecture termed Macro-Meta-Micro Trader (M3T) to capture market patterns
and execute orders from different temporal scales. The Macro Trader first
allocates a parent order into tranches based on volume profiles as the
traditional VWAP strategy does, but a long short-term memory neural network is
used to improve the forecasting accuracy. Then the Meta Trader selects a
short-term subgoal appropriate to instant liquidity within each tranche to form
a mini-tranche. The Micro Trader consequently extracts the instant market state
and fulfils the subgoal with the lowest transaction cost. Our experiments over
stocks listed on the Shanghai Stock Exchange demonstrate that our approach
outperforms baselines in terms of VWAP slippage, with an average cost saving of
1.16 basis points compared to the best baseline.
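The abstract describes two concrete mechanics: the Macro Trader's allocation of a parent order into tranches proportional to a forecast volume profile, and evaluation by VWAP slippage in basis points. A minimal sketch of both follows; the volume forecast here is a hypothetical placeholder standing in for the paper's LSTM predictions, and the function names are illustrative, not from the paper.

```python
def allocate_tranches(parent_qty, volume_forecast):
    """Split a parent order across time buckets in proportion to the
    forecast intraday volume profile (the static VWAP-style allocation
    the Macro Trader starts from)."""
    total = sum(volume_forecast)
    raw = [parent_qty * v / total for v in volume_forecast]
    tranches = [int(x) for x in raw]
    tranches[-1] += parent_qty - sum(tranches)  # rounding residue to last bucket
    return tranches

def vwap(prices, volumes):
    """Volume-weighted average price over a set of fills or bars."""
    return sum(p * v for p, v in zip(prices, volumes)) / sum(volumes)

def slippage_bps(exec_prices, exec_volumes, mkt_prices, mkt_volumes, side="buy"):
    """VWAP slippage in basis points: execution VWAP vs. market VWAP.
    Positive means the strategy paid more (buy) or received less (sell)
    than the market benchmark."""
    e = vwap(exec_prices, exec_volumes)
    m = vwap(mkt_prices, mkt_volumes)
    sign = 1.0 if side == "buy" else -1.0
    return sign * (e - m) / m * 1e4

# Example: a 10,000-share parent order over a U-shaped volume forecast.
forecast = [3000, 1500, 1000, 1500, 3000]
print(allocate_tranches(10_000, forecast))
```

The paper's contribution is what happens below this level: the Meta Trader splits each tranche into mini-tranches matched to instant liquidity, and the Micro Trader places the actual child orders, neither of which this static sketch attempts.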
Related papers
- The N-Grammys: Accelerating Autoregressive Inference with Learning-Free Batched Speculation [48.52206677611072]
Speculative decoding aims to speed up autoregressive generation of a language model by verifying in parallel the tokens generated by a smaller draft model.
We show that combinations of simple strategies can achieve significant inference speedups over different tasks.
arXiv Detail & Related papers (2024-11-06T09:23:50Z) - VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment [66.80143024475635]
We propose VinePPO, a straightforward approach to compute unbiased Monte Carlo-based estimates.
We show that VinePPO consistently outperforms PPO and other RL-free baselines across MATH and GSM8K datasets.
arXiv Detail & Related papers (2024-10-02T15:49:30Z) - Deep Reinforcement Learning and Mean-Variance Strategies for Responsible Portfolio Optimization [49.396692286192206]
We study the use of deep reinforcement learning for responsible portfolio optimization by incorporating ESG states and objectives.
Our results show that deep reinforcement learning policies can provide competitive performance against mean-variance approaches for responsible portfolio allocation.
arXiv Detail & Related papers (2024-03-25T12:04:03Z) - MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes.
This is relevant to many real-world settings like auctions or taxation, where the principal may not know the learning behavior nor the rewards of real people.
We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z) - Intelligent Systematic Investment Agent: an ensemble of deep learning and evolutionary strategies [0.0]
Our paper proposes a new approach for developing long-term investment strategies using an ensemble of evolutionary algorithms and a deep learning model.
Our methodology focuses on building long-term wealth by improving systematic investment planning (SIP) decisions on Exchange Traded Funds (ETF) over a period of time.
arXiv Detail & Related papers (2022-03-24T15:39:05Z) - A Meta-Method for Portfolio Management Using Machine Learning for Adaptive Strategy Selection [0.0]
The MPM uses XGBoost to learn how to switch between two risk-based portfolio allocation strategies.
The MPM is shown to possess an excellent out-of-sample risk-reward profile, as measured by the Sharpe ratio.
arXiv Detail & Related papers (2021-11-10T20:46:43Z) - Bitcoin Transaction Strategy Construction Based on Deep Reinforcement Learning [8.431365407963629]
This study proposes a framework for automatic high-frequency bitcoin transactions based on a deep reinforcement learning algorithm, proximal policy optimization (PPO).
The proposed framework can earn excess returns through both the period of volatility and surge, which opens the door to research on building a single cryptocurrency trading strategy based on deep learning.
arXiv Detail & Related papers (2021-09-30T01:24:03Z) - Slow Momentum with Fast Reversion: A Trading Strategy Using Deep Learning and Changepoint Detection [2.9005223064604078]
We introduce an online change-point detection (CPD) module into a Deep Momentum Network (DMN) pipeline.
Our CPD module outputs a changepoint location and severity score, allowing our model to learn to respond to degrees of disequilibrium.
Using a portfolio of 50 liquid, continuous futures contracts over the period 1990-2020, the addition of the CPD module leads to a 33% improvement in Sharpe ratio.
arXiv Detail & Related papers (2021-05-28T10:46:53Z) - Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution.
We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z) - Deep Stock Trading: A Hierarchical Reinforcement Learning Framework for Portfolio Optimization and Order Execution [26.698261314897195]
We propose a hierarchical reinforced stock trading system for portfolio management (HRPM).
We decompose the trading process into a hierarchy of portfolio management over trade execution and train the corresponding policies.
HRPM achieves significant improvement against many state-of-the-art approaches.
arXiv Detail & Related papers (2020-12-23T12:09:26Z) - Deep Stock Predictions [58.720142291102135]
We consider the design of a trading strategy that performs portfolio optimization using Long Short Term Memory (LSTM) neural networks.
We then customize the loss function used to train the LSTM to increase the profit earned.
We find that the LSTM model with the customized loss function outperforms a regression baseline such as ARIMA in the trading bot.
arXiv Detail & Related papers (2020-06-08T23:37:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.