Deep Stock Trading: A Hierarchical Reinforcement Learning Framework for
Portfolio Optimization and Order Execution
- URL: http://arxiv.org/abs/2012.12620v2
- Date: Sun, 7 Feb 2021 12:37:07 GMT
- Title: Deep Stock Trading: A Hierarchical Reinforcement Learning Framework for
Portfolio Optimization and Order Execution
- Authors: Rundong Wang, Hongxin Wei, Bo An, Zhouyan Feng, Jun Yao
- Abstract summary: We propose a hierarchical reinforced stock trading system for portfolio management (HRPM)
We decompose the trading process into a hierarchy of portfolio management over trade execution and train the corresponding policies.
HRPM achieves significant improvement against many state-of-the-art approaches.
- Score: 26.698261314897195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Portfolio management via reinforcement learning is at the forefront of
fintech research, which explores how to optimally reallocate a fund into
different financial assets over the long term by trial-and-error. Existing
methods are impractical since they usually assume each reallocation can be
finished immediately, thus ignoring price slippage as part of the
trading cost. To address this issue, we propose a hierarchical reinforced
stock trading system for portfolio management (HRPM). Concretely, we decompose
the trading process into a hierarchy of portfolio management over trade
execution and train the corresponding policies. The high-level policy gives
portfolio weights at a lower frequency to maximize the long term profit and
invokes the low-level policy to sell or buy the corresponding shares within a
short time window at a higher frequency to minimize the trading cost. We train
the two levels of policies via a pre-training scheme and an iterative training scheme
for data efficiency. Extensive experimental results in the U.S. market and the
China market demonstrate that HRPM achieves significant improvement against
many state-of-the-art approaches.
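The two-level decomposition described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `high_level_policy` stands in for the learned portfolio policy (here a softmax over window returns), and `low_level_execute` stands in for the learned execution policy (here a simple even-split, TWAP-like schedule). All function names and parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def high_level_policy(prices_window):
    # Hypothetical stand-in for the learned high-level policy:
    # map recent returns to target portfolio weights via a softmax.
    returns = prices_window[-1] / prices_window[0] - 1.0
    exp = np.exp(returns - returns.max())
    return exp / exp.sum()

def low_level_execute(shares_to_trade, steps=5):
    # Hypothetical stand-in for the learned low-level policy:
    # split each parent order evenly across the short execution window
    # (a TWAP-like baseline; HRPM instead trains this policy with RL).
    child = shares_to_trade / steps
    return [child.copy() for _ in range(steps)]

# Toy rollout: 3 assets; the high-level policy re-weights once at low
# frequency, the low-level policy schedules the trades at high frequency.
prices = rng.uniform(90, 110, size=(10, 3))
weights = high_level_policy(prices)

portfolio_value = 1_000.0
current_shares = np.zeros(3)
target_shares = weights * portfolio_value / prices[-1]
child_orders = low_level_execute(target_shares - current_shares)
```

The key point the sketch captures is the interface between the levels: the high-level policy outputs weights, which are converted into parent orders, and only the low-level policy touches the per-step order flow.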
Related papers
- Hierarchical Reinforced Trader (HRT): A Bi-Level Approach for Optimizing Stock Selection and Execution [0.9553307596675155]
We introduce the Hierarchical Reinforced Trader (HRT), a novel trading strategy employing a bi-level Hierarchical Reinforcement Learning framework.
HRT integrates a Proximal Policy Optimization (PPO)-based High-Level Controller (HLC) for strategic stock selection with a Deep Deterministic Policy Gradient (DDPG)-based Low-Level Controller (LLC) tasked with optimizing trade executions to enhance portfolio value.
arXiv Detail & Related papers (2024-10-19T01:29:38Z) - VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment [66.80143024475635]
We propose VinePPO, a straightforward approach to compute unbiased Monte Carlo-based estimates.
We show that VinePPO consistently outperforms PPO and other RL-free baselines across MATH and GSM8K datasets.
arXiv Detail & Related papers (2024-10-02T15:49:30Z) - Portfolio Management using Deep Reinforcement Learning [0.0]
We propose a reinforced portfolio manager offering assistance in the allocation of weights to assets.
The environment gives the manager the freedom to go long and even short on the assets.
The manager performs financial transactions in a postulated liquid market without any transaction charges.
arXiv Detail & Related papers (2024-05-01T22:28:55Z) - Deep Reinforcement Learning for Traveling Purchaser Problems [63.37136587778153]
The traveling purchaser problem (TPP) is an important optimization problem with broad applications.
We propose a novel approach based on deep reinforcement learning (DRL), which addresses route construction and purchase planning separately.
By introducing a meta-learning strategy, the policy network can be trained stably on large-sized TPP instances.
arXiv Detail & Related papers (2024-04-03T05:32:10Z) - Learning Multi-Agent Intention-Aware Communication for Optimal
Multi-Order Execution in Finance [96.73189436721465]
We first present a multi-agent RL (MARL) method for multi-order execution considering practical constraints.
We propose a learnable multi-round communication protocol for the agents to communicate their intended actions with each other.
Experiments on the data from two real-world markets have illustrated superior performance with significantly better collaboration effectiveness.
arXiv Detail & Related papers (2023-07-06T16:45:40Z) - Optimizing Trading Strategies in Quantitative Markets using Multi-Agent
Reinforcement Learning [11.556829339947031]
This paper explores the fusion of two established financial trading strategies, namely the constant proportion portfolio insurance (CPPI) and the time-invariant portfolio protection (TIPP).
We introduce two novel multi-agent RL (MARL) methods, CPPI-MADDPG and TIPP-MADDPG, tailored for probing strategic trading within quantitative markets.
Our empirical findings reveal that the CPPI-MADDPG and TIPP-MADDPG strategies consistently outpace their traditional counterparts.
arXiv Detail & Related papers (2023-03-15T11:47:57Z) - Uniswap Liquidity Provision: An Online Learning Approach [49.145538162253594]
Decentralized Exchanges (DEXs) are new types of marketplaces leveraging blockchain technology.
One such DEX, Uniswap v3, allows liquidity providers to allocate funds more efficiently by specifying an active price interval for their funds.
This introduces the problem of finding an optimal strategy for choosing price intervals.
We formalize this problem as an online learning problem with non-stochastic rewards.
arXiv Detail & Related papers (2023-02-01T17:21:40Z) - Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization [9.430129571478629]
We propose a deep learning and hierarchical reinforcement learning architecture to capture market patterns and execute orders from different temporal scales.
Our approach outperforms baselines in terms of VWAP slippage, with an average cost saving of 1.16 base points compared to the optimal baseline.
arXiv Detail & Related papers (2022-12-11T07:35:26Z) - MetaTrader: An Reinforcement Learning Approach Integrating Diverse
Policies for Portfolio Optimization [17.759687104376855]
We propose a novel two-stage approach for portfolio management.
The first stage incorporates imitation learning into the reinforcement learning framework.
The second stage learns a meta-policy to recognize market conditions and decide on the most proper learned policy to follow.
arXiv Detail & Related papers (2022-09-01T07:58:06Z) - Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution.
We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z) - A Deep Reinforcement Learning Framework for Continuous Intraday Market
Bidding [69.37299910149981]
A key component for the successful integration of renewable energy sources is the use of energy storage.
We propose a novel modelling framework for the strategic participation of energy storage in the European continuous intraday market.
A distributed version of the fitted Q algorithm is chosen for solving this problem due to its sample efficiency.
Results indicate that the agent converges to a policy that achieves on average higher total revenues than the benchmark strategy.
arXiv Detail & Related papers (2020-04-13T13:50:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.