Deep Stock Trading: A Hierarchical Reinforcement Learning Framework for
Portfolio Optimization and Order Execution
- URL: http://arxiv.org/abs/2012.12620v2
- Date: Sun, 7 Feb 2021 12:37:07 GMT
- Title: Deep Stock Trading: A Hierarchical Reinforcement Learning Framework for
Portfolio Optimization and Order Execution
- Authors: Rundong Wang, Hongxin Wei, Bo An, Zhouyan Feng, Jun Yao
- Abstract summary: We propose a hierarchical reinforced stock trading system for portfolio management (HRPM)
We decompose the trading process into a hierarchy of portfolio management over trade execution and train the corresponding policies.
HRPM achieves significant improvement against many state-of-the-art approaches.
- Score: 26.698261314897195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Portfolio management via reinforcement learning is at the forefront of
fintech research, which explores how to optimally reallocate a fund into
different financial assets over the long term by trial-and-error. Existing
methods are impractical since they usually assume each reallocation can be
finished immediately, thus ignoring price slippage as part of the
trading cost. To address this issue, we propose a hierarchical reinforced
stock trading system for portfolio management (HRPM). Concretely, we decompose
the trading process into a hierarchy of portfolio management over trade
execution and train the corresponding policies. The high-level policy gives
portfolio weights at a lower frequency to maximize the long term profit and
invokes the low-level policy to sell or buy the corresponding shares within a
short time window at a higher frequency to minimize the trading cost. We train
the two levels of policies via a pre-training scheme and an iterative training scheme
for data efficiency. Extensive experimental results in the U.S. market and the
China market demonstrate that HRPM achieves significant improvement against
many state-of-the-art approaches.
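The two-level decomposition described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `high_level_policy` stands in for the learned portfolio policy (here a softmax over window returns), and `low_level_execute` stands in for the learned execution policy (here a simple even-split, TWAP-like schedule). All function names and parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def high_level_policy(prices_window):
    # Hypothetical stand-in for the learned high-level policy:
    # map recent returns to target portfolio weights via a softmax.
    returns = prices_window[-1] / prices_window[0] - 1.0
    exp = np.exp(returns - returns.max())
    return exp / exp.sum()

def low_level_execute(shares_to_trade, steps=5):
    # Hypothetical stand-in for the learned low-level policy:
    # split each parent order evenly across the short execution window
    # (a TWAP-like baseline; HRPM instead trains this policy with RL).
    child = shares_to_trade / steps
    return [child.copy() for _ in range(steps)]

# Toy rollout: 3 assets; the high-level policy re-weights once at low
# frequency, the low-level policy schedules the trades at high frequency.
prices = rng.uniform(90, 110, size=(10, 3))
weights = high_level_policy(prices)

portfolio_value = 1_000.0
current_shares = np.zeros(3)
target_shares = weights * portfolio_value / prices[-1]
child_orders = low_level_execute(target_shares - current_shares)
```

The key point the sketch captures is the interface between the levels: the high-level policy outputs weights, which are converted into parent orders, and only the low-level policy touches the per-step order flow.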
Related papers
- Hierarchical Reinforced Trader (HRT): A Bi-Level Approach for Optimizing Stock Selection and Execution [0.9553307596675155]
We introduce the Hierarchical Reinforced Trader (HRT), a novel trading strategy employing a bi-level Hierarchical Reinforcement Learning framework.
HRT integrates a Proximal Policy Optimization (PPO)-based High-Level Controller (HLC) for strategic stock selection with a Deep Deterministic Policy Gradient (DDPG)-based Low-Level Controller (LLC) tasked with optimizing trade executions to enhance portfolio value.
arXiv Detail & Related papers (2024-10-19T01:29:38Z) - VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment [66.80143024475635]
We propose VinePPO, a straightforward approach to compute unbiased Monte Carlo-based estimates.
We show that VinePPO consistently outperforms PPO and other RL-free baselines across MATH and GSM8K datasets.
arXiv Detail & Related papers (2024-10-02T15:49:30Z) - Portfolio Management using Deep Reinforcement Learning [0.0]
We propose a reinforced portfolio manager offering assistance in the allocation of weights to assets.
The environment gives the manager the freedom to go long and even short on the assets.
The manager performs financial transactions in a postulated liquid market without any transaction charges.
arXiv Detail & Related papers (2024-05-01T22:28:55Z) - Deep Reinforcement Learning for Traveling Purchaser Problems [63.37136587778153]
The traveling purchaser problem (TPP) is an important optimization problem with broad applications.
We propose a novel approach based on deep reinforcement learning (DRL), which addresses route construction and purchase planning separately.
By introducing a meta-learning strategy, the policy network can be trained stably on large-sized TPP instances.
arXiv Detail & Related papers (2024-04-03T05:32:10Z) - Learning Multi-Agent Intention-Aware Communication for Optimal
Multi-Order Execution in Finance [96.73189436721465]
We first present a multi-agent RL (MARL) method for multi-order execution considering practical constraints.
We propose a learnable multi-round communication protocol for the agents to communicate their intended actions with each other.
Experiments on the data from two real-world markets have illustrated superior performance with significantly better collaboration effectiveness.
arXiv Detail & Related papers (2023-07-06T16:45:40Z) - Optimizing Trading Strategies in Quantitative Markets using Multi-Agent
Reinforcement Learning [11.556829339947031]
This paper explores the fusion of two established financial trading strategies, namely the constant proportion portfolio insurance (CPPI) and the time-invariant portfolio protection (TIPP).
We introduce two novel multi-agent RL (MARL) methods, CPPI-MADDPG and TIPP-MADDPG, tailored for probing strategic trading within quantitative markets.
Our empirical findings reveal that the CPPI-MADDPG and TIPP-MADDPG strategies consistently outpace their traditional counterparts.
arXiv Detail & Related papers (2023-03-15T11:47:57Z) - Uniswap Liquidity Provision: An Online Learning Approach [49.145538162253594]
Decentralized Exchanges (DEXs) are new types of marketplaces leveraging blockchain technology.
One such DEX, Uniswap v3, allows liquidity providers to allocate funds more efficiently by specifying an active price interval for their funds.
This introduces the problem of finding an optimal strategy for choosing price intervals.
We formalize this problem as an online learning problem with non-stochastic rewards.
arXiv Detail & Related papers (2023-02-01T17:21:40Z) - Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization [9.430129571478629]
We propose a deep learning and hierarchical reinforcement learning architecture to capture market patterns and execute orders from different temporal scales.
Our approach outperforms baselines in terms of VWAP slippage, with an average cost saving of 1.16 base points compared to the optimal baseline.
arXiv Detail & Related papers (2022-12-11T07:35:26Z) - MetaTrader: An Reinforcement Learning Approach Integrating Diverse
Policies for Portfolio Optimization [17.759687104376855]
We propose a novel two-stage approach for portfolio management.
The first stage incorporates imitation learning into the reinforcement learning framework.
The second stage learns a meta-policy to recognize market conditions and decide on the most proper learned policy to follow.
arXiv Detail & Related papers (2022-09-01T07:58:06Z) - Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution.
We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z) - A Deep Reinforcement Learning Framework for Continuous Intraday Market
Bidding [69.37299910149981]
A key component for the successful integration of renewable energy sources is the use of energy storage.
We propose a novel modelling framework for the strategic participation of energy storage in the European continuous intraday market.
A distributed version of the fitted Q algorithm is chosen for solving this problem due to its sample efficiency.
Results indicate that the agent converges to a policy that achieves on average higher total revenues than the benchmark strategy.
arXiv Detail & Related papers (2020-04-13T13:50:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.