Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy
- URL: http://arxiv.org/abs/2511.12120v1
- Date: Sat, 15 Nov 2025 09:15:10 GMT
- Title: Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy
- Authors: Hongyang Yang, Xiao-Yang Liu, Shan Zhong, Anwar Walid
- Abstract summary: We propose an ensemble strategy that employs deep reinforcement schemes to learn a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using three actor-critic based algorithms. The proposed deep ensemble strategy is shown to outperform the three individual algorithms and two baselines in terms of the risk-adjusted return measured by the Sharpe ratio.
- Score: 10.667441394970071
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stock trading strategies play a critical role in investment. However, it is challenging to design a profitable strategy in a complex and dynamic stock market. In this paper, we propose an ensemble strategy that employs deep reinforcement schemes to learn a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using three actor-critic based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). The ensemble strategy inherits and integrates the best features of the three algorithms, thereby robustly adjusting to different market situations. In order to avoid the large memory consumption in training networks with continuous action space, we employ a load-on-demand technique for processing very large data. We test our algorithms on the 30 Dow Jones stocks that have adequate liquidity. The performance of the trading agent with different reinforcement learning algorithms is evaluated and compared with both the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy. The proposed deep ensemble strategy is shown to outperform the three individual algorithms and two baselines in terms of the risk-adjusted return measured by the Sharpe ratio. This work is fully open-sourced at \href{https://github.com/AI4Finance-Foundation/Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020}{GitHub}.
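The ensemble rule described in the abstract (train PPO, A2C, and DDPG agents, then deploy whichever achieved the best validation Sharpe ratio in the next trading window) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the agent names match the abstract, but the per-day returns below are made-up placeholders.

```python
import statistics

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of per-period returns."""
    excess = [r - risk_free / periods_per_year for r in returns]
    mean = statistics.mean(excess)
    std = statistics.stdev(excess)  # sample standard deviation
    return (mean / std) * (periods_per_year ** 0.5)

# Hypothetical validation-window daily returns for the three trained agents.
validation_returns = {
    "PPO":  [0.002, -0.001, 0.003, 0.001, -0.002, 0.004],
    "A2C":  [0.001,  0.002, -0.003, 0.002, 0.001, 0.000],
    "DDPG": [0.003, -0.004, 0.002, 0.005, -0.001, 0.002],
}

# Ensemble rule: score each agent on the validation window, then deploy
# the best-scoring agent for the next trading window; in the paper this
# selection is repeated on a rolling basis.
scores = {name: sharpe_ratio(r) for name, r in validation_returns.items()}
best_agent = max(scores, key=scores.get)
print(best_agent)
```

The same `sharpe_ratio` helper also serves for the final comparison against the Dow Jones index and the min-variance baseline, since all are ranked by risk-adjusted return.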
Related papers
- Deep reinforcement learning for optimal trading with partial information [0.254890465057467]
We study an optimal trading problem, where a trading signal follows an Ornstein-Uhlenbeck process with regime-switching dynamics. We employ a blend of RL and Recurrent Neural Networks (RNNs) to extract as much underlying information as possible from the trading signal with latent parameters.
arXiv Detail & Related papers (2025-10-31T18:48:59Z)
- Trade in Minutes! Rationality-Driven Agentic System for Quantitative Financial Trading [57.28635022507172]
TiMi is a rationality-driven multi-agent system that architecturally decouples strategy development from minute-level deployment. We propose a two-tier analytical paradigm from macro patterns to micro customization, layered programming design for trading bot implementation, and closed-loop optimization driven by mathematical reflection.
arXiv Detail & Related papers (2025-10-06T13:08:55Z)
- Plan before Solving: Problem-Aware Strategy Routing for Mathematical Reasoning with LLMs [49.995906301946]
Existing methods usually leverage a fixed strategy to guide Large Language Models (LLMs) to perform mathematical reasoning. Our analysis reveals that the single strategy cannot adapt to problem-specific requirements and thus overlooks the trade-off between effectiveness and efficiency. We propose Planning and Routing through Instance-Specific Modeling (PRISM), a novel framework that decouples mathematical reasoning into two stages: strategy planning and targeted execution.
arXiv Detail & Related papers (2025-09-29T07:22:41Z)
- Building crypto portfolios with agentic AI [46.348283638884425]
The rapid growth of crypto markets has opened new opportunities for investors, but at the same time exposed them to high volatility. This paper presents a practical application of a multi-agent system designed to autonomously construct and evaluate crypto-asset allocations.
arXiv Detail & Related papers (2025-07-11T18:03:51Z)
- Your Offline Policy is Not Trustworthy: Bilevel Reinforcement Learning for Sequential Portfolio Optimization [82.03139922490796]
Reinforcement learning (RL) has shown significant promise for sequential portfolio optimization tasks, such as stock trading, where the objective is to maximize cumulative returns while minimizing risks using historical data. Traditional RL approaches often produce policies that merely memorize the optimal yet impractical buying and selling behaviors within the fixed dataset. Our approach frames portfolio optimization as a new type of partial-offline RL problem and makes two technical contributions.
arXiv Detail & Related papers (2025-05-19T06:37:25Z)
- Regret-Optimized Portfolio Enhancement through Deep Reinforcement Learning and Future Looking Rewards [3.9795751586546766]
This paper introduces a novel agent-based approach for enhancing existing portfolio strategies using Proximal Policy Optimization (PPO). Rather than focusing solely on traditional portfolio construction, our approach aims to improve an already high-performing strategy through dynamic rebalancing driven by PPO and Oracle agents.
arXiv Detail & Related papers (2025-02-04T11:45:59Z)
- Hierarchical Reinforced Trader (HRT): A Bi-Level Approach for Optimizing Stock Selection and Execution [0.9553307596675155]
We introduce the Hierarchical Reinforced Trader (HRT), a novel trading strategy employing a bi-level Hierarchical Reinforcement Learning framework.
HRT integrates a Proximal Policy Optimization (PPO)-based High-Level Controller (HLC) for strategic stock selection with a Deep Deterministic Policy Gradient (DDPG)-based Low-Level Controller (LLC) tasked with optimizing trade executions to enhance portfolio value.
arXiv Detail & Related papers (2024-10-19T01:29:38Z)
- Deep Reinforcement Learning for Online Optimal Execution Strategies [49.1574468325115]
This paper tackles the challenge of learning non-Markovian optimal execution strategies in dynamic financial markets.
We introduce a novel actor-critic algorithm based on Deep Deterministic Policy Gradient (DDPG).
We show that our algorithm successfully approximates the optimal execution strategy.
arXiv Detail & Related papers (2024-10-17T12:38:08Z)
- Deep Reinforcement Learning for Traveling Purchaser Problems [63.37136587778153]
The traveling purchaser problem (TPP) is an important optimization problem with broad applications. We propose a novel approach based on deep reinforcement learning (DRL), which addresses route construction and purchase planning separately. Experiments on various synthetic TPP instances and the TPPLIB benchmark demonstrate that our DRL-based approach can significantly outperform well-established TPP heuristics.
arXiv Detail & Related papers (2024-04-03T05:32:10Z)
- Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization [9.430129571478629]
We propose a deep learning and hierarchical reinforcement learning architecture to capture market patterns and execute orders from different temporal scales.
Our approach outperforms baselines in terms of VWAP slippage, with an average cost saving of 1.16 basis points compared to the optimal baseline.
arXiv Detail & Related papers (2022-12-11T07:35:26Z)
- Deep Deterministic Portfolio Optimization [0.0]
This work tests reinforcement learning algorithms on conceptually simple, but mathematically non-trivial, trading environments.
We study the deep deterministic policy gradient algorithm and show that such a reinforcement learning agent can successfully recover the essential features of the optimal trading strategies.
arXiv Detail & Related papers (2020-03-13T22:20:21Z)
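Several of the papers above, notably the Hierarchical Reinforced Trader (HRT), decompose trading into a strategic layer (stock selection) and an execution layer (order sizing). A minimal, hypothetical sketch of such a bi-level control loop follows; in HRT the two controllers are learned PPO and DDPG policies, whereas here both are stubbed with simple hand-written rules, and all class names, windows, and thresholds are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class HighLevelController:
    """Strategic layer: picks a direction (+1 buy, -1 sell, 0 hold)."""
    momentum_window: int = 3

    def select(self, prices):
        # Stand-in for a learned policy: trade in the direction of
        # recent price momentum over the lookback window.
        if len(prices) < self.momentum_window + 1:
            return 0
        momentum = prices[-1] - prices[-1 - self.momentum_window]
        return 1 if momentum > 0 else -1 if momentum < 0 else 0

@dataclass
class LowLevelController:
    """Execution layer: turns a direction into a concrete order size."""
    max_shares: int = 100

    def execute(self, direction, cash, price):
        # Stand-in for a learned execution policy: buy as many shares
        # as cash allows, capped at max_shares; no short selling here.
        if direction <= 0:
            return 0
        affordable = int(cash // price)
        return min(affordable, self.max_shares)

prices = [100.0, 101.0, 102.5, 104.0]
hlc, llc = HighLevelController(), LowLevelController()
direction = hlc.select(prices)                          # high-level decision
order = llc.execute(direction, cash=1000.0, price=prices[-1])  # low-level sizing
```

The value of the bi-level split is that each layer can be trained (or replaced) independently against its own reward: portfolio value for the high level, execution cost for the low level.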
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.