Explainable Deep Reinforcement Learning for Portfolio Management: An
Empirical Approach
- URL: http://arxiv.org/abs/2111.03995v1
- Date: Sun, 7 Nov 2021 04:23:48 GMT
- Title: Explainable Deep Reinforcement Learning for Portfolio Management: An
Empirical Approach
- Authors: Mao Guan, Xiao-Yang Liu
- Abstract summary: It is challenging to understand a DRL-based trading strategy because of the black-box nature of deep neural networks.
We propose an empirical approach to explain the strategies of DRL agents for the portfolio management task.
- Score: 30.283740528236752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (DRL) has been widely studied in the portfolio
management task. However, it is challenging to understand a DRL-based trading
strategy because of the black-box nature of deep neural networks. In this
paper, we propose an empirical approach to explain the strategies of DRL agents
for the portfolio management task. First, we use a linear model in hindsight as
the reference model, which finds the best portfolio weights assuming the actual
stock returns are known in advance. In particular, we use the coefficients of a
linear model in hindsight as the reference feature weights. Secondly, for DRL
agents, we use integrated gradients to define the feature weights, which are
the coefficients between reward and features under a linear regression model.
Thirdly, we study the prediction power in two cases, single-step prediction and
multi-step prediction. In particular, we quantify the prediction power by
calculating the linear correlations between the feature weights of a DRL agent
and the reference feature weights, and similarly for machine learning methods.
Finally, we evaluate a portfolio management task on Dow Jones 30 constituent
stocks from 01/01/2009 to 09/01/2021. Our approach empirically reveals that a
DRL agent exhibits a stronger multi-step prediction power than machine learning
methods.
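The following is a minimal sketch of the pipeline the abstract describes, under assumptions of ours rather than the authors' code: integrated gradients (approximated here with finite-difference gradients of a toy stand-in for the DRL agent's value function) give per-feature weights, a least-squares fit against realized returns gives the hindsight reference weights, and the prediction power is read off as the linear correlation between the two weight vectors. All names and the toy data are illustrative.

```python
# Sketch only: integrated-gradients feature weights for a DRL agent vs. the
# coefficients of a linear model in hindsight, compared by linear correlation.
# `agent_value_fn` and the toy data below are hypothetical stand-ins.
import numpy as np

def integrated_gradients(value_fn, x, baseline=None, steps=50, eps=1e-4):
    """Riemann-sum approximation of integrated gradients for a scalar model,
    using central finite differences in place of automatic differentiation."""
    x = np.asarray(x, dtype=float)
    baseline = np.zeros_like(x) if baseline is None else np.asarray(baseline, float)
    total_grad = np.zeros_like(x)
    for alpha in np.linspace(0.0, 1.0, steps):
        point = baseline + alpha * (x - baseline)
        for i in range(x.size):
            bump = np.zeros_like(x)
            bump[i] = eps
            total_grad[i] += (value_fn(point + bump) - value_fn(point - bump)) / (2 * eps)
    return (x - baseline) * total_grad / steps

def hindsight_reference_weights(features, realized_returns):
    """Reference feature weights: least-squares coefficients of a linear model
    fit with the actual (hindsight) returns as the target."""
    coef, *_ = np.linalg.lstsq(features, realized_returns, rcond=None)
    return coef

# Toy example: 5 features, 60 observations, and a tanh stand-in for the agent.
rng = np.random.default_rng(0)
features = rng.normal(size=(60, 5))
realized = features @ rng.normal(size=5) + 0.1 * rng.normal(size=60)
agent_value_fn = lambda x: float(np.tanh(x).sum())  # stand-in for the DRL value function
drl_weights = integrated_gradients(agent_value_fn, features.mean(axis=0))
ref_weights = hindsight_reference_weights(features, realized)
print("prediction power (correlation):", np.corrcoef(drl_weights, ref_weights)[0, 1])
```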
Related papers
- Explainable Post hoc Portfolio Management Financial Policy of a Deep Reinforcement Learning agent [44.99833362998488]
We develop a novel Explainable Deep Reinforcement Learning (XDRL) approach for portfolio management.
By executing our methodology, we can interpret, at prediction time, the actions of the agent and assess whether they follow the requisites of an investment policy.
arXiv Detail & Related papers (2024-07-19T17:40:39Z)
- UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning [10.593924216046977]
We first theoretically analyze the overestimation phenomenon caused by the MSE loss and provide a theoretical upper bound on the overestimation error.
Finally, we propose an offline RL algorithm based on an underestimation operator and a diffusion policy model.
arXiv Detail & Related papers (2024-06-05T14:37:42Z)
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
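As a hedged illustration of the DPO step mentioned above (not code from the paper), a minimal numpy version of the standard DPO loss for one preferred/rejected pair is sketched below; the log-probabilities and beta value are placeholders.

```python
# Sketch only: the standard DPO loss for one preferred/rejected pair; the
# log-probabilities and beta below are illustrative placeholders.
import numpy as np

def dpo_loss(logp_policy_win, logp_policy_lose, logp_ref_win, logp_ref_lose, beta=0.1):
    """-log(sigmoid(beta * margin)), where the margin compares the policy's
    log-likelihood gain over a frozen reference model for the preferred vs.
    the rejected response (here, a step-level pair from tree search)."""
    margin = (logp_policy_win - logp_ref_win) - (logp_policy_lose - logp_ref_lose)
    return -np.log(1.0 / (1.0 + np.exp(-beta * margin)))

# Toy numbers: the policy already slightly prefers the better step.
print(dpo_loss(-10.2, -11.0, -10.5, -10.8))
```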
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
- Combining Transformer based Deep Reinforcement Learning with Black-Litterman Model for Portfolio Optimization [0.0]
As a model-free algorithm, a deep reinforcement learning (DRL) agent learns and makes decisions by interacting with the environment in an unsupervised way.
We propose a hybrid portfolio optimization model combining the DRL agent and the Black-Litterman (BL) model.
Our DRL agent significantly outperforms various comparison portfolio choice strategies and alternative DRL frameworks by at least 42% in terms of accumulated return.
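Since the entry above combines a DRL agent with the Black-Litterman model, a minimal sketch of the standard BL posterior expected-return formula may help; the toy assets, view, and tau value are assumptions, not taken from the paper.

```python
# Sketch only: the standard Black-Litterman posterior expected returns,
# E[R] = [(tau*Sigma)^-1 + P^T Omega^-1 P]^-1 [(tau*Sigma)^-1 pi + P^T Omega^-1 Q].
import numpy as np

def black_litterman_posterior(pi, Sigma, P, Q, Omega, tau=0.05):
    """pi: equilibrium returns, Sigma: asset covariance, P/Q: view matrix and
    view returns, Omega: view uncertainty. Returns posterior expected returns."""
    inv = np.linalg.inv
    A = inv(tau * Sigma) + P.T @ inv(Omega) @ P
    b = inv(tau * Sigma) @ pi + P.T @ inv(Omega) @ Q
    return inv(A) @ b

# Two assets and one relative view: "asset 0 outperforms asset 1 by 2%".
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
pi = np.array([0.05, 0.07])
P = np.array([[1.0, -1.0]])
Q = np.array([0.02])
Omega = np.array([[0.001]])
print(black_litterman_posterior(pi, Sigma, P, Q, Omega))
```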
arXiv Detail & Related papers (2024-02-23T16:01:37Z)
- RL$^3$: Boosting Meta Reinforcement Learning via RL inside RL$^2$ [12.111848705677142]
We propose RL$^3$, a hybrid approach that incorporates action-values, learned per task through traditional RL, into the inputs to meta-RL.
We show that RL$^3$ earns greater cumulative reward in the long term than RL$^2$, while maintaining data efficiency in the short term, and generalizes better to out-of-distribution tasks.
arXiv Detail & Related papers (2023-06-28T04:16:16Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method for this benchmark, using unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- An intelligent algorithmic trading based on a risk-return reinforcement learning algorithm [0.0]
This paper proposes a novel portfolio optimization model using an improved deep reinforcement learning algorithm.
The proposed algorithm is based on an actor-critic architecture, in which the main task of the critic network is to learn the distribution of the portfolio's cumulative return.
A multi-process method, Ape-X, is used to accelerate the speed of deep reinforcement learning training.
arXiv Detail & Related papers (2022-08-23T03:20:06Z)
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
- MOPO: Model-based Offline Policy Optimization [183.6449600580806]
Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a large batch of previously collected data.
We show that an existing model-based RL algorithm already produces significant gains in the offline setting.
We propose to modify existing model-based RL methods so that the rewards they train on are artificially penalized by the uncertainty of the dynamics.
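A minimal sketch of the uncertainty-penalized reward idea summarized above; the ensemble-disagreement uncertainty estimate and the penalty coefficient are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch only: a reward artificially penalized by dynamics uncertainty, here
# estimated as disagreement across an ensemble of dynamics models.
import numpy as np

def penalized_reward(reward, next_state_preds, lam=1.0):
    """reward: model-predicted reward; next_state_preds: (n_models, state_dim)
    ensemble predictions of the next state; lam: penalty coefficient."""
    spread = np.linalg.norm(next_state_preds - next_state_preds.mean(axis=0), axis=1)
    return reward - lam * spread.max()

# Three ensemble members that mildly disagree about the next state.
preds = np.array([[1.00, 0.50], [1.05, 0.48], [0.98, 0.55]])
print(penalized_reward(0.3, preds))
```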
arXiv Detail & Related papers (2020-05-27T08:46:41Z)
- MOReL: Model-Based Offline Reinforcement Learning [49.30091375141527]
In offline reinforcement learning (RL), the goal is to learn a highly rewarding policy based solely on a dataset of historical interactions with the environment.
We present MOReL, an algorithmic framework for model-based offline RL.
We show that MOReL matches or exceeds state-of-the-art results in widely studied offline RL benchmarks.
arXiv Detail & Related papers (2020-05-12T17:52:43Z)
- Value-driven Hindsight Modelling [68.658900923595]
Value estimation is a critical component of the reinforcement learning (RL) paradigm.
Model learning can make use of the rich transition structure present in sequences of observations, but this approach is usually not sensitive to the reward function.
We develop an approach for representation learning in RL that sits in between these two extremes.
This provides tractable prediction targets that are directly relevant for a task, and can thus accelerate learning the value function.
arXiv Detail & Related papers (2020-02-19T18:10:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.