Deep Reinforcement Learning and Convex Mean-Variance Optimisation for Portfolio Management
- URL: http://arxiv.org/abs/2203.11318v1
- Date: Sun, 13 Feb 2022 10:12:09 GMT
- Title: Deep Reinforcement Learning and Convex Mean-Variance Optimisation for Portfolio Management
- Authors: Ruan Pretorius and Terence van Zyl
- Abstract summary: Reinforcement learning (RL) methods do not rely on explicit forecasts and are better suited for multi-stage decision processes.
Experiments were conducted on three markets in different economies with different overall trends.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Traditional portfolio management methods can incorporate specific investor
preferences but rely on accurate forecasts of asset returns and covariances.
Reinforcement learning (RL) methods do not rely on these explicit forecasts and
are better suited for multi-stage decision processes. To address limitations of
the evaluated research, experiments were conducted on three markets in
different economies with different overall trends. By incorporating specific
investor preferences into our RL models' reward functions, a more comprehensive
comparison could be made to traditional methods in risk-return space.
Transaction costs were also modelled more realistically by including nonlinear
changes introduced by market volatility and trading volume. The results of this
study suggest that there can be an advantage to using RL methods compared to
traditional convex mean-variance optimisation methods under certain market
conditions. Our RL models could significantly outperform traditional
single-period optimisation (SPO) and multi-period optimisation (MPO) models in
upward trending markets, but only up to specific risk limits. In sideways
trending markets, the performance of SPO and MPO models can be closely matched
by our RL models for the majority of the excess risk range tested. The specific
market conditions under which these models could outperform each other
highlight the importance of a more comprehensive comparison of Pareto optimal
frontiers in risk-return space. These frontiers give investors a more granular
view of which models might provide better performance for their specific risk
tolerance or return targets.
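A minimal sketch of the single-period mean-variance optimisation (SPO) baseline described above, written with cvxpy. The risk-aversion parameter, the cost coefficients, and the long-only constraint are illustrative assumptions, not values from the paper; the |trade|^(3/2) term stands in for the volume- and volatility-dependent transaction costs the abstract mentions.

```python
# Hedged sketch of a convex SPO step with a nonlinear transaction-cost term.
# All numeric inputs below are placeholders, not the paper's data.
import cvxpy as cp
import numpy as np

n = 5                                    # number of assets
rng = np.random.default_rng(0)
mu = rng.normal(0.001, 0.002, n)         # forecast one-period returns (placeholder)
A = rng.normal(size=(n, n))
Sigma = A @ A.T / n + 1e-4 * np.eye(n)   # forecast covariance (placeholder, PSD)

w_prev = np.full(n, 1.0 / n)             # current holdings
gamma = 5.0                              # investor risk aversion (assumed)
c_lin, c_32 = 5e-4, 1e-3                 # linear and 3/2-power cost coefficients (assumed)

w = cp.Variable(n)
trade = w - w_prev
# Linear cost plus a |trade|^(3/2) market-impact proxy keeps the problem convex.
t_cost = c_lin * cp.norm1(trade) + c_32 * cp.sum(cp.power(cp.abs(trade), 1.5))
objective = cp.Maximize(mu @ w - gamma * cp.quad_form(w, Sigma) - t_cost)
constraints = [cp.sum(w) == 1, w >= 0]   # fully invested, long-only (assumed)
cp.Problem(objective, constraints).solve()
print("optimal weights:", np.round(w.value, 3))
```

Sweeping `gamma` traces out the Pareto optimal frontier in risk-return space that the paper uses to compare models.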
Related papers
- Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback [64.67540769692074]
Large language models (LLMs) fine-tuned with alignment techniques, such as reinforcement learning from human feedback, have been instrumental in developing some of the most capable AI systems to date.
We introduce an approach called Margin Matching Preference Optimization (MMPO), which incorporates relative quality margins into optimization, leading to improved LLM policies and reward models.
Experiments with both human and AI feedback data demonstrate that MMPO consistently outperforms baseline methods, often by a substantial margin, on popular benchmarks including MT-bench and RewardBench.
arXiv Detail & Related papers (2024-10-04T04:56:11Z)
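One plausible reading of the MMPO idea above, sketched as a DPO-style loss whose target is softened by the per-pair quality margin. The soft-label construction and all names here are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: folding a quality margin into a pairwise preference loss.
import torch
import torch.nn.functional as F

def margin_preference_loss(logp_chosen, logp_rejected,
                           ref_logp_chosen, ref_logp_rejected,
                           margin, beta=0.1, alpha=0.5):
    """logp_*: summed log-probs under the policy; ref_logp_*: under the
    reference model; margin: nonnegative quality gap between the responses."""
    logits = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Large margins push the target preference probability toward 1;
    # near-zero margins keep it near 0.5, a soft, uncertainty-aware label.
    target = torch.sigmoid(alpha * margin)
    return F.binary_cross_entropy_with_logits(logits, target)

# toy usage with scalar batches
lc = torch.tensor([-12.0]); lr = torch.tensor([-15.0])
rc = torch.tensor([-13.0]); rr = torch.tensor([-14.0])
print(margin_preference_loss(lc, lr, rc, rr, margin=torch.tensor([2.0])))
```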
- Optimizing Portfolio with Two-Sided Transactions and Lending: A Reinforcement Learning Framework [0.0]
This study presents a Reinforcement Learning-based portfolio management model tailored for high-risk environments.
We implement the model using the Soft Actor-Critic (SAC) agent with a Convolutional Neural Network with Multi-Head Attention.
Tested over two 16-month periods of varying market volatility, the model significantly outperformed benchmarks.
arXiv Detail & Related papers (2024-08-09T23:36:58Z)
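A hedged sketch of the kind of observation encoder the entry above describes: a small 1-D CNN over each asset's feature history followed by multi-head attention across assets. Layer sizes and shapes are illustrative; a SAC agent (e.g. from a library such as Stable-Baselines3) would consume the resulting embedding.

```python
# Illustrative CNN + multi-head attention feature extractor for a SAC agent.
import torch
import torch.nn as nn

class AssetAttentionEncoder(nn.Module):
    def __init__(self, n_features=4, window=32, d_model=64, n_heads=4):
        super().__init__()
        self.conv = nn.Sequential(            # temporal convolution per asset
            nn.Conv1d(n_features, d_model, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # pool the time dimension away
        )
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):                     # x: (batch, assets, features, time)
        b, a, f, t = x.shape
        h = self.conv(x.reshape(b * a, f, t)).squeeze(-1)  # (b*a, d_model)
        h = h.reshape(b, a, -1)               # one embedding per asset
        out, _ = self.attn(h, h, h)           # assets attend to each other
        return out                            # (batch, assets, d_model)

enc = AssetAttentionEncoder()
obs = torch.randn(2, 10, 4, 32)               # 2 samples, 10 assets
print(enc(obs).shape)                         # torch.Size([2, 10, 64])
```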
- Combining Transformer based Deep Reinforcement Learning with Black-Litterman Model for Portfolio Optimization [0.0]
As a model-free algorithm, a deep reinforcement learning (DRL) agent learns and makes decisions by interacting with the environment in an unsupervised way.
We propose a hybrid portfolio optimization model combining the DRL agent and the Black-Litterman (BL) model.
Our DRL agent significantly outperforms various comparison portfolio choice strategies and alternative DRL frameworks by at least 42% in terms of accumulated return.
arXiv Detail & Related papers (2024-02-23T16:01:37Z)
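A minimal numpy sketch of the Black-Litterman posterior that the hybrid model above blends with DRL. The prior pi, view matrix P, view returns Q, and view uncertainty Omega below are illustrative inputs; in the paper's setup the DRL agent would supply the views.

```python
# Black-Litterman posterior mean: combine an equilibrium prior with views.
import numpy as np

def black_litterman_posterior(pi, Sigma, P, Q, Omega, tau=0.05):
    """Posterior mean of expected returns given prior pi and views P @ mu = Q."""
    inv_tau_sigma = np.linalg.inv(tau * Sigma)
    inv_omega = np.linalg.inv(Omega)
    A = inv_tau_sigma + P.T @ inv_omega @ P
    b = inv_tau_sigma @ pi + P.T @ inv_omega @ Q
    return np.linalg.solve(A, b)

Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
pi = np.array([0.05, 0.07])                   # equilibrium expected returns
P = np.array([[1.0, -1.0]])                   # view: asset 1 beats asset 2 ...
Q = np.array([0.02])                          # ... by 2%
Omega = np.array([[0.01]])                    # view uncertainty
print(black_litterman_posterior(pi, Sigma, P, Q, Omega))
```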
- Deep Hedging with Market Impact [0.20482269513546458]
We propose a novel general market-impact dynamic hedging model based on Deep Reinforcement Learning (DRL).
The optimal policy obtained from the DRL model is analysed using several option hedging simulations and compared to commonly used procedures such as delta hedging.
arXiv Detail & Related papers (2024-02-20T19:08:24Z)
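For context, the delta-hedging baseline the entry above compares against, sketched for a European call under Black-Scholes assumptions: hold Delta = N(d1) units of the underlying and rebalance each step. Parameters are illustrative.

```python
# Classic delta-hedging baseline (ignores market impact, unlike the DRL hedger).
import numpy as np
from scipy.stats import norm

def bs_call_delta(S, K, r, sigma, T):
    """Black-Scholes delta of a European call with time to maturity T."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    return norm.cdf(d1)

S, K, r, sigma = 100.0, 100.0, 0.01, 0.2
for t_left in (1.0, 0.5, 0.1):                # rebalance as maturity approaches
    print(f"T={t_left:.1f}: hold {bs_call_delta(S, K, r, sigma, t_left):.3f} shares")
```

A DRL hedger can deviate from this policy precisely where delta hedging is blind: when the hedge trades themselves move the price.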
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
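Schematically, an uncertainty Bellman equation propagates local epistemic uncertainty the way the ordinary Bellman equation propagates reward, with the discount squared because variances scale quadratically. This is a simplified rendering of the general UBE shape, not the paper's exact equation:

```latex
% U: value uncertainty; u: local epistemic uncertainty from the unknown model.
U(s, a) \;=\; u(s, a) \;+\; \gamma^{2}\,
  \mathbb{E}_{s' \sim \hat{P}(\cdot \mid s, a),\; a' \sim \pi}\!\big[ U(s', a') \big]
```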
- Diffusion Variational Autoencoder for Tackling Stochasticity in Multi-Step Regression Stock Price Prediction [54.21695754082441]
Multi-step stock price prediction over a long-term horizon is crucial for forecasting its volatility.
However, current solutions are mostly designed for single-step, classification-based prediction.
We combine a deep hierarchical variational autoencoder (VAE) with diffusion probabilistic techniques to perform sequence-to-sequence (seq2seq) stock prediction.
Our model is shown to outperform state-of-the-art solutions in terms of its prediction accuracy and variance.
arXiv Detail & Related papers (2023-08-18T16:21:15Z)
- HireVAE: An Online and Adaptive Factor Model Based on Hierarchical and Regime-Switch VAE [113.47287249524008]
It is still an open question to build a factor model that can conduct stock prediction in an online and adaptive setting.
We propose the first deep learning based online and adaptive factor model, HireVAE, at the core of which is a hierarchical latent space that embeds the relationship between the market situation and stock-wise latent factors.
Across four commonly used real stock market benchmarks, the proposed HireVAE demonstrates superior performance in terms of active returns over previous methods.
arXiv Detail & Related papers (2023-06-05T12:58:13Z)
- Can Perturbations Help Reduce Investment Risks? Risk-Aware Stock Recommendation via Split Variational Adversarial Training [44.7991257631318]
We propose a novel Split Variational Adversarial Training (SVAT) method for risk-aware stock recommendation.
By lowering the volatility of the stock recommendation model, SVAT effectively reduces investment risks and outperforms state-of-the-art baselines by more than 30% in terms of risk-adjusted profits.
arXiv Detail & Related papers (2023-04-20T12:10:12Z)
- Bayesian Bilinear Neural Network for Predicting the Mid-price Dynamics in Limit-Order Book Markets [84.90242084523565]
Traditional time-series econometric methods often appear incapable of capturing the true complexity of the multi-level interactions driving the price dynamics.
By adopting a state-of-the-art second-order optimization algorithm, we train a Bayesian bilinear neural network with temporal attention.
By addressing the use of predictive distributions to analyze errors and uncertainties associated with the estimated parameters and model forecasts, we thoroughly compare our Bayesian model with traditional ML alternatives.
arXiv Detail & Related papers (2022-03-07T18:59:54Z)
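A sketch of the bilinear interaction at the heart of the model above: a torch.nn.Bilinear layer scores pairwise interactions between two feature views of a limit-order-book snapshot. The bid/ask split, shapes, and class names are illustrative assumptions; the Bayesian treatment of the weights and the temporal attention are omitted for brevity.

```python
# Illustrative bilinear head for mid-price movement classification.
import torch
import torch.nn as nn

class BilinearLOBHead(nn.Module):
    def __init__(self, d_bid=20, d_ask=20, d_out=3):
        super().__init__()
        # y_k = bid^T W_k ask + b_k, one interaction matrix W_k per output class
        self.bilinear = nn.Bilinear(d_bid, d_ask, d_out)

    def forward(self, bid_feats, ask_feats):
        return self.bilinear(bid_feats, ask_feats)  # logits: down / flat / up

head = BilinearLOBHead()
bid = torch.randn(8, 20)     # batch of bid-side features
ask = torch.randn(8, 20)     # batch of ask-side features
print(head(bid, ask).shape)  # torch.Size([8, 3])
```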
- Deep Learning Statistical Arbitrage [0.0]
We propose a unifying conceptual framework for statistical arbitrage and develop a novel deep learning solution.
We construct arbitrage portfolios of similar assets as residual portfolios from conditional latent asset pricing factors.
We extract the time series signals of these residual portfolios with one of the most powerful machine learning time-series solutions.
arXiv Detail & Related papers (2021-06-08T00:48:25Z)
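A compact numpy sketch of the residual-portfolio construction described above: regress each asset's returns on latent pricing factors and keep the residual as the signal fed to a downstream time-series model. Plain PCA factors are used here as an illustrative stand-in for the paper's conditional latent factors, and the data is synthetic.

```python
# Residual (arbitrage) portfolios: returns orthogonal to latent factors.
import numpy as np

rng = np.random.default_rng(0)
R = rng.normal(0, 0.01, size=(250, 50))        # 250 days x 50 similar assets

# Latent factors via PCA on the demeaned return panel
Rc = R - R.mean(axis=0)
U, S, Vt = np.linalg.svd(Rc, full_matrices=False)
k = 5
F = U[:, :k] * S[:k]                           # factor return series (250 x k)

# Regress asset returns on the factors; the residual is the arbitrage signal
beta, *_ = np.linalg.lstsq(F, Rc, rcond=None)  # factor loadings (k x 50)
residuals = Rc - F @ beta                      # idiosyncratic return series
print(residuals.shape, np.abs(F.T @ residuals).max())  # residuals ~ orthogonal to F
```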
- Bridging the gap between Markowitz planning and deep reinforcement learning [0.0]
This paper shows how Deep Reinforcement Learning techniques can shed new light on portfolio allocation.
The advantages are numerous: (i) by design, DRL maps market conditions directly to actions and should therefore adapt to a changing environment; (ii) DRL does not rely on traditional financial assumptions such as representing risk by variance; (iii) DRL can incorporate additional data, acting as a multi-input method, in contrast to more traditional optimisation methods.
arXiv Detail & Related papers (2020-09-30T04:03:27Z)
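For contrast, a minimal numpy sketch of the Markowitz side of the bridge described above: with unconstrained mean-variance preferences the optimal weights are proportional to Sigma^{-1} mu. The inputs below are illustrative estimates, not data from any of the papers.

```python
# Closed-form (unconstrained) Markowitz mean-variance weights.
import numpy as np

mu = np.array([0.05, 0.07, 0.03])              # estimated expected returns
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.010],
                  [0.004, 0.010, 0.020]])      # estimated return covariance
w = np.linalg.solve(Sigma, mu)                 # proportional to Sigma^{-1} mu
w /= w.sum()                                   # normalise to a fully invested portfolio
print(np.round(w, 3))
```

The fragility of these weights to errors in `mu` and `Sigma` is precisely the forecasting dependence that motivates the DRL alternatives surveyed here.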