Bridging the gap between Markowitz planning and deep reinforcement
learning
- URL: http://arxiv.org/abs/2010.09108v1
- Date: Wed, 30 Sep 2020 04:03:27 GMT
- Title: Bridging the gap between Markowitz planning and deep reinforcement
learning
- Authors: Eric Benhamou, David Saltiel, Sandrine Ungari, Abhishek Mukhopadhyay
- Abstract summary: This paper shows how Deep Reinforcement Learning techniques can shed new light on portfolio allocation.
The advantages are numerous: (i) DRL directly maps market conditions to actions by design and hence should adapt to changing environments, (ii) DRL does not rely on traditional financial risk assumptions, such as risk being represented by variance, (iii) DRL can incorporate additional data and act as a multi-input method, as opposed to more traditional optimization methods.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While researchers in the asset management industry have mostly focused on
financial and risk planning techniques such as the Markowitz efficient frontier,
minimum variance, maximum diversification, or equal risk parity, another
community in machine learning has, in parallel, started working on reinforcement
learning, and more particularly deep reinforcement learning, to solve other
decision-making problems for challenging tasks such as autonomous driving, robot
learning, and, on a more conceptual side, game solving, as with Go. This paper
aims to bridge the gap between these two approaches by showing that Deep
Reinforcement Learning (DRL) techniques can shed new light on portfolio
allocation thanks to a more general optimization setting that casts portfolio
allocation as an optimal control problem: not just a one-step optimization, but
rather a continuous control optimization with a delayed reward. The advantages
are numerous: (i) DRL directly maps market conditions to actions by design and
hence should adapt to changing environments, (ii) DRL does not rely on
traditional financial risk assumptions, such as risk being represented by
variance, (iii) DRL can incorporate additional data and act as a multi-input
method, as opposed to more traditional optimization methods. We present
encouraging experimental results using convolutional networks.
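To make the one-step vs. sequential contrast concrete, here is a minimal, self-contained sketch on synthetic data (our own illustration, not the paper's implementation; names such as `episode_reward` and the random-search loop are ours, standing in for an actual DRL training algorithm such as policy gradients):

```python
# Contrast: one-step Markowitz optimization vs. a sequential, delayed-reward
# control view of portfolio allocation. All data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n_assets, n_steps = 4, 250
returns = rng.normal(0.0004, 0.01, size=(n_steps, n_assets))  # daily returns

# --- One-step Markowitz view: minimum-variance weights, w = S^-1 1 / (1' S^-1 1)
cov = np.cov(returns, rowvar=False)
ones = np.ones(n_assets)
w_minvar = np.linalg.solve(cov, ones)
w_minvar /= w_minvar.sum()
print("min-variance weights:", np.round(w_minvar, 3))

# --- Sequential control view: a state -> weights policy, rewarded at the end
def policy(state, theta):
    """Map a summary of recent market conditions to long-only weights (softmax)."""
    logits = theta @ state
    z = np.exp(logits - logits.max())
    return z / z.sum()

def episode_reward(theta, window=20):
    """Delayed reward: terminal log-wealth of the whole trajectory."""
    wealth = 1.0
    for t in range(window, n_steps):
        state = returns[t - window:t].mean(axis=0)  # crude market summary
        w = policy(state, theta)
        wealth *= 1.0 + w @ returns[t]
    return np.log(wealth)

# Naive random search stands in for a DRL training loop (e.g. policy gradients).
best_theta, best_r = None, -np.inf
for _ in range(200):
    theta = rng.normal(size=(n_assets, n_assets))
    r = episode_reward(theta)
    if r > best_r:
        best_theta, best_r = theta, r
print("best terminal log-wealth found:", round(best_r, 4))
```

The closed-form weights answer a single static question, whereas the policy is judged only by the terminal log-wealth of the whole trajectory, which is exactly the delayed-reward, continuous-control framing the abstract describes.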
Related papers
- Enhancing Spectrum Efficiency in 6G Satellite Networks: A GAIL-Powered Policy Learning via Asynchronous Federated Inverse Reinforcement Learning [67.95280175998792]
A novel generative adversarial imitation learning (GAIL)-powered policy learning approach is proposed for optimizing beamforming, spectrum allocation, and remote user equipment (RUE) association.
We employ inverse RL (IRL) to automatically learn reward functions without manual tuning (a toy sketch of this discriminator-based reward idea follows below).
We show that the proposed MA-AL method outperforms traditional RL approaches, achieving a 14.6% improvement in convergence and reward value.
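As a toy illustration of the GAIL/IRL idea above (our sketch, unrelated to the paper's actual satellite-network setup; the arrays are random stand-in state-action features), a discriminator is trained to separate expert samples from policy samples, and its output is turned into a learned reward, so no reward function is tuned by hand:

```python
# GAIL-style learned reward: train a logistic discriminator D to tell expert
# transitions from policy transitions, then reward the policy with -log(1 - D).
import numpy as np

rng = np.random.default_rng(1)
dim = 6                                      # state-action feature dimension
expert = rng.normal(1.0, 1.0, (500, dim))    # stand-in expert (s, a) features
policy = rng.normal(0.0, 1.0, (500, dim))    # stand-in current-policy features

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

w, b = np.zeros(dim), 0.0
for _ in range(500):                         # gradient ascent on the log-likelihood
    for x, y in ((expert, 1.0), (policy, 0.0)):
        p = sigmoid(x @ w + b)
        w += 0.1 * x.T @ (y - p) / len(x)
        b += 0.1 * float(np.mean(y - p))

# Learned reward: higher when the discriminator mistakes policy data for expert.
reward = -np.log(1.0 - sigmoid(policy @ w + b) + 1e-8)
print("mean learned reward on policy batch:", round(float(reward.mean()), 3))
```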
arXiv Detail & Related papers (2024-09-27T13:05:02Z) - To Switch or Not to Switch? Balanced Policy Switching in Offline Reinforcement Learning [2.951820152291149]
In several decision problems, one faces the possibility of policy switching, which incurs a non-negligible cost.
We propose a novel strategy for balancing between the gain and the cost of switching in a flexible and principled way.
We establish fundamental properties and design a Net Actor-Critic algorithm for the proposed novel switching formulation.
arXiv Detail & Related papers (2024-07-01T22:24:31Z) - Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback [58.049113055986375]
We develop a single-stage approach named Alignment with Integrated Human Feedback (AIHF) to train reward models and the policy.
The proposed approach admits a suite of efficient algorithms, which can easily reduce to, and leverage, popular alignment algorithms.
We demonstrate the efficiency of the proposed solutions with extensive experiments involving alignment problems in LLMs and robotic control problems in MuJoCo.
arXiv Detail & Related papers (2024-06-11T01:20:53Z) - Deep Reinforcement Learning and Mean-Variance Strategies for Responsible Portfolio Optimization [49.396692286192206]
We study the use of deep reinforcement learning for responsible portfolio optimization by incorporating ESG states and objectives.
Our results show that deep reinforcement learning policies can provide competitive performance against mean-variance approaches for responsible portfolio allocation.
arXiv Detail & Related papers (2024-03-25T12:04:03Z) - Learning Constrained Optimization with Deep Augmented Lagrangian Methods [54.22290715244502]
A machine learning (ML) model is trained to emulate a constrained optimization solver.
This paper proposes an alternative approach, in which the ML model is trained to predict dual solution estimates directly.
It enables an end-to-end training scheme in which the dual objective serves as the loss function and solution estimates are driven toward primal feasibility, emulating a Dual Ascent method (a minimal sketch follows below).
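For reference, here is a minimal sketch of the Dual Ascent scheme this emulates (our illustration on a generic equality-constrained quadratic program, not the paper's code): the primal estimate minimizes the Lagrangian at the current duals, and the duals climb the dual objective along the constraint residual.

```python
# Dual ascent on: minimize 1/2 x'Qx + c'x  subject to  Ax = b.
# Each iteration: x = argmin_x L(x, lam) in closed form, then lam moves
# along the residual Ax - b (the gradient of the dual objective).
import numpy as np

Q = np.array([[3.0, 0.5], [0.5, 2.0]])   # positive definite
c = np.array([-1.0, 1.0])
A = np.array([[1.0, 1.0]])               # single constraint: x1 + x2 = 1
b = np.array([1.0])

lam = np.zeros(1)
alpha = 0.5                               # dual step size
for _ in range(200):
    x = np.linalg.solve(Q, -(c + A.T @ lam))  # argmin of the Lagrangian
    lam = lam + alpha * (A @ x - b)           # dual ascent step
print("x:", np.round(x, 4), " constraint residual:", np.round(A @ x - b, 6))
```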
arXiv Detail & Related papers (2024-03-06T04:43:22Z) - A Learnheuristic Approach to A Constrained Multi-Objective Portfolio
Optimisation Problem [0.0]
This paper studies multi-objective portfolio optimisation.
It aims to maximise the expected return while minimising the risk for a given target rate of return (the underlying mean-variance trade-off is written out below).
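For reference, a standard textbook form of this constrained mean-variance trade-off (our notation, not necessarily the paper's exact model: $w$ are portfolio weights, $\mu$ expected returns, $\Sigma$ the return covariance matrix, $r$ the target return) is:

```latex
\min_{w}\; w^{\top} \Sigma\, w
\quad \text{s.t.} \quad
\mu^{\top} w \ge r, \qquad
\mathbf{1}^{\top} w = 1, \qquad
w \ge 0 .
```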
arXiv Detail & Related papers (2023-04-13T17:05:45Z) - Learning to Optimize for Reinforcement Learning [58.01132862590378]
Reinforcement learning (RL) is essentially different from supervised learning, and in practice existing learned optimizers do not work well even in simple RL tasks.
The agent-gradient distribution is non-i.i.d. (not independent and identically distributed), leading to inefficient meta-training.
We show that, although only trained on toy tasks, our learned optimizer can generalize to unseen complex tasks in Brax.
arXiv Detail & Related papers (2023-02-03T00:11:02Z) - Reinforcement Learning from Diverse Human Preferences [68.4294547285359]
This paper develops a method for crowd-sourcing preference labels and learning from diverse human preferences.
The proposed method is tested on a variety of tasks in DMControl and Meta-world.
It has shown consistent and significant improvements over existing preference-based RL algorithms when learning from diverse feedback.
arXiv Detail & Related papers (2023-01-27T15:18:54Z) - Multi-fidelity reinforcement learning framework for shape optimization [0.8258451067861933]
We introduce a controlled transfer learning framework that leverages a multi-fidelity simulation setting.
Our strategy is deployed for an airfoil shape optimization problem at high Reynolds numbers.
Our results demonstrate this framework's applicability to other scientific DRL scenarios.
arXiv Detail & Related papers (2022-02-22T20:44:04Z) - Deep Reinforcement Learning and Convex Mean-Variance Optimisation for
Portfolio Management [0.0]
Reinforcement learning (RL) methods do not rely on explicit forecasts and are better suited for multi-stage decision processes.
Experiments were conducted on three markets in different economies with different overall trends.
arXiv Detail & Related papers (2022-02-13T10:12:09Z) - Deep Risk Model: A Deep Learning Solution for Mining Latent Risk Factors
to Improve Covariance Matrix Estimation [8.617532047238461]
We propose a deep learning solution to effectively "design" risk factors with neural networks.
Our method obtains 1.9% higher explained variance, as measured by $R^2$, and also reduces the risk of a global minimum-variance portfolio.
arXiv Detail & Related papers (2021-07-12T05:30:50Z)