Dirichlet policies for reinforced factor portfolios
- URL: http://arxiv.org/abs/2011.05381v3
- Date: Fri, 25 Jun 2021 13:51:07 GMT
- Title: Dirichlet policies for reinforced factor portfolios
- Authors: Eric André and Guillaume Coqueret
- Abstract summary: This article aims to combine factor investing and reinforcement learning (RL).
The agent learns through sequential random allocations which rely on firms' characteristics.
Across a large range of parametric choices, our result indicates that RL-based portfolios are very close to the equally-weighted (1/N) allocation.
- Score: 1.3706331473063877
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This article aims to combine factor investing and reinforcement learning
(RL). The agent learns through sequential random allocations which rely on
firms' characteristics. Using Dirichlet distributions as the driving policy, we
derive closed forms for the policy gradients and analytical properties of the
performance measure. This enables the implementation of REINFORCE methods,
which we perform on a large dataset of US equities. Across a large range of
parametric choices, our result indicates that RL-based portfolios are very
close to the equally-weighted (1/N) allocation. This implies that the agent
learns to be *agnostic* with regard to factors, which can partly be explained
by cross-sectional regressions showing a strong time variation in the
relationship between returns and firm characteristics.
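As a rough illustration of the method described in the abstract, the sketch below implements one REINFORCE update with a Dirichlet policy over long-only portfolio weights. The softplus mapping from firm characteristics to concentration parameters, the learning rate, and the use of the one-period portfolio return as reward are illustrative assumptions, not the authors' exact specification.

```python
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(0)

def softplus(z):
    return np.log1p(np.exp(z))

def reinforce_step(theta, X, r, lr=0.01):
    """One REINFORCE update for a Dirichlet policy over portfolio weights.

    theta : (k,)   linear map from characteristics to concentrations (assumed form)
    X     : (n, k) firm characteristics for n assets
    r     : (n,)   next-period asset returns; the reward is the portfolio return
    """
    scores = X @ theta                       # one score per asset
    alpha = softplus(scores) + 1e-3          # strictly positive concentrations
    w = rng.dirichlet(alpha)                 # sampled long-only weights, sum to 1
    reward = float(w @ r)                    # realised portfolio return

    # Closed-form score function of the Dirichlet density with respect to alpha
    dlog_dalpha = digamma(alpha.sum()) - digamma(alpha) + np.log(w)
    # Chain rule through softplus (derivative is the sigmoid) and the linear map
    dalpha_dscores = 1.0 / (1.0 + np.exp(-scores))
    grad_theta = X.T @ (dlog_dalpha * dalpha_dscores)

    return theta + lr * reward * grad_theta, reward

# Toy usage: 5 assets, 3 characteristics per asset
theta = np.zeros(3)
X = rng.normal(size=(5, 3))
r = rng.normal(0.01, 0.05, size=5)
theta, reward = reinforce_step(theta, X, r)
```

Sampling the weights from a Dirichlet distribution keeps them non-negative and summing to one, which is what makes the closed-form score function above available.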
Related papers
- Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization [75.1240295759264]
We propose an effective framework for Bridging and Modeling Correlations in pairwise data, named BMC.
We increase the consistency and informativeness of the pairwise preference signals through targeted modifications.
We identify that DPO alone is insufficient to model these correlations and capture nuanced variations.
arXiv Detail & Related papers (2024-08-14T11:29:47Z) - Combining Transformer based Deep Reinforcement Learning with Black-Litterman Model for Portfolio Optimization [0.0]
Being model-free, a deep reinforcement learning (DRL) agent learns and makes decisions by interacting with the environment in an unsupervised way.
We propose a hybrid portfolio optimization model combining the DRL agent and the Black-Litterman (BL) model.
Our DRL agent significantly outperforms various comparison portfolio choice strategies and alternative DRL frameworks by at least 42% in terms of accumulated return.
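The Black-Litterman posterior itself is standard and easy to state; a minimal sketch follows. Feeding the view vector q from a DRL agent is the hybrid step suggested by the summary above, and the defaults for delta and tau are illustrative rather than the paper's calibration.

```python
import numpy as np

def black_litterman_posterior(Sigma, w_mkt, P, q, Omega, delta=2.5, tau=0.05):
    """Posterior expected returns from the standard Black-Litterman formula.

    Sigma : (n, n) asset covariance      w_mkt : (n,) market-cap weights
    P     : (v, n) view-picking matrix   q     : (v,) view returns (e.g. from a DRL agent)
    Omega : (v, v) view uncertainty
    """
    pi = delta * Sigma @ w_mkt                   # implied equilibrium returns
    tau_sigma_inv = np.linalg.inv(tau * Sigma)
    omega_inv = np.linalg.inv(Omega)
    A = tau_sigma_inv + P.T @ omega_inv @ P
    b = tau_sigma_inv @ pi + P.T @ omega_inv @ q
    return np.linalg.solve(A, b)                 # blended posterior mean
```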
arXiv Detail & Related papers (2024-02-23T16:01:37Z) - Beyond Expected Return: Accounting for Policy Reproducibility when Evaluating Reinforcement Learning Algorithms [9.649114720478872]
Many applications in Reinforcement Learning (RL) have noise or stochasticity present in the environment.
These uncertainties lead the exact same policy to perform differently, from one roll-out to another.
Common evaluation procedures in RL summarise the consequent return distributions using solely the expected return, which does not account for the spread of the distribution.
Our work defines this spread as policy reproducibility: the ability of a policy to obtain similar performance when rolled out many times, a crucial property in some real-world applications.
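A minimal way to act on this observation is to report a dispersion statistic alongside the expected return. The sketch below uses the standard deviation and interquartile range across roll-outs as illustrative spread measures, not the paper's exact reproducibility metric.

```python
import numpy as np

def evaluate_policy(rollout_fn, n_rollouts=100):
    """Summarise a policy by its expected return and the spread of returns.

    rollout_fn: callable returning the episodic return of one (stochastic) roll-out.
    """
    returns = np.array([rollout_fn() for _ in range(n_rollouts)])
    q25, q75 = np.percentile(returns, [25, 75])
    return {
        "expected_return": returns.mean(),
        "std": returns.std(ddof=1),
        "interquartile_range": q75 - q25,
    }
```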
arXiv Detail & Related papers (2023-12-12T11:22:31Z) - Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization [63.554226552130054]
Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL).
The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario.
We present the Level of Influence (LoI), a metric quantifying the interaction intensity among agents within a given scenario and environment.
arXiv Detail & Related papers (2023-10-11T06:09:26Z) - Truncating Trajectories in Monte Carlo Reinforcement Learning [48.97155920826079]
In Reinforcement Learning (RL), an agent acts in an unknown environment to maximize the expected cumulative discounted sum of an external reward signal.
We propose an a-priori budget allocation strategy that leads to the collection of trajectories of different lengths.
We show that an appropriate truncation of the trajectories can succeed in improving performance.
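One hedged illustration of such an a-priori allocation: split a fixed interaction budget into trajectories of geometrically decreasing length, so that early time steps, which dominate the discounted objective, are sampled more often. This schedule is a plausible instance of the idea, not the paper's specific strategy.

```python
def truncated_schedule(budget, horizon):
    """Allocate a fixed interaction budget to trajectories of decreasing length.

    Example: truncated_schedule(200, 100) -> [100, 50, 25, 12, 6, 3, 1, 1, 1, 1]
    """
    lengths = []
    length = horizon
    while budget >= length >= 1:
        lengths.append(length)
        budget -= length
        length = max(1, length // 2)
    return lengths
```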
arXiv Detail & Related papers (2023-05-07T19:41:57Z) - On Pitfalls of $\textit{RemOve-And-Retrain}$: Data Processing Inequality Perspective [5.8010446129208155]
This study scrutinizes the dependability of the RemOve-And-Retrain (ROAR) procedure, which is prevalently employed for gauging the performance of feature importance estimates.
The insights gleaned from our theoretical foundation and empirical investigations reveal that attributions containing lesser information about the decision function may yield superior results in ROAR benchmarks.
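For orientation, the following is a simplified, global-feature variant of the ROAR loop (the original procedure works per example and per attribution method): mask the supposedly most important features, retrain from scratch, and record the score. The train_fn/eval_fn callables and the mean-fill masking are assumptions of this sketch.

```python
import numpy as np

def roar_curve(train_fn, eval_fn, X_train, y_train, X_test, y_test,
               importance, fractions=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Mask the top-k 'important' features, retrain, and track the score drop."""
    order = np.argsort(importance)[::-1]              # most important first
    scores = {}
    for frac in fractions:
        masked = order[:int(frac * len(order))]
        X_tr, X_te = X_train.copy(), X_test.copy()
        fill = X_train[:, masked].mean(axis=0)        # uninformative replacement
        X_tr[:, masked] = fill
        X_te[:, masked] = fill
        model = train_fn(X_tr, y_train)               # retrain from scratch
        scores[frac] = eval_fn(model, X_te, y_test)
    return scores
```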
arXiv Detail & Related papers (2023-04-26T21:43:42Z) - Compressed Regression over Adaptive Networks [58.79251288443156]
We derive the performance achievable by a network of distributed agents that solve, adaptively and in the presence of communication constraints, a regression problem.
We devise an optimized allocation strategy where the parameters necessary for the optimization can be learned online by the agents.
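A minimal sketch of this setting, assuming a standard adapt-then-combine diffusion LMS step with naive uniform quantization of the exchanged estimates; the paper's actual compression operators and optimized allocation are not reproduced here.

```python
import numpy as np

def diffusion_lms_step(W, X, d, A, mu=0.05, n_bits=4, scale=1.0):
    """Adapt-then-combine diffusion LMS with quantized estimate exchange.

    W : (n_agents, k) current estimates    X : (n_agents, k) regressors
    d : (n_agents,)   local observations   A : (n_agents, n_agents) row-stochastic
                                               combination (network) matrix
    """
    # Adapt: each agent takes a local LMS step on its own data
    err = d - np.einsum("ik,ik->i", X, W)
    psi = W + mu * err[:, None] * X
    # Compress: quantize intermediate estimates before sending them to neighbours
    step = scale / (2 ** (n_bits - 1))
    psi_q = np.round(psi / step) * step
    # Combine: each agent averages the (quantized) estimates of its neighbours
    return A @ psi_q
```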
arXiv Detail & Related papers (2023-04-07T13:41:08Z) - Distributional constrained reinforcement learning for supply chain optimization [0.0]
We introduce Distributional Constrained Policy Optimization (DCPO), a novel approach for reliable constraint satisfaction in reinforcement learning.
We show that DCPO improves the rate at which the RL policy converges and ensures reliable constraint satisfaction by the end of training.
arXiv Detail & Related papers (2023-02-03T13:43:02Z) - Reinforcement Learning with Intrinsic Affinity for Personalized Asset Management [0.0]
We develop a regularization method that ensures that strategies have global intrinsic affinities.
We capitalize on these intrinsic affinities to make our model inherently interpretable.
We demonstrate how RL agents can be trained to orchestrate such individual policies for particular personality profiles and still achieve high returns.
arXiv Detail & Related papers (2022-04-20T04:33:32Z) - Distributional Reinforcement Learning for Multi-Dimensional Reward Functions [91.88969237680669]
We introduce Multi-Dimensional Distributional DQN (MD3QN) to model the joint return distribution from multiple reward sources.
As a by-product of joint distribution modeling, MD3QN can capture the randomness in returns for each source of reward.
In experiments, our method accurately models the joint return distribution in environments with richly correlated reward functions.
arXiv Detail & Related papers (2021-10-26T11:24:23Z) - Implicit Distributional Reinforcement Learning [61.166030238490634]
An implicit distributional actor-critic (IDAC) is built on two deep generator networks (DGNs) and a semi-implicit actor (SIA) powered by a flexible policy distribution.
We observe IDAC outperforms state-of-the-art algorithms on representative OpenAI Gym environments.
arXiv Detail & Related papers (2020-07-13T02:52:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.