Mean--Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study
- URL: http://arxiv.org/abs/2412.16175v1
- Date: Sun, 08 Dec 2024 15:31:10 GMT
- Title: Mean--Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study
- Authors: Yilie Huang, Yanwei Jia, Xun Yu Zhou
- Abstract summary: We study continuous-time mean--variance portfolio selection in markets where stock prices are diffusion processes driven by observable factors.
We present a general data-driven RL algorithm that learns the pre-committed investment strategy directly without attempting to learn or estimate the market coefficients.
The results demonstrate that the continuous-time RL strategies are consistently among the best, especially in a volatile bear market.
- Score: 10.404992912881601
- License:
- Abstract: We study continuous-time mean--variance portfolio selection in markets where stock prices are diffusion processes driven by observable factors that are also diffusion processes, yet the coefficients of these processes are unknown. Based on the recently developed reinforcement learning (RL) theory for diffusion processes, we present a general data-driven RL algorithm that learns the pre-committed investment strategy directly without attempting to learn or estimate the market coefficients. For multi-stock Black--Scholes markets without factors, we further devise a baseline algorithm and prove its performance guarantee by deriving a sublinear regret bound in terms of Sharpe ratio. For performance enhancement and practical implementation, we modify the baseline algorithm into four variants, and carry out an extensive empirical study to compare their performance, in terms of a host of common metrics, with a large number of widely used portfolio allocation strategies on S&P 500 constituents. The results demonstrate that the continuous-time RL strategies are consistently among the best, especially in a volatile bear market, and decisively outperform the model-based continuous-time counterparts by significant margins.
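To make the model-free idea concrete, below is a minimal, self-contained sketch in the spirit of the exploratory (entropy-regularized) continuous-time RL framework the paper builds on, where optimal exploratory policies are Gaussian: a Gaussian policy trades a single simulated Black--Scholes stock whose coefficients are hidden from the learner, and only the Lagrange multiplier enforcing the mean constraint is updated by dual ascent. The coefficients `k`, `lam`, and `alpha` are illustrative assumptions, and the actor-critic updates of the policy itself are omitted; this is a sketch, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical market (coefficients are HIDDEN from the learner) ---
mu, sigma, r = 0.08, 0.20, 0.02   # drift, volatility, risk-free rate
dt, T = 1 / 252, 1.0              # daily steps over a one-year horizon
z_target = 1.05                   # target terminal wealth (mean constraint)

# --- Learner: Gaussian exploratory policy, frozen for brevity ---
# pi(a | t, x) = N(mean = -k * (x - w), var = lam); the true exploratory
# policy has time-dependent variance, which this sketch omits.
k, lam = 0.5, 0.04   # assumed policy-mean coefficient and exploration variance
w = 1.0              # Lagrange multiplier tied to the wealth target
alpha = 0.05         # dual-ascent step size

for _ in range(200):
    x = 1.0  # initial wealth
    for _ in range(int(T / dt)):
        # Dollar allocation to the stock, sampled from the exploratory policy
        a = rng.normal(-k * (x - w), np.sqrt(lam))
        # Wealth update under the market dynamics (unknown to the learner)
        dW = rng.normal(0.0, np.sqrt(dt))
        x += r * x * dt + a * ((mu - r) * dt + sigma * dW)
    # Dual ascent: nudge w so terminal wealth tracks the target on average
    w -= alpha * (x - z_target)

print(f"learned Lagrange multiplier w ~ {w:.3f}")
```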
Related papers
- Uncertainty quantification for Markov chains with application to temporal difference learning [63.49764856675643]
We develop novel high-dimensional concentration inequalities and Berry-Esseen bounds for vector- and matrix-valued functions of Markov chains.
We analyze the TD learning algorithm, a widely used method for policy evaluation in reinforcement learning.
arXiv Detail & Related papers (2025-02-19T15:33:55Z)
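For context on the entry above, a minimal tabular TD(0) policy-evaluation sketch follows; the `env_step` and `policy` interfaces are assumed for illustration and are not from the paper, whose contribution is the non-asymptotic analysis of such iterates rather than the algorithm itself.

```python
import numpy as np

def td0_evaluate(env_step, n_states, policy, gamma=0.95,
                 alpha=0.1, episodes=500, rng=None):
    """Tabular TD(0) policy evaluation: V(s) <- V(s) + alpha * TD-error.

    `env_step(s, a, rng) -> (s_next, reward, done)` and `policy(s, rng) -> a`
    are assumed interfaces, not the paper's API.
    """
    rng = rng or np.random.default_rng(0)
    V = np.zeros(n_states)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            s_next, reward, done = env_step(s, policy(s, rng), rng)
            target = reward + (0.0 if done else gamma * V[s_next])
            V[s] += alpha * (target - V[s])   # TD(0) update
            s = s_next
    return V
```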
- Risk-averse policies for natural gas futures trading using distributional reinforcement learning [0.0]
This paper studies the effectiveness of three distributional RL algorithms for natural gas futures trading.
To the best of our knowledge, these algorithms have never been applied in a trading context.
We show that training C51 and IQN to maximize CVaR produces risk-sensitive policies with adjustable risk aversion.
arXiv Detail & Related papers (2025-01-08T11:11:25Z)
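A small sketch of the CVaR objective mentioned above: given quantile estimates of the return distribution (as an IQN-style critic would produce), CVaR at level alpha is the mean of the worst alpha-fraction of outcomes, and a risk-averse policy can rank actions by this criterion. The quantile arrays below are synthetic placeholders, not outputs of the paper's models.

```python
import numpy as np

def cvar_from_quantiles(quantiles, alpha=0.05):
    """CVaR_alpha of a return distribution represented by quantile
    estimates: the mean of the worst alpha-fraction of outcomes."""
    q = np.sort(np.asarray(quantiles, dtype=float))
    n_tail = max(1, int(np.ceil(alpha * len(q))))
    return q[:n_tail].mean()

# Toy usage: choose the action whose return distribution has the best CVaR
action_quantiles = {  # hypothetical critic outputs
    "long": np.random.default_rng(1).normal(0.02, 0.05, 64),
    "flat": np.zeros(64),
}
best = max(action_quantiles,
           key=lambda a: cvar_from_quantiles(action_quantiles[a]))
print(best)  # a risk-averse choice under these synthetic numbers
```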
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment [66.80143024475635]
We propose VinePPO, a straightforward approach to compute unbiased Monte Carlo-based value estimates.
We show that VinePPO consistently outperforms PPO and other RL-free baselines across MATH and GSM8K datasets.
arXiv Detail & Related papers (2024-10-02T15:49:30Z)
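The Monte Carlo value estimator underlying the entry above can be sketched in a few lines: restart several independent rollouts from a state and average their returns, instead of querying a learned value network. The `rollout` interface is an assumption for illustration, not VinePPO's actual API.

```python
import random

def mc_value(state, rollout, n_samples=8, rng=None):
    """Unbiased Monte Carlo value estimate: average the returns of several
    independent rollouts restarted from `state`, in place of a learned
    value network. `rollout(state, rng) -> return` is an assumed interface."""
    rng = rng or random.Random(0)
    return sum(rollout(state, rng) for _ in range(n_samples)) / n_samples
```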
- Mean-Variance Portfolio Selection in Long-Term Investments with Unknown Distribution: Online Estimation, Risk Aversion under Ambiguity, and Universality of Algorithms [0.0]
This paper adopts a perspective in which market data are revealed gradually and continuously over time.
The performance of the proposed strategies is guaranteed under specific market conditions.
In stationary and ergodic markets, the so-called Bayesian strategy, which utilizes true conditional distributions based on past market observations, almost surely performs no better than the proposed strategies in terms of empirical utility, Sharpe ratio, or growth rate, even though the proposed strategies do not rely on conditional distributions.
arXiv Detail & Related papers (2024-06-19T12:11:42Z)
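The "data revealed gradually over time" viewpoint above pairs naturally with streaming moment estimates. Below is a generic Welford-style online estimator of the mean vector and covariance matrix of returns, offered as a sketch of online estimation in general rather than the paper's specific estimator.

```python
import numpy as np

class OnlineMoments:
    """Streaming estimates of the mean vector and covariance matrix of
    asset returns, updated one observation at a time (Welford's method)."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros((dim, dim))  # sum of outer-product deviations

    def update(self, x):
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += np.outer(delta, x - self.mean)

    def cov(self):
        # Sample covariance; undefined until at least two observations
        return self.m2 / (self.n - 1) if self.n > 1 else np.full_like(self.m2, np.nan)
```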
- Statistical arbitrage in multi-pair trading strategy based on graph clustering algorithms in US equities market [0.0]
The study seeks to develop an effective statistical-arbitrage strategy within a novel framework based on graph clustering algorithms.
The study seeks to provide an integrated approach to optimal signal detection and risk management.
arXiv Detail & Related papers (2024-06-15T17:25:32Z)
- Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-02-22T04:10:57Z)
- An Ensemble Method of Deep Reinforcement Learning for Automated Cryptocurrency Trading [16.78239969166596]
We propose an ensemble method to improve the generalization performance of trading strategies trained by deep reinforcement learning algorithms.
Our proposed ensemble method improves the out-of-sample performance compared with the benchmarks of a deep reinforcement learning strategy and a passive investment strategy.
arXiv Detail & Related papers (2023-07-27T04:00:09Z)
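A minimal sketch of policy ensembling in the spirit of the entry above: average the portfolio weights proposed by independently trained agents and renormalize. The combination rule and the `agent(state) -> weights` interface are assumptions; the paper's ensemble may combine strategies differently.

```python
import numpy as np

def ensemble_weights(agents, state):
    """Combine the portfolio weights proposed by several independently
    trained RL agents by simple averaging, then renormalize.
    `agent(state) -> weight vector` is an assumed interface."""
    w = np.mean([agent(state) for agent in agents], axis=0)
    w = np.clip(w, 0.0, None)  # long-only for simplicity
    s = w.sum()
    return w / s if s > 0 else np.full_like(w, 1.0 / len(w))
```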
- Deep Reinforcement Learning and Convex Mean-Variance Optimisation for Portfolio Management [0.0]
Reinforcement learning (RL) methods do not rely on explicit forecasts and are better suited for multi-stage decision processes.
Experiments were conducted on three markets in different economies with different overall trends.
arXiv Detail & Related papers (2022-02-13T10:12:09Z)
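For reference, the convex mean-variance step that such hybrid approaches build on has a textbook closed form under a budget constraint (shorting allowed). The sketch below is that standard solution, not necessarily the paper's exact formulation.

```python
import numpy as np

def mv_weights(mu, cov, gamma=5.0):
    """Closed-form mean-variance weights: maximize mu'w - (gamma/2) w'Sw
    subject to the budget constraint 1'w = 1 (shorting allowed)."""
    mu, cov = np.asarray(mu, float), np.asarray(cov, float)
    ones = np.ones(len(mu))
    inv_mu = np.linalg.solve(cov, mu)
    inv_one = np.linalg.solve(cov, ones)
    lam = (ones @ inv_mu - gamma) / (ones @ inv_one)  # Lagrange multiplier
    return (inv_mu - lam * inv_one) / gamma

w = mv_weights([0.08, 0.05], [[0.04, 0.01], [0.01, 0.02]])
print(w, w.sum())  # weights sum to 1 by construction
```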
- Deep Q-Learning Market Makers in a Multi-Agent Simulated Stock Market [58.720142291102135]
This paper focuses precisely on the study of these market makers' strategies from an agent-based perspective.
We propose the application of Reinforcement Learning (RL) for the creation of intelligent market makers in simulated stock markets.
arXiv Detail & Related papers (2021-12-08T14:55:21Z)
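The core update behind such Q-learning market makers (with a neural network replacing the table in the deep version) is the standard one-step Q-learning rule, sketched below; the state and action encodings are left abstract and are not the paper's.

```python
import numpy as np

def q_learning_step(Q, s, a, reward, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning update:
    Q[s, a] <- Q[s, a] + alpha * (r + gamma * max_a' Q[s', a'] - Q[s, a])."""
    td_target = reward + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```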
- ARISE: ApeRIodic SEmi-parametric Process for Efficient Markets without Periodogram and Gaussianity Assumptions [91.3755431537592]
We present the ApeRIodic SEmi-parametric (ARISE) process for investigating efficient markets.
The ARISE process is formulated as an infinite sum of some known processes and employs aperiodic spectrum estimation.
In practice, we apply the ARISE function to identify the efficiency of real-world markets.
arXiv Detail & Related papers (2021-11-08T03:36:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.