Related papers: Regret-Optimized Portfolio Enhancement through Deep Reinforcement Learning and Future Looking Rewards

Regret-Optimized Portfolio Enhancement through Deep Reinforcement Learning and Future Looking Rewards

URL: http://arxiv.org/abs/2502.02619v1
Date: Tue, 04 Feb 2025 11:45:59 GMT
Title: Regret-Optimized Portfolio Enhancement through Deep Reinforcement Learning and Future Looking Rewards
Authors: Daniil Karzanov, Rubén Garzón, Mikhail Terekhov, Caglar Gulcehre, Thomas Raffinot, Marcin Detyniecki,
Abstract summary: This paper introduces a novel agent-based approach for enhancing existing portfolio strategies using Proximal Policy Optimization (PPO)<n>Rather than focusing solely on traditional portfolio construction, our approach aims to improve an already high-performing strategy through dynamic rebalancing driven by PPO and Oracle agents.
Score: 3.9795751586546766
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper introduces a novel agent-based approach for enhancing existing portfolio strategies using Proximal Policy Optimization (PPO). Rather than focusing solely on traditional portfolio construction, our approach aims to improve an already high-performing strategy through dynamic rebalancing driven by PPO and Oracle agents. Our target is to enhance the traditional 60/40 benchmark (60% stocks, 40% bonds) by employing the Regret-based Sharpe reward function. To address the impact of transaction fee frictions and prevent signal loss, we develop a transaction cost scheduler. We introduce a future-looking reward function and employ synthetic data training through a circular block bootstrap method to facilitate the learning of generalizable allocation strategies. We focus on two key evaluation measures: return and maximum drawdown. Given the high stochasticity of financial markets, we train 20 independent agents each period and evaluate their average performance against the benchmark. Our method not only enhances the performance of the existing portfolio strategy through strategic rebalancing but also demonstrates strong results compared to other baselines.

Related papers

DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal [55.13854171147104]
Large Language Models (LLMs) have revolutionized various domains, including natural language processing, data analysis, and software development. We present Dynamic Action Re-Sampling (DARS), a novel inference time compute scaling approach for coding agents. We evaluate our approach on SWE-Bench Lite benchmark, demonstrating that this scaling strategy achieves a pass@k score of 55% with Claude 3.5 Sonnet V2.
arXiv Detail & Related papers (2025-03-18T14:02:59Z)
EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning [69.55982246413046]
We propose explicit policy optimization (EPO) for strategic reasoning. EPO provides strategies in open-ended action space and can be plugged into arbitrary LLM agents to motivate goal-directed behavior. Experiments across social and physical domains demonstrate EPO's ability of long-term goal alignment.
arXiv Detail & Related papers (2025-02-18T03:15:55Z)
From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.<n>We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
Quantum-Inspired Portfolio Optimization In The QUBO Framework [0.0]
A quantum-inspired optimization approach is proposed to study the portfolio optimization aimed at selecting an optimal mix of assets. This research contributes to the growing body of literature on quantum-inspired techniques in finance, demonstrating its potential as a useful tool for asset allocation and portfolio management.
arXiv Detail & Related papers (2024-10-08T11:36:43Z)
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment [66.80143024475635]
We propose VinePPO, a straightforward approach to compute unbiased Monte Carlo-based estimates. We show that VinePPO consistently outperforms PPO and other RL-free baselines across MATH and GSM8K datasets.
arXiv Detail & Related papers (2024-10-02T15:49:30Z)
Deep Reinforcement Learning and Mean-Variance Strategies for Responsible Portfolio Optimization [49.396692286192206]
We study the use of deep reinforcement learning for responsible portfolio optimization by incorporating ESG states and objectives. Our results show that deep reinforcement learning policies can provide competitive performance against mean-variance approaches for responsible portfolio allocation.
arXiv Detail & Related papers (2024-03-25T12:04:03Z)
Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning [42.303733194571905]
We seek to find and automatize an optimal credit card limit adjustment policy by employing reinforcement learning techniques. Our research establishes a conceptual structure for applying reinforcement learning framework to credit limit adjustment.
arXiv Detail & Related papers (2023-06-27T16:10:36Z)
Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization [9.430129571478629]
We propose a deep learning and hierarchical reinforcement learning architecture to capture market patterns and execute orders from different temporal scales. Our approach outperforms baselines in terms of VWAP slippage, with an average cost saving of 1.16 base points compared to the optimal baseline.
arXiv Detail & Related papers (2022-12-11T07:35:26Z)
Asset Allocation: From Markowitz to Deep Reinforcement Learning [2.0305676256390934]
Asset allocation is an investment strategy that aims to balance risk and reward by constantly redistributing the portfolio's assets. We conduct an extensive benchmark study to determine the efficacy and reliability of a number of optimization techniques.
arXiv Detail & Related papers (2022-07-14T14:44:04Z)
Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution. We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z)
Deep Reinforcement Learning for Long-Short Portfolio Optimization [7.131902599861306]
This paper constructs a Deep Reinforcement Learning (DRL) portfolio management framework with short-selling mechanisms conforming to actual trading rules. Key innovations include development of a comprehensive short-selling mechanism in continuous trading that accounts for dynamic evolution of transactions across time periods. Compared to traditional approaches, this model delivers superior risk-adjusted returns while reducing maximum drawdown.
arXiv Detail & Related papers (2020-12-26T16:25:20Z)
Time your hedge with Deep Reinforcement Learning [0.0]
Deep Reinforcement Learning (DRL) can tackle this challenge by creating a dynamic dependency between market information and hedging strategies allocation decisions. We present a realistic and augmented DRL framework that: (i) uses additional contextual information to decide an action, (ii) has a one period lag between observations and actions to account for one day lag turnover of common asset managers to rebalance their hedge, (iii) is fully tested in terms of stability and robustness thanks to a repetitive train test method called anchored walk forward training, similar in spirit to k fold cross validation for time series and (iv) allows managing leverage of our hedging
arXiv Detail & Related papers (2020-09-16T06:43:41Z)
Deep Learning for Portfolio Optimization [5.833272638548154]
Instead of selecting individual assets, we trade Exchange-Traded Funds (ETFs) of market indices to form a portfolio. We compare our method with a wide range of algorithms with results showing that our model obtains the best performance over the testing period.
arXiv Detail & Related papers (2020-05-27T21:28:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.