Deep Reinforcement Learning for Stock Portfolio Optimization
- URL: http://arxiv.org/abs/2012.06325v1
- Date: Wed, 9 Dec 2020 10:19:12 GMT
- Title: Deep Reinforcement Learning for Stock Portfolio Optimization
- Authors: Le Trung Hieu
- Abstract summary: We will formulate the problem such that we can apply Reinforcement Learning for the task properly.
To maintain a realistic assumption about the market, we will incorporate transaction cost and risk factor into the state as well.
We will present the end-to-end solution for the task with Minimum Variance Portfolio for stock subset selection, and Wavelet Transform for extracting multi-frequency data pattern.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Stock portfolio optimization is the process of continually
re-distributing money across a pool of various stocks. In this paper, we will formulate the problem
such that we can apply Reinforcement Learning for the task properly. To
maintain a realistic assumption about the market, we will incorporate
transaction cost and risk factor into the state as well. On top of that, we
will apply various state-of-the-art Deep Reinforcement Learning algorithms for
comparison. Since the action space is continuous, the realistic formulation
was tested under a family of state-of-the-art continuous policy gradient
algorithms: Deep Deterministic Policy Gradient (DDPG), Generalized
Deterministic Policy Gradient (GDPG), and Proximal Policy Optimization (PPO),
of which the first two perform much better than the last. Next, we will
present the end-to-end solution for the task with Minimum Variance Portfolio
Theory for stock subset selection, and Wavelet Transform for extracting
multi-frequency data pattern. Observations and hypotheses about the results
are discussed, as well as possible future research directions.
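The Minimum Variance Portfolio step mentioned in the abstract has a standard closed form for the unconstrained case. A minimal sketch, assuming the covariance is estimated from a matrix of historical returns (the function name and the toy data below are illustrative, not the paper's code):

```python
import numpy as np

def min_variance_weights(returns: np.ndarray) -> np.ndarray:
    """Closed-form minimum-variance weights: w = C^{-1} 1 / (1' C^{-1} 1),
    where C is the sample covariance of asset returns
    (rows = time steps, columns = assets)."""
    cov = np.cov(returns, rowvar=False)
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)   # C^{-1} 1
    return x / x.sum()               # normalize so weights sum to 1

# Toy usage: 250 days of simulated returns for 4 hypothetical assets
rng = np.random.default_rng(0)
rets = rng.normal(0.0005, 0.01, size=(250, 4))
w = min_variance_weights(rets)
print(w)  # weights sum to 1; short positions are allowed in this unconstrained form
```

In this unconstrained form the weights can be negative; a long-only subset selection, as used for picking a stock pool, would add a non-negativity constraint and solve a quadratic program instead.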
Related papers
- Traversing Pareto Optimal Policies: Provably Efficient Multi-Objective Reinforcement Learning [14.260168974085376]
This paper investigates multi-objective reinforcement learning (MORL)
It focuses on learning optimal policies in the presence of multiple reward functions.
Despite MORL's success, there is still a lack of satisfactory understanding of various MORL optimization targets and efficient learning algorithms.
arXiv Detail & Related papers (2024-07-24T17:58:49Z) - Learning Optimal Deterministic Policies with Stochastic Policy Gradients [62.81324245896716]
Policy gradient (PG) methods are successful approaches to deal with continuous reinforcement learning (RL) problems.
In common practice, convergent (hyper)policies are learned only to deploy their deterministic version.
We show how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy.
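The learn-stochastic, deploy-deterministic practice this summary describes can be illustrated with a toy sketch (the 1-D bandit objective, the plain REINFORCE update, and the step sizes below are assumptions for illustration, not that paper's algorithm):

```python
import numpy as np

# Toy 1-D continuous bandit: reward r(a) = -(a - 2)^2, optimum at a = 2.
# We learn a Gaussian policy N(mu, sigma^2) by stochastic policy gradient,
# then deploy only the deterministic mean action.
rng = np.random.default_rng(1)
mu, sigma = 0.0, 1.0              # sigma sets the exploration level
for _ in range(3000):
    a = rng.normal(mu, sigma)     # sample an exploratory action
    r = -(a - 2.0) ** 2
    # REINFORCE estimate of d E[r] / d mu:  r * d log N(a; mu, sigma) / d mu
    mu += 0.01 * r * (a - mu) / sigma**2

deployed_action = mu              # deterministic deployment of the learned policy
print(deployed_action)            # typically close to the optimum 2.0
```

Larger `sigma` speeds up early exploration but inflates the gradient noise near the optimum, which is the sample-complexity/deployment-performance trade-off the summary refers to.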
arXiv Detail & Related papers (2024-05-03T16:45:15Z) - A Theoretical Analysis of Optimistic Proximal Policy Optimization in
Linear Markov Decision Processes [13.466249082564213]
We propose an optimistic variant of PPO for episodic adversarial linear MDPs with full-information feedback.
Compared with existing policy-based algorithms, we achieve the state-of-the-art regret bound in both linear MDPs and adversarial linear MDPs with full information.
arXiv Detail & Related papers (2023-05-15T17:55:24Z) - Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time
Guarantees [56.848265937921354]
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy.
Many algorithms for IRL have an inherently nested structure.
We develop a novel single-loop algorithm for IRL that does not compromise reward estimation accuracy.
arXiv Detail & Related papers (2022-10-04T17:13:45Z) - Nearly Optimal Latent State Decoding in Block MDPs [74.51224067640717]
In episodic Block MDPs, the decision maker has access to rich observations or contexts generated from a small number of latent states.
We are first interested in estimating the latent state decoding function based on data generated under a fixed behavior policy.
We then study the problem of learning near-optimal policies in the reward-free framework.
arXiv Detail & Related papers (2022-08-17T18:49:53Z) - A Reinforcement Learning Approach to the Stochastic Cutting Stock
Problem [0.0]
We propose a formulation of the cutting stock problem as a discounted infinite-horizon decision process.
An optimal solution corresponds to a policy that associates each state with a decision and minimizes the expected total cost.
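The discounted infinite-horizon, expected-total-cost formulation this summary describes can be sketched with generic value iteration on a toy MDP (the states, costs, and transition probabilities below are invented for illustration; the actual cutting-stock dynamics are more involved):

```python
import numpy as np

# Toy discounted-cost MDP: 3 states, 2 actions.
# P[a, s, t] = probability of moving s -> t under action a; c[s, a] = immediate cost.
gamma = 0.9
P = np.array([
    [[0.8, 0.2, 0.0], [0.1, 0.9, 0.0], [0.0, 0.3, 0.7]],   # action 0
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],   # action 1
])
c = np.array([[2.0, 1.0], [1.0, 3.0], [0.5, 0.0]])

# Value iteration on V(s) = min_a [ c(s,a) + gamma * sum_t P(t|s,a) V(t) ]
V = np.zeros(3)
for _ in range(500):
    Q = c + gamma * np.einsum("ast,t->sa", P, V)
    V = Q.min(axis=1)

policy = Q.argmin(axis=1)   # associates each state with a cost-minimizing decision
print(policy, V)
```

The resulting `policy` is exactly the object the summary describes: a map from states to decisions minimizing expected total discounted cost; an RL approach replaces the known `P` and `c` with sampled transitions.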
arXiv Detail & Related papers (2021-09-20T14:47:54Z) - Deep Reinforcement Learning for Optimal Stopping with Application in
Financial Engineering [1.52292571922932]
We employ deep Reinforcement Learning to learn optimal stopping policies in two financial engineering applications: option pricing, and optimal option exercise.
We present for the first time a comprehensive empirical evaluation of the quality of optimal stopping policies identified by three state of the art deep RL algorithms.
arXiv Detail & Related papers (2021-05-19T01:52:04Z) - Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds
Globally Optimal Policy [95.98698822755227]
We make the first attempt to study risk-sensitive deep reinforcement learning under the average reward setting with the variance risk criteria.
We propose an actor-critic algorithm that iteratively and efficiently updates the policy, the Lagrange multiplier, and the Fenchel dual variable.
arXiv Detail & Related papers (2020-12-28T05:02:26Z) - Provably Good Batch Reinforcement Learning Without Great Exploration [51.51462608429621]
Batch reinforcement learning (RL) is important to apply RL algorithms to many high stakes tasks.
Recent algorithms have shown promise but can still be overly optimistic in their expected outcomes.
We show that a small modification to Bellman optimality and evaluation back-up to take a more conservative update can have much stronger guarantees.
arXiv Detail & Related papers (2020-07-16T09:25:54Z) - Provably Efficient Exploration in Policy Optimization [117.09887790160406]
This paper proposes an Optimistic variant of the Proximal Policy Optimization algorithm (OPPO)
OPPO achieves $\tilde{O}(\sqrt{d^2 H^3 T})$ regret.
To the best of our knowledge, OPPO is the first provably efficient policy optimization algorithm that explores.
arXiv Detail & Related papers (2019-12-12T08:40:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.