From Bandits Model to Deep Deterministic Policy Gradient, Reinforcement
Learning with Contextual Information
- URL: http://arxiv.org/abs/2310.00642v1
- Date: Sun, 1 Oct 2023 11:25:20 GMT
- Title: From Bandits Model to Deep Deterministic Policy Gradient, Reinforcement
Learning with Contextual Information
- Authors: Zhendong Shi, Xiaoli Wei and Ercan E. Kuruoglu
- Abstract summary: In this study, we use two methods to overcome the issue with contextual information.
In order to investigate strategic trading in quantitative markets, we merged the earlier financial trading strategy known as constant proportion portfolio insurance ( CPPI) into deep deterministic policy gradient (DDPG)
The experimental results show that both methods can accelerate the progress of reinforcement learning to obtain the optimal solution.
- Score: 4.42532447134568
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of how to take the right actions to make profits in sequential
process continues to be difficult due to the quick dynamics and a significant
amount of uncertainty in many application scenarios. In such complicated
environments, reinforcement learning (RL), a reward-oriented strategy for
optimum control, has emerged as a potential technique to address this strategic
decision-making issue. However, reinforcement learning also has some
shortcomings that make it unsuitable for solving many financial problems,
excessive resource consumption, and inability to quickly obtain optimal
solutions, making it unsuitable for quantitative trading markets. In this
study, we use two methods to overcome the issue with contextual information:
contextual Thompson sampling and reinforcement learning under supervision which
can accelerate the iterations in search of the best answer. In order to
investigate strategic trading in quantitative markets, we merged the earlier
financial trading strategy known as constant proportion portfolio insurance
(CPPI) into deep deterministic policy gradient (DDPG). The experimental results
show that both methods can accelerate the progress of reinforcement learning to
obtain the optimal solution.
Related papers
- Deep Reinforcement Learning for Online Optimal Execution Strategies [49.1574468325115]
This paper tackles the challenge of learning non-Markovian optimal execution strategies in dynamic financial markets.
We introduce a novel actor-critic algorithm based on Deep Deterministic Policy Gradient (DDPG)
We show that our algorithm successfully approximates the optimal execution strategy.
arXiv Detail & Related papers (2024-10-17T12:38:08Z) - Ensembling Portfolio Strategies for Long-Term Investments: A Distribution-Free Preference Framework for Decision-Making and Algorithms [0.0]
This paper investigates the problem of ensembling multiple strategies for sequential portfolios to outperform individual strategies in terms of long-term wealth.
We introduce a novel framework for decision-making in combining strategies, irrespective of market conditions.
We show results in favor of the proposed strategies, albeit with small tradeoffs in their Sharpe ratios.
arXiv Detail & Related papers (2024-06-05T23:08:57Z) - Risk-reducing design and operations toolkit: 90 strategies for managing
risk and uncertainty in decision problems [65.268245109828]
This paper develops a catalog of such strategies and develops a framework for them.
It argues that they provide an efficient response to decision problems that are seemingly intractable due to high uncertainty.
It then proposes a framework to incorporate them into decision theory using multi-objective optimization.
arXiv Detail & Related papers (2023-09-06T16:14:32Z) - On solving decision and risk management problems subject to uncertainty [91.3755431537592]
Uncertainty is a pervasive challenge in decision and risk management.
This paper develops a systematic understanding of such strategies, determine their range of application, and develop a framework to better employ them.
arXiv Detail & Related papers (2023-01-18T19:16:23Z) - Reinforcement Learning with Stepwise Fairness Constraints [50.538878453547966]
We introduce the study of reinforcement learning with stepwise fairness constraints.
We provide learning algorithms with strong theoretical guarantees in regard to policy optimality and fairness violation.
arXiv Detail & Related papers (2022-11-08T04:06:23Z) - Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution.
We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z) - Time your hedge with Deep Reinforcement Learning [0.0]
Deep Reinforcement Learning (DRL) can tackle this challenge by creating a dynamic dependency between market information and hedging strategies allocation decisions.
We present a realistic and augmented DRL framework that: (i) uses additional contextual information to decide an action, (ii) has a one period lag between observations and actions to account for one day lag turnover of common asset managers to rebalance their hedge, (iii) is fully tested in terms of stability and robustness thanks to a repetitive train test method called anchored walk forward training, similar in spirit to k fold cross validation for time series and (iv) allows managing leverage of our hedging
arXiv Detail & Related papers (2020-09-16T06:43:41Z) - Learning Adaptive Exploration Strategies in Dynamic Environments Through
Informed Policy Regularization [100.72335252255989]
We study the problem of learning exploration-exploitation strategies that effectively adapt to dynamic environments.
We propose a novel algorithm that regularizes the training of an RNN-based policy using informed policies trained to maximize the reward in each task.
arXiv Detail & Related papers (2020-05-06T16:14:48Z) - An Application of Deep Reinforcement Learning to Algorithmic Trading [4.523089386111081]
This scientific research paper presents an innovative approach based on deep reinforcement learning (DRL) to solve the algorithmic trading problem.
It proposes a novel DRL trading strategy so as to maximise the resulting Sharpe ratio performance indicator on a broad range of stock markets.
The training of the resulting reinforcement learning (RL) agent is entirely based on the generation of artificial trajectories from a limited set of stock market historical data.
arXiv Detail & Related papers (2020-04-07T14:57:23Z) - Deep Deterministic Portfolio Optimization [0.0]
This work is to test reinforcement learning algorithms on conceptually simple, but mathematically non-trivial, trading environments.
We study the deep deterministic policy gradient algorithm and show that such a reinforcement learning agent can successfully recover the essential features of the optimal trading strategies.
arXiv Detail & Related papers (2020-03-13T22:20:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.