An intelligent algorithmic trading based on a risk-return reinforcement
learning algorithm
- URL: http://arxiv.org/abs/2208.10707v1
- Date: Tue, 23 Aug 2022 03:20:06 GMT
- Title: An intelligent algorithmic trading based on a risk-return reinforcement
learning algorithm
- Authors: Boyi Jin
- Abstract summary: This paper proposes a novel portfolio optimization model using an improved deep reinforcement learning algorithm.
The proposed algorithm is based on an actor-critic architecture, in which the main task of the critic network is to learn the distribution of the portfolio cumulative return.
A multi-process method, Ape-X, is used to accelerate deep reinforcement learning training.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel portfolio optimization model using an
improved deep reinforcement learning algorithm. The objective function of the
optimization model is the weighted sum of the expectation and the value at
risk (VaR) of the portfolio cumulative return. The proposed algorithm is based on
an actor-critic architecture, in which the main task of the critic network is to
learn the distribution of the portfolio cumulative return using quantile
regression, while the actor network outputs the optimal portfolio weights by
maximizing this objective function. Meanwhile, we exploit a
linear transformation function to enable asset short selling. Finally, a
multi-process method, Ape-X, is used to accelerate deep
reinforcement learning training. To validate the proposed approach, we conduct
backtesting on two representative portfolios and observe that the proposed
model outperforms the benchmark strategies.
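The abstract does not include an implementation, but the objective it describes can be illustrated with a short sketch. The snippet below is a minimal, hedged example of how a weighted expectation-plus-VaR objective might be evaluated from the quantile estimates produced by a quantile-regression critic, together with one possible linear transformation that permits short positions. The trade-off weight `lam`, the quantile grid `taus`, and the gross-exposure normalization are illustrative assumptions, not details taken from the paper; in the paper itself the return distribution would come from the learned critic rather than the random toy values used here.

```python
import numpy as np

def objective(quantiles: np.ndarray, taus: np.ndarray,
              alpha: float = 0.05, lam: float = 0.5) -> float:
    """Weighted sum of the expected cumulative return and its value at risk.

    quantiles : sorted critic estimates of the return distribution at levels `taus`
    alpha     : VaR level (e.g. 0.05 for the 5% quantile) -- illustrative choice
    lam       : illustrative trade-off weight between expectation and VaR
    """
    expectation = quantiles.mean()                 # approximates E[R] over the quantile grid
    var_alpha = np.interp(alpha, taus, quantiles)  # approximates the alpha-quantile (VaR) of R
    return lam * expectation + (1.0 - lam) * var_alpha

def short_selling_weights(raw: np.ndarray) -> np.ndarray:
    """One possible linear normalization that allows negative (short) weights.

    Scales unconstrained actor outputs so the gross exposure (sum of absolute
    weights) equals 1; the exact transformation used in the paper may differ.
    """
    gross = np.abs(raw).sum()
    return raw / gross if gross > 0 else raw

# Toy usage with a hypothetical 32-quantile critic output and 3 assets.
taus = (np.arange(32) + 0.5) / 32
quantiles = np.sort(np.random.default_rng(0).normal(0.02, 0.1, size=32))
weights = short_selling_weights(np.array([0.8, -0.3, 0.5]))
print(objective(quantiles, taus), weights)
```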
Related papers
- VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment [66.80143024475635]
We propose VinePPO, a straightforward approach to compute unbiased Monte Carlo-based estimates.
We show that VinePPO consistently outperforms PPO and other RL-free baselines across MATH and GSM8K datasets.
arXiv Detail & Related papers (2024-10-02T15:49:30Z) - Reinforcement Learning as an Improvement Heuristic for Real-World Production Scheduling [0.0]
One promising approach is to train an RL agent as an improvement heuristic: starting from a suboptimal solution, the agent iteratively applies small changes to improve it.
We apply this approach to a real-world multiobjective production scheduling problem.
We benchmarked our approach against other approaches using real data from our industry partner, demonstrating its superior performance.
arXiv Detail & Related papers (2024-09-18T12:48:56Z) - Let's reward step by step: Step-Level reward model as the Navigators for
Reasoning [64.27898739929734]
Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase.
We propose a greedy search algorithm that employs the step-level feedback from PRM to optimize the reasoning pathways explored by LLMs.
To explore the versatility of our approach, we develop a novel method to automatically generate a step-level reward dataset for coding tasks and observe similarly improved performance in code generation.
arXiv Detail & Related papers (2023-10-16T05:21:50Z) - Adversarial Style Transfer for Robust Policy Optimization in Deep
Reinforcement Learning [13.652106087606471]
This paper proposes an algorithm that aims to improve generalization for reinforcement learning agents by removing overfitting to confounding features.
The policy network updates its parameters to minimize the effect of adversarial style-transfer perturbations, thus staying robust while maximizing the expected future reward.
We evaluate our approach on Procgen and Distracting Control Suite for generalization and sample efficiency.
arXiv Detail & Related papers (2023-08-29T18:17:35Z) - Value-Distributional Model-Based Reinforcement Learning [59.758009422067]
Quantifying uncertainty about a policy's long-term performance is important to solve sequential decision-making tasks.
We study the problem from a model-based Bayesian reinforcement learning perspective.
We propose Epistemic Quantile-Regression (EQR), a model-based algorithm that learns a value distribution function.
arXiv Detail & Related papers (2023-08-12T14:59:19Z) - Truncating Trajectories in Monte Carlo Reinforcement Learning [48.97155920826079]
In Reinforcement Learning (RL), an agent acts in an unknown environment to maximize the expected cumulative discounted sum of an external reward signal.
We propose an a-priori budget allocation strategy that leads to the collection of trajectories of different lengths.
We show that an appropriate truncation of the trajectories can succeed in improving performance.
arXiv Detail & Related papers (2023-05-07T19:41:57Z) - Representation Learning with Multi-Step Inverse Kinematics: An Efficient
and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z) - Proximal Deterministic Policy Gradient [20.951797549505986]
We introduce two techniques to improve off-policy Reinforcement Learning (RL) algorithms.
We exploit the two value functions commonly employed in state-of-the-art off-policy algorithms to provide an improved action value estimate.
We demonstrate significant performance improvement over state-of-the-art algorithms on standard continuous-control RL benchmarks.
arXiv Detail & Related papers (2020-08-03T10:19:59Z) - Model-based Adversarial Meta-Reinforcement Learning [38.28304764312512]
We propose Model-based Adversarial Meta-Reinforcement Learning (AdMRL).
AdMRL aims to minimize the worst-case sub-optimality gap across all tasks in a family of tasks.
We evaluate our approach on several continuous control benchmarks and demonstrate its efficacy in the worst-case performance over all tasks.
arXiv Detail & Related papers (2020-06-16T02:21:49Z) - Model-Augmented Actor-Critic: Backpropagating through Paths [81.86992776864729]
Current model-based reinforcement learning approaches use the model simply as a learned black-box simulator.
We show how to make more effective use of the model by exploiting its differentiability.
arXiv Detail & Related papers (2020-05-16T19:18:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.