Commodities Trading through Deep Policy Gradient Methods
- URL: http://arxiv.org/abs/2309.00630v1
- Date: Thu, 10 Aug 2023 17:21:12 GMT
- Title: Commodities Trading through Deep Policy Gradient Methods
- Authors: Jonas Hanetho
- Abstract summary: It formulates the commodities trading problem as a continuous, discrete-time stochastic dynamical system.
Two policy gradient algorithms, namely actor-based and actor-critic-based approaches, are introduced.
Backtesting on front-month natural gas futures demonstrates that DRL models increase the Sharpe ratio by 83% compared to the buy-and-hold baseline.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Algorithmic trading has gained attention due to its potential for generating
superior returns. This paper investigates the effectiveness of deep
reinforcement learning (DRL) methods in algorithmic commodities trading. It
formulates the commodities trading problem as a continuous, discrete-time
stochastic dynamical system. The proposed system employs a novel
time-discretization scheme that adapts to market volatility, enhancing the
statistical properties of subsampled financial time series. To optimize
transaction-cost- and risk-sensitive trading agents, two policy gradient
algorithms, namely actor-based and actor-critic-based approaches, are
introduced. These agents utilize CNNs and LSTMs as parametric function
approximators to map historical price observations to market
positions. Backtesting on front-month natural gas futures demonstrates that DRL
models increase the Sharpe ratio by $83\%$ compared to the buy-and-hold
baseline. Additionally, the risk profile of the agents can be customized
through a hyperparameter that regulates risk sensitivity in the reward function
during the optimization process. The actor-based models outperform the
actor-critic-based models, while the CNN-based models show a slight performance
advantage over the LSTM-based models.
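The paper's exact reward function is not reproduced here; the following is a minimal Python sketch of a transaction-cost- and risk-sensitive per-step reward of the kind described, with `risk_lambda` playing the role of the risk-sensitivity hyperparameter (the quadratic penalty and the function name are illustrative assumptions).

```python
import numpy as np

def trading_rewards(prices, positions, cost=1e-4, risk_lambda=0.1):
    """Per-step rewards for a position sequence (hypothetical reward shape).

    prices:      array of length T+1 with observed prices
    positions:   array of length T with positions in [-1, 1] chosen at each step
    cost:        proportional transaction cost per unit of position change
    risk_lambda: weight of the (assumed) quadratic risk penalty
    """
    returns = np.diff(prices) / prices[:-1]            # simple returns r_1..r_T
    prev_pos = np.concatenate(([0.0], positions[:-1]))
    pnl = positions * returns                          # gain from holding the position
    tc = cost * np.abs(positions - prev_pos)           # cost of changing position
    risk = risk_lambda * pnl ** 2                      # quadratic penalty on per-step PnL
    return pnl - tc - risk

# Example: gradually entering a long position over five steps
prices = np.array([100.0, 101.0, 100.5, 102.0, 103.0, 102.5])
positions = np.array([0.2, 0.5, 1.0, 1.0, 1.0])
print(trading_rewards(prices, positions).sum())
```

A policy gradient agent would maximize the sum (or a risk-adjusted aggregate) of such rewards; setting `risk_lambda` to zero recovers a purely return-seeking agent.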
Related papers
- A New Way: Kronecker-Factored Approximate Curvature Deep Hedging and its Benefits [0.0]
This paper advances the computational efficiency of Deep Hedging frameworks through the novel integration of Kronecker-Factored Approximate Curvature (K-FAC) optimization.
The proposed architecture couples Long Short-Term Memory (LSTM) networks with K-FAC second-order optimization.
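For orientation, below is a minimal sketch of K-FAC preconditioning for a single dense layer; the LSTM case used in the paper needs further approximations not shown here, and the function name and damping value are illustrative assumptions.

```python
import numpy as np

def kfac_precondition(grad_W, acts, grad_out, damping=1e-2):
    """K-FAC-style natural-gradient preconditioning for one dense layer.

    grad_W:   (out_dim, in_dim) gradient of the loss w.r.t. the layer weights
    acts:     (batch, in_dim) layer inputs a
    grad_out: (batch, out_dim) gradients of the loss w.r.t. the pre-activations g
    The Fisher block is approximated as a Kronecker product of A = E[a a^T] and
    G = E[g g^T], so the preconditioned gradient is G^-1 grad_W A^-1.
    """
    n = acts.shape[0]
    A = acts.T @ acts / n + damping * np.eye(acts.shape[1])              # input covariance
    G = grad_out.T @ grad_out / n + damping * np.eye(grad_out.shape[1])  # grad covariance
    return np.linalg.solve(G, grad_W) @ np.linalg.inv(A)

# Toy usage with random data
rng = np.random.default_rng(0)
acts, grad_out = rng.normal(size=(32, 8)), rng.normal(size=(32, 4))
grad_W = grad_out.T @ acts / 32
print(kfac_precondition(grad_W, acts, grad_out).shape)  # (4, 8)
```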
arXiv Detail & Related papers (2024-11-22T15:19:40Z)
- VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment [66.80143024475635]
We propose VinePPO, a straightforward approach that computes unbiased Monte Carlo-based value estimates.
We show that VinePPO consistently outperforms PPO and other RL-free baselines across MATH and GSM8K datasets.
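A minimal sketch of the Monte Carlo value-estimation idea, assuming a hypothetical `sample_return` rollout interface rather than VinePPO's actual implementation:

```python
import random
import statistics

def mc_value_estimate(sample_return, state, num_samples=4):
    """Monte Carlo value estimate for an intermediate state.

    sample_return: callable that rolls out the current policy from `state`
                   and returns the resulting scalar return (assumed interface).
    Averaging independent rollouts gives an unbiased estimate of V(state),
    replacing a learned critic for credit assignment.
    """
    return statistics.mean(sample_return(state) for _ in range(num_samples))

# Toy usage with a dummy stochastic return
print(mc_value_estimate(lambda s: random.gauss(1.0, 0.1), state="partial reasoning chain"))
```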
arXiv Detail & Related papers (2024-10-02T15:49:30Z)
- AI-Powered Energy Algorithmic Trading: Integrating Hidden Markov Models with Neural Networks [0.0]
This study introduces a new approach that combines Hidden Markov Models (HMM) and neural networks, integrated with Black-Litterman portfolio optimization.
During the COVID period (2019-2022), this dual-model approach achieved an 83% return with a Sharpe ratio of 0.77.
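The summary does not specify how the model signals enter the optimizer; the sketch below shows only the standard Black-Litterman posterior-return formula, with the HMM/neural-network forecasts assumed to supply the views `Q`.

```python
import numpy as np

def black_litterman_returns(pi, Sigma, P, Q, Omega, tau=0.05):
    """Black-Litterman posterior expected returns.

    pi:    (n,) equilibrium (prior) expected returns
    Sigma: (n, n) return covariance matrix
    P:     (k, n) view-picking matrix (each row defines one view)
    Q:     (k,) expected returns implied by the views (e.g. model signals)
    Omega: (k, k) view uncertainty covariance
    Posterior mean: [(tau*Sigma)^-1 + P' Omega^-1 P]^-1 [(tau*Sigma)^-1 pi + P' Omega^-1 Q]
    """
    tS_inv = np.linalg.inv(tau * Sigma)
    Om_inv = np.linalg.inv(Omega)
    A = tS_inv + P.T @ Om_inv @ P
    b = tS_inv @ pi + P.T @ Om_inv @ Q
    return np.linalg.solve(A, b)

# Toy example: two assets, one view saying asset 0 outperforms asset 1 by 2%
pi = np.array([0.03, 0.04])
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
P = np.array([[1.0, -1.0]])
Q = np.array([0.02])
Omega = np.array([[0.01]])
print(black_litterman_returns(pi, Sigma, P, Q, Omega))
```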
arXiv Detail & Related papers (2024-07-29T10:26:52Z)
- Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms [50.808123629394245]
Direct Alignment Algorithms (DAAs) like Direct Preference Optimization have emerged as alternatives to the classical RLHF pipeline.
This work formulates and formalizes the reward over-optimization or hacking problem for DAAs and explores its consequences across objectives, training regimes, and model scales.
arXiv Detail & Related papers (2024-06-05T03:41:37Z)
- RVRAE: A Dynamic Factor Model Based on Variational Recurrent Autoencoder for Stock Returns Prediction [5.281288833470249]
RVRAE is a probabilistic approach that addresses the temporal dependencies and noise in market data.
It is adept at risk modeling in volatile stock markets, estimating variances from latent space distributions while also predicting returns.
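A minimal variational sketch (not the RVRAE architecture; the recurrent encoder is omitted and `LatentFactorHead` is a hypothetical module) illustrating how a reparameterized latent factor can yield both return predictions and a variance-based risk proxy:

```python
import torch
import torch.nn as nn

class LatentFactorHead(nn.Module):
    """Minimal variational latent-factor head (illustrative only).

    An encoder maps a market feature vector to the mean and log-variance of a
    latent factor; returns are predicted from a reparameterized sample, and the
    latent variance doubles as a simple risk estimate.
    """
    def __init__(self, n_features, n_factors, n_assets):
        super().__init__()
        self.to_mu = nn.Linear(n_features, n_factors)
        self.to_logvar = nn.Linear(n_features, n_factors)
        self.decode = nn.Linear(n_factors, n_assets)

    def forward(self, x):
        mu, logvar = self.to_mu(x), self.to_logvar(x)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        pred_returns = self.decode(z)
        return pred_returns, mu, logvar

head = LatentFactorHead(n_features=16, n_factors=4, n_assets=10)
returns, mu, logvar = head(torch.randn(32, 16))
print(returns.shape, logvar.exp().mean().item())  # latent variance as a crude risk proxy
```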
arXiv Detail & Related papers (2024-03-04T21:48:32Z)
- Deep Policy Gradient Methods in Commodity Markets [0.0]
Traders play an important role in stabilizing markets by providing liquidity and reducing volatility.
This thesis investigates the effectiveness of deep reinforcement learning methods in commodities trading.
arXiv Detail & Related papers (2023-06-14T11:50:23Z)
- DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions [53.37679435230207]
We propose DeepVol, a model based on Dilated Causal Convolutions that uses high-frequency data to forecast day-ahead volatility.
Our empirical results suggest that the proposed deep learning-based approach effectively learns global features from high-frequency data.
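A minimal sketch of the dilated causal convolution building block DeepVol relies on (WaveNet-style; the hyperparameters and class name are illustrative, not the paper's):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalDilatedConv(nn.Module):
    """One dilated causal convolution block (sketch).

    Left-padding by (kernel_size - 1) * dilation ensures the output at time t
    depends only on inputs up to time t.
    """
    def __init__(self, channels, kernel_size=2, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                      # x: (batch, channels, time)
        return torch.relu(self.conv(F.pad(x, (self.pad, 0))))

# Stack blocks with exponentially growing dilations to cover a long intraday window
net = nn.Sequential(*[CausalDilatedConv(8, dilation=2 ** i) for i in range(4)])
x = torch.randn(1, 8, 390)                     # e.g. 390 one-minute bars in a trading day
print(net(x).shape)                            # (1, 8, 390): length preserved, receptive field grows
```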
arXiv Detail & Related papers (2022-09-23T16:13:47Z)
- Bayesian Bilinear Neural Network for Predicting the Mid-price Dynamics in Limit-Order Book Markets [84.90242084523565]
Traditional time-series econometric methods often appear incapable of capturing the true complexity of the multi-level interactions driving the price dynamics.
By adopting a state-of-the-art second-order optimization algorithm, we train a Bayesian bilinear neural network with temporal attention.
Using predictive distributions to analyze the errors and uncertainties associated with the estimated parameters and model forecasts, we thoroughly compare our Bayesian model with traditional ML alternatives.
arXiv Detail & Related papers (2022-03-07T18:59:54Z)
- GA-MSSR: Genetic Algorithm Maximizing Sharpe and Sterling Ratio Method for RoboTrading [0.4568777157687961]
Foreign exchange is the largest financial market in the world.
Most of the literature uses historical price information and technical indicators for training.
To address this problem, we designed trading rule features that are derived from technical indicators and trading rules.
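As an illustration of turning a trading rule into a model feature (a simple moving-average crossover, not necessarily one of the GA-MSSR features):

```python
import numpy as np
import pandas as pd

def crossover_rule_feature(prices, fast=10, slow=50):
    """Trading-rule feature from a moving-average crossover (illustrative example).

    Rather than feeding raw indicators to the model, the rule's signal itself
    (+1 when the fast SMA is above the slow SMA, -1 otherwise) becomes a feature.
    """
    prices = pd.Series(prices)
    fast_sma = prices.rolling(fast).mean()
    slow_sma = prices.rolling(slow).mean()
    signal = np.where(fast_sma > slow_sma, 1.0, -1.0)
    signal[slow_sma.isna().to_numpy()] = 0.0   # no signal until the slow window fills
    return pd.Series(signal, index=prices.index)

rng = np.random.default_rng(1)
px = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 300)))   # synthetic price path
print(crossover_rule_feature(px).tail())
```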
arXiv Detail & Related papers (2020-08-16T05:33:35Z)
- MOPO: Model-based Offline Policy Optimization [183.6449600580806]
Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a large batch of previously collected data.
We show that an existing model-based RL algorithm already produces significant gains in the offline setting.
We propose to modify the existing model-based RL methods by applying them with rewards artificially penalized by the uncertainty of the dynamics.
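A minimal sketch of the uncertainty-penalized reward idea, using ensemble disagreement as a stand-in for MOPO's uncertainty estimate (the function name and penalty scale are illustrative assumptions):

```python
import numpy as np

def penalized_reward(reward, ensemble_next_states, lam=1.0):
    """Uncertainty-penalized reward (sketch).

    reward:               model-predicted reward for (s, a)
    ensemble_next_states: (n_models, state_dim) next-state predictions from a
                          dynamics-model ensemble
    The disagreement of the ensemble serves as an uncertainty estimate u(s, a),
    and the agent is trained on r - lam * u, discouraging out-of-distribution actions.
    """
    uncertainty = np.linalg.norm(ensemble_next_states.std(axis=0))
    return reward - lam * uncertainty

# Toy usage: five dynamics models that mildly disagree
preds = np.random.default_rng(2).normal(loc=1.0, scale=0.05, size=(5, 3))
print(penalized_reward(reward=0.8, ensemble_next_states=preds))
```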
arXiv Detail & Related papers (2020-05-27T08:46:41Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.