Related papers: A Deep Reinforcement Learning Trader without Offline Training

URL: http://arxiv.org/abs/2303.00356v1
Date: Wed, 1 Mar 2023 09:34:52 GMT
Title: A Deep Reinforcement Learning Trader without Offline Training
Authors: Boian Lazov
Abstract summary: We use Double Deep $Q$-learning in the episodic setting with Fast Learning Networks approximating the expected reward $Q$. We define the possible terminal states of an episode in such a way as to introduce a mechanism to conserve some of the money in the trading pool when market conditions are seen as unfavourable.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper we pursue the question of a fully online trading algorithm (i.e. one that does not need offline training on previously gathered data). For this task we use Double Deep $Q$-learning in the episodic setting with Fast Learning Networks approximating the expected reward $Q$. Additionally, we define the possible terminal states of an episode in such a way as to introduce a mechanism to conserve some of the money in the trading pool when market conditions are seen as unfavourable. Some of this money is taken as profit and some is reused at a later time according to certain criteria. After describing the algorithm, we test it using 1-minute-tick data for Cardano's price on Binance. We see that the agent performs better than trading with randomly chosen actions on each timestep, both when tested on the whole dataset and on different subsets that capture different market trends.
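As an illustration of the Double $Q$-learning update the abstract refers to, here is a minimal sketch of the standard Double Q-learning target, not the paper's implementation. The function name `double_q_target`, its parameters, and the discount value are hypothetical; the lists of $Q$-values stand in for the outputs of the online and target approximators (the paper uses Fast Learning Networks for these, which are not reproduced here).

```python
from typing import Sequence


def double_q_target(
    reward: float,
    done: bool,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    gamma: float = 0.99,
) -> float:
    """Return the Double Q-learning target for one transition.

    q_online_next: Q-values of the next state from the online approximator.
    q_target_next: Q-values of the next state from the target approximator.
    Both are hypothetical stand-ins for network outputs.
    """
    if done:
        # At a terminal state there is no future reward to bootstrap from.
        return reward
    # The online approximator selects the action ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and the target approximator evaluates it.
    return reward + gamma * q_target_next[best_action]


# Example transition with three actions (all values are made up).
target = double_q_target(
    reward=1.0,
    done=False,
    q_online_next=[0.2, 0.5, 0.1],
    q_target_next=[0.3, 0.4, 0.9],
    gamma=0.9,
)
print(target)
```

Separating action selection (online values) from action evaluation (target values) is what distinguishes Double Q-learning from plain Q-learning, where a single set of values does both and tends to overestimate. In the example, action 1 is selected even though the target approximator rates action 2 highest.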

Related papers

Reinforcement Learning Pair Trading: A Dynamic Scaling approach [3.4698840925433774]
Trading crypto-currency is difficult due to the inherent volatility of the crypto-market. In this work, we combine Reinforcement Learning (RL) with pair trading. Our results show that RL can significantly outperform manual and traditional pair trading techniques when applied to volatile markets such as cryptocurrencies.
arXiv Detail & Related papers (2024-07-23T00:16:27Z)
A Contextual Online Learning Theory of Brokerage [8.049531918823758]
We study the role of contextual information in the online learning problem of brokerage between traders. We show that if the bounded density assumption is lifted, then the problem becomes unlearnable.
arXiv Detail & Related papers (2024-05-22T18:38:05Z)
Trading Volume Maximization with Online Learning [3.8059763597999012]
We investigate how the broker should behave to maximize the trading volume. We model the traders' valuations as an i.i.d. process with an unknown distribution. If only their willingness to sell or buy at the proposed price is revealed after each interaction, we provide an algorithm achieving poly-logarithmic regret.
arXiv Detail & Related papers (2024-05-21T17:26:44Z)
An Online Learning Theory of Brokerage [3.8059763597999012]
We investigate brokerage between traders from an online learning perspective. Unlike other bilateral trade problems already studied, we focus on the case where there are no designated buyer and seller roles. We show that the optimal rate degrades to $\sqrt{T}$ in the first case, and the problem becomes unlearnable in the second.
arXiv Detail & Related papers (2023-10-18T17:01:32Z)
Cryptocurrency Portfolio Optimization by Neural Networks [81.20955733184398]
This paper proposes an effective algorithm based on neural networks to take advantage of these investment products. A deep neural network, which outputs the allocation weight of each asset at a time interval, is trained to maximize the Sharpe ratio. A novel loss term is proposed to regulate the network's bias towards a specific asset, thus forcing the network to learn an allocation strategy that is close to a minimum variance strategy.
arXiv Detail & Related papers (2023-10-02T12:33:28Z)
Anytime Model Selection in Linear Bandits [61.97047189786905]
We develop ALEXP, which has an exponentially improved dependence on $M$ for its regret. Our approach utilizes a novel time-uniform analysis of the Lasso, establishing a new connection between online learning and high-dimensional statistics.
arXiv Detail & Related papers (2023-07-24T15:44:30Z)
Uniswap Liquidity Provision: An Online Learning Approach [49.145538162253594]
Decentralized Exchanges (DEXs) are new types of marketplaces leveraging technology. One such DEX, Uniswap v3, allows liquidity providers to allocate funds more efficiently by specifying an active price interval for their funds. This introduces the problem of finding an optimal strategy for choosing price intervals. We formalize this problem as an online learning problem with non-stochastic rewards.
arXiv Detail & Related papers (2023-02-01T17:21:40Z)
Characterizing Datapoints via Second-Split Forgetting [93.99363547536392]
We propose second-split forgetting time (SSFT), a complementary metric that tracks the epoch (if any) after which an original training example is forgotten. We demonstrate that mislabeled examples are forgotten quickly, and seemingly rare examples are forgotten comparatively slowly. SSFT can (i) help to identify mislabeled samples, the removal of which improves generalization; and (ii) provide insights about failure modes.
arXiv Detail & Related papers (2022-10-26T21:03:46Z)
Online Markov Decision Processes with Aggregate Bandit Feedback [74.85532145498742]
We study a novel variant of online finite-horizon Markov Decision Processes with adversarially changing loss functions and initially unknown dynamics. In each episode, the learner suffers the loss accumulated along the trajectory realized by the policy chosen for the episode, and observes aggregate bandit feedback. Our main result is a computationally efficient algorithm with $O(\sqrt{K})$ regret for this setting, where $K$ is the number of episodes.
arXiv Detail & Related papers (2021-01-31T16:49:07Z)
Stock2Vec: A Hybrid Deep Learning Framework for Stock Market Prediction with Representation Learning and Temporal Convolutional Network [71.25144476293507]
We propose a global hybrid deep learning framework to predict daily prices in the stock market. With representation learning, we derive an embedding called Stock2Vec, which gives us insight into the relationships among different stocks. Our hybrid framework integrates both advantages and achieves better performance on the stock price prediction task than several popular benchmarked models.
arXiv Detail & Related papers (2020-09-29T22:54:30Z)
Forecasting Bitcoin closing price series using linear regression and neural networks models [4.17510581764131]
We study how to forecast the daily closing price series of Bitcoin using price and volume data from prior days. We followed different approaches in parallel, implementing both statistical techniques and machine learning algorithms.
arXiv Detail & Related papers (2020-01-04T21:04:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.