MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading
- URL: http://arxiv.org/abs/2407.01577v1
- Date: Mon, 3 Jun 2024 01:42:52 GMT
- Title: MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading
- Authors: Xi Cheng, Jinghao Zhang, Yunan Zeng, Wenfang Xue
- Abstract summary: We propose MOT, which designs multiple actors with disentangled representation learning to model the different patterns of the market.
Experimental results on real futures market data demonstrate that MOT exhibits excellent profit capabilities while balancing risks.
- Score: 6.305870529904885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Algorithmic trading refers to executing buy and sell orders for specific assets based on automatically identified trading opportunities. Strategies based on reinforcement learning (RL) have demonstrated remarkable capabilities in addressing algorithmic trading problems. However, trading patterns differ across market conditions because the data distribution shifts, and ignoring these multiple patterns undermines the performance of RL. In this paper, we propose MOT, which designs multiple actors with disentangled representation learning to model the different patterns of the market. Furthermore, we incorporate the Optimal Transport (OT) algorithm to allocate samples to the appropriate actor by introducing a regularization loss term. Additionally, we propose a Pretrain Module that facilitates imitation learning by aligning the outputs of the actors with an expert strategy, better balancing the exploration and exploitation of RL. Experimental results on real futures market data demonstrate that MOT exhibits excellent profit capabilities while balancing risks. Ablation studies validate the effectiveness of the components of MOT.
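The OT-based allocation of samples to actors described in the abstract can be sketched with entropic-regularized optimal transport (Sinkhorn iterations). This is a minimal illustration under assumed uniform marginals over samples and actors; the cost matrix, hyperparameters, and all names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sinkhorn_plan(cost, eps=0.1, n_iters=500):
    """Entropic-regularized OT plan between uniform marginals.

    cost: (n_samples, n_actors) matrix, e.g. a negative sample-actor affinity.
    Returns a transport plan whose entry (i, j) is the mass of sample i
    routed to actor j; the uniform actor marginal balances the load.
    """
    n, k = cost.shape
    a = np.full(n, 1.0 / n)   # uniform mass over samples
    b = np.full(k, 1.0 / k)   # uniform mass over actors
    K = np.exp(-cost / eps)   # Gibbs kernel of the regularized problem
    u, v = np.ones(n), np.ones(k)
    for _ in range(n_iters):  # alternate scaling to match both marginals
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
cost = rng.random((8, 3))          # hypothetical costs: 8 samples, 3 actors
plan = sinkhorn_plan(cost)
assignment = plan.argmax(axis=1)   # route each sample to its dominant actor
```

In a training loop, the plan (or a divergence between the plan and the actors' soft assignments) could serve as the regularization loss term the abstract mentions; here it is only used for hard routing.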
Related papers
- Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment [65.15914284008973]
State-of-the-art techniques such as Reinforcement Learning from Human Feedback (RLHF) often consist of two stages:
1) supervised fine-tuning (SFT), where the model is fine-tuned by learning from human demonstration data;
2) preference learning, where preference data is used to learn a reward model, which is in turn used by a reinforcement learning step to fine-tune the model.
arXiv Detail & Related papers (2024-05-28T07:11:05Z) - Combining Deep Learning on Order Books with Reinforcement Learning for Profitable Trading [0.0]
This work focuses on forecasting returns across multiple horizons using order flow imbalance, training three temporal-difference learning models for five financial instruments.
The results show potential, but further modifications are needed before the models can trade consistently profitably while fully handling retail trading costs, slippage, and spread fluctuation.
arXiv Detail & Related papers (2023-10-24T15:58:58Z) - IMM: An Imitative Reinforcement Learning Approach with Predictive Representation Learning for Automatic Market Making [33.23156884634365]
Reinforcement Learning technology has achieved remarkable success in quantitative trading.
Most existing RL-based market making methods focus on optimizing single-price level strategies.
We propose Imitative Market Maker (IMM), a novel RL framework leveraging both knowledge from suboptimal signal-based experts and direct policy interactions.
arXiv Detail & Related papers (2023-08-17T11:04:09Z) - Supervised Pretraining Can Learn In-Context Reinforcement Learning [96.62869749926415]
In this paper, we study the in-context learning capabilities of transformers in decision-making problems.
We introduce and study Decision-Pretrained Transformer (DPT), a supervised pretraining method where the transformer predicts an optimal action.
We find that the pretrained transformer can be used to solve a range of RL problems in-context, exhibiting both exploration online and conservatism offline.
arXiv Detail & Related papers (2023-06-26T17:58:50Z) - Traj-MAE: Masked Autoencoders for Trajectory Prediction [69.7885837428344]
Trajectory prediction has been a crucial task in building a reliable autonomous driving system by anticipating possible dangers.
We propose an efficient masked autoencoder for trajectory prediction (Traj-MAE) that better represents the complicated behaviors of agents in the driving environment.
Our experimental results in both multi-agent and single-agent settings demonstrate that Traj-MAE achieves competitive results with state-of-the-art methods.
arXiv Detail & Related papers (2023-03-12T16:23:27Z) - Asynchronous Deep Double Duelling Q-Learning for Trading-Signal Execution in Limit Order Book Markets [5.202524136984542]
We employ deep reinforcement learning to train an agent to translate a high-frequency trading signal into a trading strategy that places individual limit orders.
Based on the ABIDES limit order book simulator, we build an OpenAI Gym reinforcement learning environment.
We find that the RL agent learns an effective trading strategy for inventory management and order placement that outperforms a benchmark trading strategy with access to the same signal.
arXiv Detail & Related papers (2023-01-20T17:19:18Z) - Implicit Offline Reinforcement Learning via Supervised Learning [83.8241505499762]
Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels.
We show how implicit models can leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets.
arXiv Detail & Related papers (2022-10-21T21:59:42Z) - Deep Reinforcement Learning Approach for Trading Automation in The Stock
Market [0.0]
This paper presents a model to generate profitable trades in the stock market using Deep Reinforcement Learning (DRL) algorithms.
We formulate the trading problem as a Partially Observable Markov Decision Process (POMDP), considering the constraints imposed by the stock market.
We then solve the formulated POMDP using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, reporting a 2.68 Sharpe ratio on the unseen data set.
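The key ingredient TD3 adds over its predecessor DDPG is the clipped double-Q target: the Bellman target takes the minimum of two target critics to curb value overestimation. A minimal sketch of just that target computation follows; the networks, replay buffer, delayed policy updates, and target-policy smoothing noise are omitted, and the numbers are hypothetical.

```python
def td3_target(reward, next_q1, next_q2, gamma=0.99, done=0.0):
    """Clipped double-Q Bellman target used by TD3.

    Taking the minimum of the two target critics' estimates curbs the
    value overestimation that affects single-critic actor-critic methods.
    """
    return reward + gamma * (1.0 - done) * min(next_q1, next_q2)

# One hypothetical transition: reward 1.0, two critic estimates of the next state
target = td3_target(1.0, next_q1=2.0, next_q2=3.0)
```

Both critics are then regressed toward this single target, while the actor is updated less frequently against one of them.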
arXiv Detail & Related papers (2022-07-05T11:34:29Z) - Deep Q-Learning Market Makers in a Multi-Agent Simulated Stock Market [58.720142291102135]
This paper focuses precisely on the study of these market makers' strategies from an agent-based perspective.
We propose applying Reinforcement Learning (RL) to create intelligent market makers in simulated stock markets.
arXiv Detail & Related papers (2021-12-08T14:55:21Z) - Learning Multiple Stock Trading Patterns with Temporal Routing Adaptor and Optimal Transport [8.617532047238461]
We propose a novel architecture, Temporal Routing Adaptor (TRA), to empower existing stock prediction models with the ability to model multiple stock trading patterns.
TRA is a lightweight module that consists of a set of independent predictors for learning multiple patterns, as well as a router that dispatches samples to different predictors.
We show that the proposed method can improve the information coefficient (IC) from 0.053 to 0.059 and from 0.051 to 0.056, respectively.
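The information coefficient reported above is conventionally the Pearson correlation between predicted scores and realized returns. A minimal sketch with hypothetical data (the function name and the sample values are illustrative, not from the paper):

```python
import numpy as np

def information_coefficient(pred, actual):
    """Pearson correlation between predicted scores and realized returns."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    pred = pred - pred.mean()        # center both series
    actual = actual - actual.mean()
    return float((pred @ actual) /
                 (np.linalg.norm(pred) * np.linalg.norm(actual)))

# Hypothetical predicted vs. realized returns for four stocks on one day
preds = [0.02, -0.01, 0.03, 0.00]
reals = [0.015, -0.02, 0.02, 0.005]
ic = information_coefficient(preds, reals)
```

In practice the IC is computed cross-sectionally per day and averaged over the evaluation period, which is why seemingly small values such as 0.05 are meaningful.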
arXiv Detail & Related papers (2021-06-24T12:19:45Z) - Taking Over the Stock Market: Adversarial Perturbations Against Algorithmic Traders [47.32228513808444]
We present a realistic scenario in which an attacker influences algorithmic trading systems by using adversarial learning techniques.
We show that when added to the input stream, our perturbation can fool the trading algorithms at future unseen data points.
arXiv Detail & Related papers (2020-10-19T06:28:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.