Trading with the Momentum Transformer: An Intelligent and Interpretable
Architecture
- URL: http://arxiv.org/abs/2112.08534v1
- Date: Thu, 16 Dec 2021 00:04:12 GMT
- Title: Trading with the Momentum Transformer: An Intelligent and Interpretable
Architecture
- Authors: Kieran Wood, Sven Giegerich, Stephen Roberts, Stefan Zohren
- Abstract summary: We introduce the Momentum Transformer, an attention-based architecture which outperforms the benchmarks.
We observe remarkable structure in the attention patterns, with significant peaks of importance at momentum turning points.
Through the addition of an interpretable variable selection network, we observe how CPD helps our model to move away from trading predominantly on daily returns data.
- Score: 2.580765958706854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning architectures, specifically Deep Momentum Networks (DMNs)
[1904.04912], have been found to be an effective approach to momentum and
mean-reversion trading. However, some of the key challenges in recent years
involve learning long-term dependencies, degradation of performance when
considering returns net of transaction costs and adapting to new market
regimes, notably during the SARS-CoV-2 crisis. Attention mechanisms, or
Transformer-based architectures, are a solution to such challenges because they
allow the network to focus on significant time steps in the past and
longer-term patterns. We introduce the Momentum Transformer, an attention-based
architecture which outperforms the benchmarks, and is inherently interpretable,
providing us with greater insights into our deep learning trading strategy. Our
model is an extension to the LSTM-based DMN, which directly outputs position
sizing by optimising the network on a risk-adjusted performance metric, such as
Sharpe ratio. We find an attention-LSTM hybrid Decoder-Only Temporal Fusion
Transformer (TFT) style architecture is the best performing model. In terms of
interpretability, we observe remarkable structure in the attention patterns,
with significant peaks of importance at momentum turning points. The time
series is thus segmented into regimes and the model tends to focus on previous
time steps in similar regimes. We find changepoint detection (CPD) [2105.13727],
another technique for responding to regime change, can complement multi-headed
attention, especially when we run CPD at multiple timescales. Through the
addition of an interpretable variable selection network, we observe how CPD
helps our model to move away from trading predominantly on daily returns data.
We note that the model can intelligently switch between, and blend, classical
strategies, basing its decisions on patterns in the data.
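The core mechanism above, a network that outputs positions directly and is trained on a risk-adjusted metric, can be sketched in a few lines. Below is a minimal, hypothetical PyTorch rendering of a DMN-style position sizer with a negative-Sharpe loss; the class names, network sizes, and 252-day annualisation are illustrative assumptions rather than the paper's implementation, and the sketch ignores transaction costs.

```python
import torch
import torch.nn as nn

class PositionSizer(nn.Module):
    """Illustrative LSTM mapping feature windows to positions in [-1, 1]."""
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                        # x: (batch, time, n_features)
        h, _ = self.lstm(x)
        return torch.tanh(self.head(h)).squeeze(-1)   # positions, (batch, time)

def sharpe_loss(positions, next_returns):
    """Negative annualised Sharpe ratio of the resulting strategy returns."""
    strategy_returns = positions * next_returns   # element-wise strategy P&L
    mean, std = strategy_returns.mean(), strategy_returns.std()
    return -(mean / (std + 1e-9)) * (252.0 ** 0.5)  # assumed daily data

model = PositionSizer(n_features=8)               # feature count is made up
x = torch.randn(32, 63, 8)                        # 63-day lookback, assumed
r = torch.randn(32, 63) * 0.01                    # next-step returns
loss = sharpe_loss(model(x), r)
loss.backward()
```

Replacing the LSTM body with attention layers, as the Momentum Transformer does, leaves this outer training objective unchanged.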
Related papers
- Todyformer: Towards Holistic Dynamic Graph Transformers with Structure-Aware Tokenization [6.799413002613627]
Todyformer is a novel Transformer-based neural network tailored for dynamic graphs.
It unifies the local encoding capacity of Message-Passing Neural Networks (MPNNs) with the global encoding of Transformers.
We show that Todyformer consistently outperforms the state-of-the-art methods for downstream tasks.
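As a rough illustration of that local/global split, the hypothetical PyTorch block below mixes neighbour information through a dense adjacency matrix (the MPNN-style local step) and then applies self-attention over each node's token sequence (the global step); every name here is an assumption, not Todyformer's code.

```python
import torch
import torch.nn as nn

class LocalGlobalBlock(nn.Module):
    """Hypothetical block: MPNN-style local encoding, then global attention."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.local = nn.Linear(dim, dim)          # neighbour feature transform
        self.global_attn = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True)

    def forward(self, x, adj):
        # x: (nodes, tokens, dim) token sequences; adj: (nodes, nodes), dense
        local = torch.einsum("ij,jtd->itd", adj, self.local(x))  # message passing
        return self.global_attn(x + local)   # attention along each token sequence
```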
arXiv Detail & Related papers (2024-02-02T23:05:30Z)
- Rethinking Urban Mobility Prediction: A Super-Multivariate Time Series Forecasting Approach [71.67506068703314]
Long-term urban mobility predictions play a crucial role in the effective management of urban facilities and services.
Traditionally, urban mobility data has been structured as videos, treating longitude and latitude as fundamental pixels.
In our research, we introduce a fresh perspective on urban mobility prediction.
Instead of oversimplifying urban mobility data as traditional video data, we regard it as a complex time series.
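That representational shift is simple to state in code; in the sketch below the array names and grid sizes are made up.

```python
import numpy as np

T, H, W = 168, 32, 32                        # a week of hourly frames, assumed
video_like = np.random.rand(T, H, W)         # video-style mobility tensor
multivariate = video_like.reshape(T, H * W)  # one time series per grid cell
```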
arXiv Detail & Related papers (2023-12-04T07:39:05Z)
- Copula Variational LSTM for High-dimensional Cross-market Multivariate Dependence Modeling [46.75628526959982]
We make the first attempt to integrate variational sequential neural learning with copula-based dependence modeling.
Our variational neural network WPVC-VLSTM models variational sequential dependence degrees and structures across time series.
It outperforms benchmarks including linear models, volatility models, deep neural networks, and variational recurrent networks in cross-market portfolio forecasting.
arXiv Detail & Related papers (2023-05-09T08:19:08Z)
- Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation [53.04781510348416]
Video-based 3D human pose and shape estimations are evaluated by intra-frame accuracy and inter-frame smoothness.
We propose to structurally decouple the modeling of long-term and short-term correlations in an end-to-end framework, Global-to-Local Transformer (GLoT)
Our GLoT surpasses previous state-of-the-art methods with the lowest model parameters on popular benchmarks, i.e., 3DPW, MPI-INF-3DHP, and Human3.6M.
arXiv Detail & Related papers (2023-03-26T14:57:49Z)
- Stock Trend Prediction: A Semantic Segmentation Approach [3.718476964451589]
We present a novel approach to predict long-term daily stock price change trends with fully 2D-convolutional encoder-decoders.
Our hierarchical structure of CNNs makes it capable of capturing both long and short-term temporal relationships effectively.
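A hypothetical minimal version of such a fully 2D-convolutional encoder-decoder, with assumed channel counts and a three-class trend output, looks like this; it sketches the idea rather than the paper's architecture.

```python
import torch.nn as nn

class TrendSegmenter(nn.Module):
    """Sketch: encode a chart-like image, decode per-pixel trend logits."""
    def __init__(self, n_classes: int = 3):    # e.g. up / flat / down, assumed
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, n_classes, 4, stride=2, padding=1),
        )

    def forward(self, x):                      # x: (batch, 1, height, width)
        return self.decoder(self.encoder(x))   # per-pixel trend logits
```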
arXiv Detail & Related papers (2023-03-09T01:29:09Z)
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving the classification capacity for the multivariate time series classification task.
It exhibits three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strength of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
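One hypothetical way to combine those three merits is a stage that shortens the sequence with a strided convolution before attending, so each successive stage operates at a coarser scale over fewer tokens; the names and sizes below are assumptions, not FormerTime's design.

```python
import torch.nn as nn

class HierarchicalStage(nn.Module):
    """Sketch: convolutional downsampling, then attention at the coarser scale."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.down = nn.Conv1d(dim, dim, kernel_size=3, stride=2, padding=1)
        self.attend = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True)

    def forward(self, x):                      # x: (batch, time, dim)
        x = self.down(x.transpose(1, 2)).transpose(1, 2)  # halve the time axis
        return self.attend(x)                  # cheaper attention on fewer tokens
```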
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
- Augmented Bilinear Network for Incremental Multi-Stock Time-Series Classification [83.23129279407271]
We propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities.
In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed.
This knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data.
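The freeze-and-augment mechanism can be sketched as below, assuming a single pre-trained linear layer for simplicity; the class name and zero-initialised augmentation path are illustrative choices, not the paper's exact bilinear formulation.

```python
import torch.nn as nn

class AugmentedLayer(nn.Module):
    """Sketch: frozen pre-trained path plus a trainable augmented path."""
    def __init__(self, pretrained: nn.Linear):
        super().__init__()
        self.pretrained = pretrained
        for p in self.pretrained.parameters():
            p.requires_grad = False            # keep prior knowledge fixed
        self.augment = nn.Linear(pretrained.in_features,
                                 pretrained.out_features, bias=False)
        nn.init.zeros_(self.augment.weight)    # initially leaves outputs unchanged

    def forward(self, x):
        return self.pretrained(x) + self.augment(x)  # frozen + trainable paths
```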
arXiv Detail & Related papers (2022-07-23T18:54:10Z)
- Detecting and adapting to crisis pattern with context based Deep Reinforcement Learning [6.224519494738852]
We present an innovative DRL framework consisting of two sub-networks, fed respectively with the past performances and standard deviations of portfolio strategies as well as additional contextual features.
Results on the test set show this approach substantially outperforms traditional portfolio optimization methods like Markowitz and is able to detect and anticipate crises like the Covid one.
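A hedged sketch of that two-sub-network layout: one branch for strategy performance statistics, one for contextual features, merged into allocation weights. Names, sizes, and the softmax head are assumptions.

```python
import torch
import torch.nn as nn

class TwoBranchPolicy(nn.Module):
    """Sketch: performance branch + context branch -> portfolio weights."""
    def __init__(self, n_strats: int, n_context: int, hidden: int = 32):
        super().__init__()
        self.perf = nn.Sequential(nn.Linear(2 * n_strats, hidden), nn.ReLU())
        self.ctx = nn.Sequential(nn.Linear(n_context, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, n_strats)

    def forward(self, past_perf, past_std, context):
        p = self.perf(torch.cat([past_perf, past_std], dim=-1))
        c = self.ctx(context)
        return torch.softmax(self.head(torch.cat([p, c], dim=-1)), dim=-1)
```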
arXiv Detail & Related papers (2020-09-07T12:11:08Z)
- On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improve the distributional shift robustness.
arXiv Detail & Related papers (2020-07-16T18:39:04Z)
- Deep Stock Predictions [58.720142291102135]
We consider the design of a trading strategy that performs portfolio optimization using Long Short Term Memory (LSTM) neural networks.
We then customize the loss function used to train the LSTM to increase the profit earned.
We find the LSTM model with the customized loss function to have an improved performance in the trading bot over a regressive baseline such as ARIMA.
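A minimal sketch of a profit-oriented loss of the kind described, assuming position and realised-return tensors; the turnover penalty and cost level are illustrative additions rather than the paper's exact loss.

```python
import torch

def profit_loss(positions, realised_returns, cost: float = 1e-4):
    """Negative net profit: reward P&L, penalise turnover (assumed cost proxy)."""
    pnl = (positions * realised_returns).sum(dim=-1)     # per-sample profit
    turnover = positions.diff(dim=-1).abs().sum(dim=-1)  # trading activity
    return -(pnl - cost * turnover).mean()
```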
arXiv Detail & Related papers (2020-06-08T23:37:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.