Transformers versus LSTMs for electronic trading
- URL: http://arxiv.org/abs/2309.11400v1
- Date: Wed, 20 Sep 2023 15:25:43 GMT
- Title: Transformers versus LSTMs for electronic trading
- Authors: Paul Bilokon and Yitao Qiu
- Abstract summary: This study investigates whether Transformer-based models can be applied to financial time series prediction and outperform LSTM.
A new LSTM-based model called DLSTM is built, and a new architecture for the Transformer-based models is designed to adapt them to financial prediction.
The experimental results show that the Transformer-based models have only a limited advantage in absolute price sequence prediction.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid development of artificial intelligence, long short-term memory (LSTM), a kind of recurrent neural network (RNN), has been widely applied in time series prediction.
Like RNNs, the Transformer is designed to handle sequential data. Since the Transformer achieved great success in Natural Language Processing (NLP), researchers have become interested in its performance on time series prediction, and many Transformer-based solutions for long time series forecasting have appeared recently. However, when it comes to financial time series prediction, LSTM is still the dominant architecture. Therefore, the question this study aims to answer is whether Transformer-based models can be applied to financial time series prediction and outperform LSTM.
To answer this question, various LSTM-based and Transformer-based models are compared on multiple financial prediction tasks based on high-frequency limit order book data. A new LSTM-based model called DLSTM is built, and a new architecture for the Transformer-based models is designed to adapt them to financial prediction. The experimental results show that the Transformer-based models have only a limited advantage in absolute price sequence prediction, while the LSTM-based models show better and more robust performance on difference sequence prediction, such as price difference and price movement (a toy construction of such prediction targets is sketched below).
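To make the difference-sequence tasks concrete, the following is a minimal, hypothetical sketch (not the authors' code; DLSTM and the paper's Transformer variant are not reproduced here) of how price-difference and price-movement targets might be derived from limit order book mid-prices, with a plain LSTM regressor as a stand-in baseline. The window length, synthetic data, and model sizes are illustrative assumptions.

```python
# Minimal, illustrative sketch (not the paper's code): build difference-sequence
# targets from limit-order-book mid-prices and fit a plain LSTM baseline on them.
# Window length, synthetic data, and model sizes are assumptions for illustration.
import numpy as np
import torch
import torch.nn as nn


def make_targets(best_bid: np.ndarray, best_ask: np.ndarray, horizon: int = 1):
    """Return mid-prices, price differences, and up/flat/down movement labels."""
    mid = (best_bid + best_ask) / 2.0
    diff = mid[horizon:] - mid[:-horizon]           # price-difference target
    movement = np.sign(diff).astype(np.int64) + 1   # 0 = down, 1 = flat, 2 = up
    return mid, diff, movement


def make_windows(series: np.ndarray, targets: np.ndarray, window: int = 50):
    """Slice the series into (window, 1) inputs aligned with the next difference."""
    xs, ys = [], []
    for t in range(window - 1, len(targets)):
        xs.append(series[t - window + 1 : t + 1, None])  # prices up to time t
        ys.append(targets[t])                            # change from t to t + horizon
    x = torch.tensor(np.array(xs), dtype=torch.float32)
    y = torch.tensor(np.array(ys), dtype=torch.float32)
    return x, y


class LSTMRegressor(nn.Module):
    """Plain LSTM regressor used as a stand-in baseline (not the paper's DLSTM)."""

    def __init__(self, input_size: int = 1, hidden_size: int = 64, num_layers: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)                         # (batch, window, hidden)
        return self.head(out[:, -1, :]).squeeze(-1)   # next price difference


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bid = 100 + np.cumsum(rng.normal(0.0, 0.01, size=5_000))
    ask = bid + 0.02                                  # synthetic quotes, fixed spread
    mid, diff, movement = make_targets(bid, ask)      # movement could feed a classifier
    x, y = make_windows(mid, diff, window=50)
    model = LSTMRegressor()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(3):                                # a few toy full-batch steps
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```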
Related papers
- LSEAttention is All You Need for Time Series Forecasting [0.0]
Transformer-based architectures have achieved remarkable success in natural language processing and computer vision.
I introduce LSEAttention, an approach designed to address entropy collapse and training instability commonly observed in transformer models.
arXiv Detail & Related papers (2024-10-31T09:09:39Z)
- Beam Prediction based on Large Language Models [51.45077318268427]
Millimeter-wave (mmWave) communication is promising for next-generation wireless networks but suffers from significant path loss.
Traditional deep learning models, such as long short-term memory (LSTM), enhance beam tracking accuracy but are limited by poor robustness and generalization.
In this letter, we use large language models (LLMs) to improve the robustness of beam prediction.
arXiv Detail & Related papers (2024-08-16T12:40:01Z)
- xLSTMTime: Long-term Time Series Forecasting With xLSTM [0.0]
This paper presents an adaptation of a recent architecture termed extended LSTM (xLSTM) for time series forecasting.
We compare xLSTMTime's performance against various state-of-the-art models across multiple real-world datasets.
Our findings suggest that refined recurrent architectures can offer competitive alternatives to transformer-based models in time series forecasting.
arXiv Detail & Related papers (2024-07-14T15:15:00Z)
- Probing the limit of hydrologic predictability with the Transformer network [7.326504492614808]
We show that a vanilla Transformer architecture is not competitive against LSTM on the widely benchmarked CAMELS dataset.
A recurrence-free variant of the Transformer can obtain mixed comparisons with LSTM, producing the same Kling-Gupta efficiency coefficient (KGE) along with other metrics (a short snippet showing how KGE is computed follows this entry).
While the Transformer results are not higher than current state-of-the-art, we still learned some valuable lessons.
arXiv Detail & Related papers (2023-06-21T17:06:54Z)
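For readers unfamiliar with the metric mentioned above, here is a short, illustrative helper (not from the paper) that computes the Kling-Gupta efficiency from simulated and observed series using its standard definition, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the Pearson correlation, alpha the ratio of standard deviations, and beta the ratio of means.

```python
# Illustrative helper (not from the paper): Kling-Gupta efficiency (KGE),
# KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the
# Pearson correlation between simulation and observation, alpha = std(sim)/std(obs),
# and beta = mean(sim)/mean(obs). A perfect simulation scores 1.0.
import numpy as np


def kling_gupta_efficiency(sim: np.ndarray, obs: np.ndarray) -> float:
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = np.std(sim) / np.std(obs)
    beta = np.mean(sim) / np.mean(obs)
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


print(kling_gupta_efficiency(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 3.0])))  # 1.0
```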
- Temporal Fusion Transformers for Streamflow Prediction: Value of Combining Attention with Recurrence [0.0]
This work tests the hypothesis that combining recurrence with attention can improve streamflow prediction.
We set up the Temporal Fusion Transformer (TFT) architecture, a model that combines both of these aspects and had not previously been applied in hydrology (a simplified recurrence-plus-attention sketch follows this entry).
Our results demonstrate that TFT indeed exceeds the performance benchmark set by the LSTM and Transformers for streamflow prediction.
arXiv Detail & Related papers (2023-05-21T03:58:16Z)
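For intuition about what "combining recurrence with attention" can look like in code, here is a heavily simplified, hypothetical sketch under my own assumptions; it is not the actual TFT, which additionally uses gating layers, variable selection networks, and quantile outputs.

```python
# Heavily simplified recurrence-plus-attention sketch (not the real TFT):
# an LSTM encodes the input sequence, multi-head self-attention mixes its
# hidden states, and a linear head produces the forecast.
import torch
import torch.nn as nn


class RecurrenceAttentionForecaster(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64, heads: int = 4, horizon: int = 1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden)
        self.head = nn.Linear(hidden, horizon)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_features)
        h, _ = self.lstm(x)            # local (recurrent) processing
        a, _ = self.attn(h, h, h)      # long-range (attention) mixing
        z = self.norm(h + a)           # residual connection
        return self.head(z[:, -1, :])  # forecast from the last step


# Example: batch of 8 sequences, 30 time steps, 5 covariates, 1-step forecast.
model = RecurrenceAttentionForecaster(n_features=5)
y_hat = model(torch.randn(8, 30, 5))   # -> shape (8, 1)
```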
- CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting [50.23240107430597]
We design a special Transformer, i.e., the Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of CI-type (channel-independent) Transformers in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture temporal correlations among signals.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity because of the self-attention mechanism, despite its high computational cost.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- Learning Bounded Context-Free-Grammar via LSTM and the Transformer: Difference and Explanations [51.77000472945441]
Long Short-Term Memory (LSTM) and Transformers are two popular neural architectures used for natural language processing tasks.
In practice, it is often observed that Transformer models have better representation power than LSTM.
We study such practical differences between LSTM and Transformer and propose an explanation based on their latent space decomposition patterns.
arXiv Detail & Related papers (2021-12-16T19:56:44Z)
- Wake Word Detection with Streaming Transformers [72.66551640048405]
Our experiments on the Mobvoi wake word dataset demonstrate that the proposed Transformer model outperforms the baseline convolutional network by 25% on average in false rejection rate at the same false alarm rate.
arXiv Detail & Related papers (2021-02-08T19:14:32Z)
- Future Vector Enhanced LSTM Language Model for LVCSR [67.03726018635174]
This paper proposes a novel enhanced long short-term memory (LSTM) LM using the future vector.
Experiments show that the proposed new LSTM LM achieves better BLEU scores for long-term sequence prediction.
Rescoring with both the new and conventional LSTM LMs can achieve a very large improvement in word error rate.
arXiv Detail & Related papers (2020-07-31T08:38:56Z)
- Transformer Networks for Trajectory Forecasting [11.802437934289062]
We propose the novel use of Transformer Networks for trajectory forecasting.
This is a fundamental switch from the sequential step-by-step processing of LSTMs to the attention-only memory mechanisms of Transformers (the contrast is illustrated in the sketch below).
arXiv Detail & Related papers (2020-03-18T09:17:49Z)
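To make the contrast between the two processing styles concrete, the following is a minimal, hypothetical sketch (not code from any paper listed here): the LSTM folds a trajectory into a recurrent state one step at a time, while a Transformer encoder attends over all observed positions at once. Dimensions and layer sizes are illustrative assumptions.

```python
# Minimal contrast sketch (illustrative only): step-by-step LSTM rollout vs.
# a Transformer encoder attending over the whole observed trajectory at once.
import torch
import torch.nn as nn

obs = torch.randn(8, 12, 2)   # batch of 8 trajectories, 12 observed (x, y) points

# Recurrent view: positions are folded into a hidden state one step at a time.
lstm = nn.LSTM(input_size=2, hidden_size=32, batch_first=True)
state = None
for t in range(obs.size(1)):
    _, state = lstm(obs[:, t : t + 1, :], state)  # state carries the history
h_last = state[0][-1]                             # (8, 32) summary of the past

# Attention view: every position can attend to every other position directly.
embed = nn.Linear(2, 32)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=2,
)
z = encoder(embed(obs))                           # (8, 12, 32), no recurrent state
head = nn.Linear(32, 2)
next_point = head(z[:, -1, :])                    # toy one-step position forecast
```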