Transformers versus LSTMs for electronic trading
- URL: http://arxiv.org/abs/2309.11400v1
- Date: Wed, 20 Sep 2023 15:25:43 GMT
- Title: Transformers versus LSTMs for electronic trading
- Authors: Paul Bilokon and Yitao Qiu
- Abstract summary: This study investigates whether Transformer-based models can be applied to financial time series prediction and beat LSTM.
A new LSTM-based model called DLSTM is built, and a new architecture for the Transformer-based model is designed to adapt it to financial prediction.
The experimental results show that the Transformer-based model has only a limited advantage in absolute price sequence prediction.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid development of artificial intelligence, long short-term memory (LSTM), a kind of recurrent neural network (RNN), has been widely applied in time series prediction.
Like the RNN, the Transformer is designed to handle sequential data. Since the Transformer achieved great success in Natural Language Processing (NLP), researchers became interested in its performance on time series prediction, and many Transformer-based solutions for long time series forecasting have appeared recently. However, when it comes to financial time series prediction, LSTM is still the dominant architecture. The question this study therefore seeks to answer is whether Transformer-based models can be applied to financial time series prediction and beat LSTM.
To answer this question, various LSTM-based and Transformer-based models are compared on multiple financial prediction tasks based on high-frequency limit order book data. A new LSTM-based model called DLSTM is built, and a new architecture for the Transformer-based model is designed to adapt it to financial prediction. The experimental results show that the Transformer-based model has only a limited advantage in absolute price sequence prediction, while the LSTM-based models show better and more robust performance on difference sequence prediction, such as price difference and price movement.
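To make the experimental setup concrete, below is a minimal PyTorch sketch of the kind of comparison the abstract describes: a small LSTM and a small Transformer encoder, each predicting the next mid-price difference from a window of limit order book features. The feature count, window length, and layer sizes are illustrative assumptions; neither model is the paper's DLSTM or its adapted Transformer architecture.

```python
# Illustrative sketch (not the paper's DLSTM or its Transformer variant):
# two generic baselines for predicting the next mid-price *difference*
# from a window of limit order book features.
import torch
import torch.nn as nn


class LSTMRegressor(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, window, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])           # predict next price difference


class TransformerRegressor(nn.Module):
    def __init__(self, n_features: int, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=128,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):
        h = self.encoder(self.proj(x))
        return self.head(h[:, -1])


if __name__ == "__main__":
    # Hypothetical shapes: 32 samples, 100-tick window, 40 LOB features.
    x = torch.randn(32, 100, 40)
    y = torch.randn(32, 1)                     # next-tick mid-price difference
    for model in (LSTMRegressor(40), TransformerRegressor(40)):
        pred = model(x)
        print(type(model).__name__, nn.functional.mse_loss(pred, y).item())
```

On synthetic data the two regressors are interchangeable; the paper's finding is that on real limit order book data the difference-sequence targets favour the LSTM family, while the Transformer's edge is confined to absolute price sequences.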
Related papers
- QuLTSF: Long-Term Time Series Forecasting with Quantum Machine Learning [4.2117721107606005]
Long-term time series forecasting involves predicting a large number of future values of a time series based on the past values.
Quantum machine learning (QML) is an evolving domain that aims to enhance the capabilities of classical machine learning models.
We show the advantages of QuLTSF over the state-of-the-art classical linear models, in terms of reduced mean squared error and mean absolute error.
arXiv Detail & Related papers (2024-12-18T12:06:52Z)
- Exploring Transformer-Augmented LSTM for Temporal and Spatial Feature Learning in Trajectory Prediction [1.7273380623090846]
This work explores the integration of Transformer-based models with Long Short-Term Memory (LSTM)-based techniques.
The proposed model is benchmarked against predecessor LSTM-based methods, including STA-LSTM, SA-LSTM, CS-LSTM, and NaiveLSTM.
arXiv Detail & Related papers (2024-12-18T01:31:08Z)
- PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
The self-attention mechanism in the Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
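The sketch below illustrates only the general idea behind this line of work: let a recurrent layer encode temporal order, then feed its hidden states to a standard Transformer encoder with no positional embeddings. A plain GRU stands in for the paper's pyramidal PRE module, and all sizes are assumptions.

```python
# Minimal sketch of the general idea: encode temporal order with a recurrent
# embedding instead of positional embeddings, then apply a standard Transformer
# encoder. The GRU here is a stand-in, not the paper's pyramidal PRE module.
import torch
import torch.nn as nn


class RecurrentEmbeddingTransformer(nn.Module):
    def __init__(self, n_features: int, d_model: int = 64, horizon: int = 24):
        super().__init__()
        self.recurrent = nn.GRU(n_features, d_model, batch_first=True)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x):                  # x: (batch, length, n_features)
        h, _ = self.recurrent(x)           # order is captured by the recurrence
        h = self.encoder(h)                # no positional embeddings added
        return self.head(h[:, -1])         # forecast the next `horizon` steps


y_hat = RecurrentEmbeddingTransformer(n_features=7)(torch.randn(8, 96, 7))
print(y_hat.shape)                         # torch.Size([8, 24])
```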
arXiv Detail & Related papers (2024-08-20T01:56:07Z)
- xLSTMTime: Long-term Time Series Forecasting With xLSTM [0.0]
This paper presents an adaptation of a recent architecture termed extended LSTM (xLSTM) for time series forecasting.
We compare xLSTMTime's performance against various state-of-the-art models across multiple real-world datasets.
Our findings suggest that refined recurrent architectures can offer competitive alternatives to transformer-based models in time series forecasting.
arXiv Detail & Related papers (2024-07-14T15:15:00Z)
- Probing the limit of hydrologic predictability with the Transformer network [7.326504492614808]
We show that a vanilla Transformer architecture is not competitive against LSTM on the widely benchmarked CAMELS dataset.
A recurrence-free variant of Transformer can obtain mixed comparisons with LSTM, producing the same Kling-Gupta efficiency coefficient (KGE) along with other metrics.
While the Transformer results are not higher than current state-of-the-art, we still learned some valuable lessons.
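For reference, the Kling-Gupta efficiency used in the comparison above is a standard hydrologic metric; a short implementation of its usual definition (Gupta et al., 2009) follows, with purely illustrative toy arrays.

```python
# Kling-Gupta efficiency (Gupta et al., 2009), the KGE metric mentioned above:
# KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the
# Pearson correlation, alpha the ratio of standard deviations, and beta the
# ratio of means between simulated and observed series. KGE = 1 is a perfect fit.
import numpy as np


def kling_gupta_efficiency(sim: np.ndarray, obs: np.ndarray) -> float:
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(kling_gupta_efficiency(obs * 1.1, obs))   # ~0.86 for this biased simulation
```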
arXiv Detail & Related papers (2023-06-21T17:06:54Z)
- CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of channel-independent (CI) Transformers in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture both temporal correlations among signals and dependence among multiple variables over time.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
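As a concrete, deliberately generic example of what a robust forecasting loss can look like, the sketch below combines a Huber penalty with a horizon-dependent weighting; both choices are assumptions for illustration and are not claimed to reproduce CARD's actual objective.

```python
# Illustrative stand-in, not CARD's exact objective: a robust forecasting loss
# that (i) uses a Huber penalty so occasional outliers do not dominate training
# and (ii) down-weights distant horizons, which are typically noisier.
import torch


def robust_horizon_loss(pred: torch.Tensor, target: torch.Tensor,
                        delta: float = 1.0) -> torch.Tensor:
    # pred, target: (batch, horizon)
    horizon = pred.shape[1]
    err = pred - target
    huber = torch.where(err.abs() <= delta,
                        0.5 * err ** 2,
                        delta * (err.abs() - 0.5 * delta))
    # assumed weighting: 1/sqrt(h) for forecast step h = 1..horizon
    weights = 1.0 / torch.sqrt(torch.arange(1, horizon + 1, dtype=pred.dtype))
    return (huber * weights).mean()


loss = robust_horizon_loss(torch.randn(8, 96), torch.randn(8, 96))
print(loss.item())
```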
arXiv Detail & Related papers (2023-05-20T05:16:31Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity because of their self-attention mechanism, which is, however, computationally expensive.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- Learning Bounded Context-Free-Grammar via LSTM and the Transformer: Difference and Explanations [51.77000472945441]
Long Short-Term Memory (LSTM) and Transformers are two popular neural architectures used for natural language processing tasks.
In practice, it is often observed that Transformer models have better representation power than LSTM.
We study such practical differences between LSTM and Transformer and propose an explanation based on their latent space decomposition patterns.
arXiv Detail & Related papers (2021-12-16T19:56:44Z)
- Wake Word Detection with Streaming Transformers [72.66551640048405]
Our experiments on the Mobvoi wake word dataset demonstrate that the proposed Transformer model outperforms the baseline convolutional network by 25% on average in false rejection rate at the same false alarm rate.
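To make the reported metric concrete, the following sketch computes a false rejection rate at a fixed false alarm rate from raw detector scores; the score distributions and the 1% operating point are assumptions for illustration, not the paper's setup.

```python
# Sketch of the evaluation quoted above: false rejection rate (FRR) measured at
# a fixed false alarm rate (FAR). Scores and the 1%-FAR operating point are
# illustrative assumptions, not the paper's setup.
import numpy as np


def frr_at_far(pos_scores: np.ndarray, neg_scores: np.ndarray,
               target_far: float = 0.01) -> float:
    # Threshold chosen so that roughly `target_far` of negatives score above it.
    threshold = np.quantile(neg_scores, 1.0 - target_far)
    return float(np.mean(pos_scores < threshold))   # rejected wake-word utterances


rng = np.random.default_rng(0)
pos = rng.normal(2.0, 1.0, 10_000)    # detector scores on wake-word audio
neg = rng.normal(0.0, 1.0, 100_000)   # detector scores on background audio
print(frr_at_far(pos, neg, target_far=0.01))
```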
arXiv Detail & Related papers (2021-02-08T19:14:32Z)
- Future Vector Enhanced LSTM Language Model for LVCSR [67.03726018635174]
This paper proposes a novel enhanced long short-term memory (LSTM) LM using the future vector.
Experiments show that the proposed new LSTM LM achieves better BLEU scores for long-term sequence prediction.
Rescoring using both the new and conventional LSTM LMs can achieve a very large improvement in word error rate.
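As an illustration of the rescoring step, the sketch below re-ranks an n-best list by combining acoustic scores with an interpolation of two language models' log-probabilities; the weights and the stand-in scorers are assumptions, not the paper's recipe.

```python
# Generic n-best rescoring sketch: combine acoustic scores with an interpolation
# of two language models' log-probabilities and pick the best hypothesis. The
# weights and the scoring functions are assumptions, not the paper's recipe.
from typing import Callable, List, Tuple


def rescore(nbest: List[Tuple[str, float]],
            lm_a: Callable[[str], float],
            lm_b: Callable[[str], float],
            lam: float = 0.5, lm_weight: float = 0.8) -> str:
    def total(hyp: str, acoustic: float) -> float:
        lm_score = lam * lm_a(hyp) + (1.0 - lam) * lm_b(hyp)
        return acoustic + lm_weight * lm_score
    return max(nbest, key=lambda h: total(h[0], h[1]))[0]


# Toy usage with stand-in scorers (e.g. a conventional vs. an enhanced LSTM LM).
nbest = [("recognize speech", -12.0), ("wreck a nice beach", -11.5)]
lm_conventional = lambda h: -2.0 if "speech" in h else -6.0
lm_enhanced = lambda h: -1.5 if "speech" in h else -7.0
print(rescore(nbest, lm_conventional, lm_enhanced))   # "recognize speech"
```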
arXiv Detail & Related papers (2020-07-31T08:38:56Z)
- Transformer Networks for Trajectory Forecasting [11.802437934289062]
We propose the novel use of Transformer Networks for trajectory forecasting.
This is a fundamental switch from the sequential, step-by-step processing of LSTMs to the attention-only memory mechanisms of Transformers.
arXiv Detail & Related papers (2020-03-18T09:17:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.