Transformers versus LSTMs for electronic trading
- URL: http://arxiv.org/abs/2309.11400v1
- Date: Wed, 20 Sep 2023 15:25:43 GMT
- Title: Transformers versus LSTMs for electronic trading
- Authors: Paul Bilokon and Yitao Qiu
- Abstract summary: This study investigates whether Transformer-based models can be applied to financial time series prediction and outperform LSTM.
A new LSTM-based model called DLSTM is built, and a new architecture for the Transformer-based models is designed to adapt them to financial prediction.
The experimental results show that the Transformer-based models have only a limited advantage in absolute price sequence prediction.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid development of artificial intelligence, long short-term memory (LSTM), a kind of recurrent neural network (RNN), has been widely applied in time series prediction.
Like RNNs, the Transformer is designed to handle sequential data. Since the Transformer achieved great success in Natural Language Processing (NLP), researchers have become interested in its performance on time series prediction, and many Transformer-based solutions for long time series forecasting have appeared recently. However, when it comes to financial time series prediction, LSTM is still the dominant architecture. Therefore, the question this study aims to answer is whether Transformer-based models can be applied to financial time series prediction and outperform LSTM.
To answer this question, various LSTM-based and Transformer-based models are compared on multiple financial prediction tasks based on high-frequency limit order book data. A new LSTM-based model called DLSTM is built, and a new architecture for the Transformer-based models is designed to adapt them to financial prediction. The experimental results show that the Transformer-based models have only a limited advantage in absolute price sequence prediction, while the LSTM-based models show better and more robust performance on difference sequence prediction, such as price difference and price movement (a toy construction of such prediction targets is sketched below).
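To make the difference-sequence tasks concrete, the following is a minimal, hypothetical sketch (not the authors' code; DLSTM and the paper's Transformer variant are not reproduced here) of how price-difference and price-movement targets might be derived from limit order book mid-prices, with a plain LSTM regressor as a stand-in baseline. The window length, synthetic data, and model sizes are illustrative assumptions.

```python
# Minimal, illustrative sketch (not the paper's code): build difference-sequence
# targets from limit-order-book mid-prices and fit a plain LSTM baseline on them.
# Window length, synthetic data, and model sizes are assumptions for illustration.
import numpy as np
import torch
import torch.nn as nn


def make_targets(best_bid: np.ndarray, best_ask: np.ndarray, horizon: int = 1):
    """Return mid-prices, price differences, and up/flat/down movement labels."""
    mid = (best_bid + best_ask) / 2.0
    diff = mid[horizon:] - mid[:-horizon]           # price-difference target
    movement = np.sign(diff).astype(np.int64) + 1   # 0 = down, 1 = flat, 2 = up
    return mid, diff, movement


def make_windows(series: np.ndarray, targets: np.ndarray, window: int = 50):
    """Slice the series into (window, 1) inputs aligned with the next difference."""
    xs, ys = [], []
    for t in range(window - 1, len(targets)):
        xs.append(series[t - window + 1 : t + 1, None])  # prices up to time t
        ys.append(targets[t])                            # change from t to t + horizon
    x = torch.tensor(np.array(xs), dtype=torch.float32)
    y = torch.tensor(np.array(ys), dtype=torch.float32)
    return x, y


class LSTMRegressor(nn.Module):
    """Plain LSTM regressor used as a stand-in baseline (not the paper's DLSTM)."""

    def __init__(self, input_size: int = 1, hidden_size: int = 64, num_layers: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)                         # (batch, window, hidden)
        return self.head(out[:, -1, :]).squeeze(-1)   # next price difference


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bid = 100 + np.cumsum(rng.normal(0.0, 0.01, size=5_000))
    ask = bid + 0.02                                  # synthetic quotes, fixed spread
    mid, diff, movement = make_targets(bid, ask)      # movement could feed a classifier
    x, y = make_windows(mid, diff, window=50)
    model = LSTMRegressor()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(3):                                # a few toy full-batch steps
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```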
Related papers
- LSEAttention is All You Need for Time Series Forecasting [0.0]
Transformer-based architectures have achieved remarkable success in natural language processing and computer vision.
I introduce LSEAttention, an approach designed to address entropy collapse and training instability commonly observed in transformer models.
arXiv Detail & Related papers (2024-10-31T09:09:39Z)
- Beam Prediction based on Large Language Models [51.45077318268427]
Millimeter-wave (mmWave) communication is promising for next-generation wireless networks but suffers from significant path loss.
Traditional deep learning models, such as long short-term memory (LSTM), enhance beam tracking accuracy but are limited by poor robustness and generalization.
In this letter, we use large language models (LLMs) to improve the robustness of beam prediction.
arXiv Detail & Related papers (2024-08-16T12:40:01Z)
- xLSTMTime: Long-term Time Series Forecasting With xLSTM [0.0]
This paper presents an adaptation of a recent architecture termed extended LSTM (xLSTM) for time series forecasting.
We compare xLSTMTime's performance against various state-of-the-art models across multiple real-world datasets.
Our findings suggest that refined recurrent architectures can offer competitive alternatives to transformer-based models in time series forecasting.
arXiv Detail & Related papers (2024-07-14T15:15:00Z)
- Probing the limit of hydrologic predictability with the Transformer network [7.326504492614808]
We show that a vanilla Transformer architecture is not competitive against LSTM on the widely benchmarked CAMELS dataset.
A recurrence-free variant of the Transformer can obtain mixed comparisons with LSTM, producing the same Kling-Gupta efficiency coefficient (KGE) along with other metrics (a short snippet showing how KGE is computed follows this entry).
While the Transformer results are not higher than current state-of-the-art, we still learned some valuable lessons.
arXiv Detail & Related papers (2023-06-21T17:06:54Z)
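For readers unfamiliar with the metric mentioned above, here is a short, illustrative helper (not from the paper) that computes the Kling-Gupta efficiency from simulated and observed series using its standard definition, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the Pearson correlation, alpha the ratio of standard deviations, and beta the ratio of means.

```python
# Illustrative helper (not from the paper): Kling-Gupta efficiency (KGE),
# KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the
# Pearson correlation between simulation and observation, alpha = std(sim)/std(obs),
# and beta = mean(sim)/mean(obs). A perfect simulation scores 1.0.
import numpy as np


def kling_gupta_efficiency(sim: np.ndarray, obs: np.ndarray) -> float:
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = np.std(sim) / np.std(obs)
    beta = np.mean(sim) / np.mean(obs)
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


print(kling_gupta_efficiency(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 3.0])))  # 1.0
```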
- Temporal Fusion Transformers for Streamflow Prediction: Value of Combining Attention with Recurrence [0.0]
This work tests the hypothesis that combining recurrence with attention can improve streamflow prediction.
We set up the Temporal Fusion Transformer (TFT) architecture, a model that combines both of these aspects and had not previously been applied in hydrology (a simplified recurrence-plus-attention sketch follows this entry).
Our results demonstrate that TFT indeed exceeds the performance benchmark set by the LSTM and Transformers for streamflow prediction.
arXiv Detail & Related papers (2023-05-21T03:58:16Z)
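For intuition about what "combining recurrence with attention" can look like in code, here is a heavily simplified, hypothetical sketch under my own assumptions; it is not the actual TFT, which additionally uses gating layers, variable selection networks, and quantile outputs.

```python
# Heavily simplified recurrence-plus-attention sketch (not the real TFT):
# an LSTM encodes the input sequence, multi-head self-attention mixes its
# hidden states, and a linear head produces the forecast.
import torch
import torch.nn as nn


class RecurrenceAttentionForecaster(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64, heads: int = 4, horizon: int = 1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden)
        self.head = nn.Linear(hidden, horizon)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_features)
        h, _ = self.lstm(x)            # local (recurrent) processing
        a, _ = self.attn(h, h, h)      # long-range (attention) mixing
        z = self.norm(h + a)           # residual connection
        return self.head(z[:, -1, :])  # forecast from the last step


# Example: batch of 8 sequences, 30 time steps, 5 covariates, 1-step forecast.
model = RecurrenceAttentionForecaster(n_features=5)
y_hat = model(torch.randn(8, 30, 5))   # -> shape (8, 1)
```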
- CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting [50.23240107430597]
We design a special Transformer, i.e., the Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of CI-type (channel-independent) Transformers in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture temporal correlations among signals.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity because of the self-attention mechanism, despite its high computational cost.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- Learning Bounded Context-Free-Grammar via LSTM and the Transformer: Difference and Explanations [51.77000472945441]
Long Short-Term Memory (LSTM) and Transformers are two popular neural architectures used for natural language processing tasks.
In practice, it is often observed that Transformer models have better representation power than LSTM.
We study such practical differences between LSTM and Transformer and propose an explanation based on their latent space decomposition patterns.
arXiv Detail & Related papers (2021-12-16T19:56:44Z)
- Wake Word Detection with Streaming Transformers [72.66551640048405]
Our experiments on the Mobvoi wake word dataset demonstrate that the proposed Transformer model outperforms the baseline convolutional network by 25% on average in false rejection rate at the same false alarm rate.
arXiv Detail & Related papers (2021-02-08T19:14:32Z)
- Future Vector Enhanced LSTM Language Model for LVCSR [67.03726018635174]
This paper proposes a novel enhanced long short-term memory (LSTM) LM using the future vector.
Experiments show that the proposed new LSTM LM achieves better BLEU scores for long-term sequence prediction.
Rescoring with both the new and conventional LSTM LMs can achieve a very large improvement in word error rate.
arXiv Detail & Related papers (2020-07-31T08:38:56Z)
- Transformer Networks for Trajectory Forecasting [11.802437934289062]
We propose the novel use of Transformer Networks for trajectory forecasting.
This is a fundamental switch from the sequential step-by-step processing of LSTMs to the attention-only memory mechanisms of Transformers (the contrast is illustrated in the sketch below).
arXiv Detail & Related papers (2020-03-18T09:17:49Z)
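To make the contrast between the two processing styles concrete, the following is a minimal, hypothetical sketch (not code from any paper listed here): the LSTM folds a trajectory into a recurrent state one step at a time, while a Transformer encoder attends over all observed positions at once. Dimensions and layer sizes are illustrative assumptions.

```python
# Minimal contrast sketch (illustrative only): step-by-step LSTM rollout vs.
# a Transformer encoder attending over the whole observed trajectory at once.
import torch
import torch.nn as nn

obs = torch.randn(8, 12, 2)   # batch of 8 trajectories, 12 observed (x, y) points

# Recurrent view: positions are folded into a hidden state one step at a time.
lstm = nn.LSTM(input_size=2, hidden_size=32, batch_first=True)
state = None
for t in range(obs.size(1)):
    _, state = lstm(obs[:, t : t + 1, :], state)  # state carries the history
h_last = state[0][-1]                             # (8, 32) summary of the past

# Attention view: every position can attend to every other position directly.
embed = nn.Linear(2, 32)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=2,
)
z = encoder(embed(obs))                           # (8, 12, 32), no recurrent state
head = nn.Linear(32, 2)
next_point = head(z[:, -1, :])                    # toy one-step position forecast
```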