Temporal Fusion Transformers for Streamflow Prediction: Value of
Combining Attention with Recurrence
- URL: http://arxiv.org/abs/2305.12335v1
- Date: Sun, 21 May 2023 03:58:16 GMT
- Title: Temporal Fusion Transformers for Streamflow Prediction: Value of
Combining Attention with Recurrence
- Authors: Sinan Rasiya Koya and Tirthankar Roy
- Abstract summary: This work tests the hypothesis that combining recurrence with attention can improve streamflow prediction.
We set up the Temporal Fusion Transformer (TFT) architecture, a model that combines both of these aspects and has never been applied in hydrology before.
Our results demonstrate that TFT indeed exceeds the performance benchmark set by the LSTM and Transformers for streamflow prediction.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past few decades, the hydrology community has witnessed notable
advancements in streamflow prediction, particularly with the introduction of
cutting-edge machine-learning algorithms. Recurrent neural networks, especially
Long Short-Term Memory (LSTM) networks, have become popular due to their
capacity to create precise forecasts and realistically mimic the system
dynamics. Attention-based models, such as Transformers, can learn from the
entire data sequence concurrently, a feature that LSTM does not have. This work
tests the hypothesis that combining recurrence with attention can improve
streamflow prediction. We set up the Temporal Fusion Transformer (TFT)
architecture, a model that combines both of these aspects and has never been
applied in hydrology before. We compare the performance of LSTM, Transformers,
and TFT over 2,610 globally distributed catchments from the recently available
Caravan dataset. Our results demonstrate that TFT indeed exceeds the
performance benchmark set by the LSTM and Transformers for streamflow
prediction. Additionally, being an explainable AI method, TFT helps in gaining
insights into the streamflow generation processes.
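To make the architectural idea concrete, the sketch below combines an LSTM encoder (recurrence) with multi-head self-attention applied over the encoded sequence, which is the general recurrence-plus-attention pattern that TFT builds on. This is a minimal illustration in PyTorch, not the authors' exact TFT implementation; the class name, layer sizes, and input shapes are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class RecurrentAttentionRegressor(nn.Module):
    """Minimal recurrence-plus-attention sketch (illustrative, not the TFT itself)."""
    def __init__(self, n_features: int, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.lstm = nn.LSTM(n_features, d_model, batch_first=True)              # recurrence
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)   # attention
        self.norm = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):                # x: (batch, time, n_features) forcing sequence
        h, _ = self.lstm(x)              # sequential (recurrent) encoding of the inputs
        a, _ = self.attn(h, h, h)        # attend over the entire encoded sequence at once
        h = self.norm(h + a)             # residual connection around the attention block
        return self.head(h[:, -1])       # predicted streamflow for the final time step

# Usage with hypothetical shapes: batch of 8 catchments, 365 daily steps, 5 forcing variables.
model = RecurrentAttentionRegressor(n_features=5)
q_hat = model(torch.randn(8, 365, 5))    # -> (8, 1)
```

In the actual TFT, additional components such as variable selection networks, gated residual layers, and quantile outputs sit around this recurrence-plus-attention core; the sketch only shows the combination the abstract highlights.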
Related papers
- PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
The self-attention mechanism in the Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
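For context, the positional embeddings referred to above are commonly the sinusoidal encodings of the original Transformer, added to the inputs so that order information survives the otherwise order-agnostic self-attention. The snippet below is a minimal NumPy sketch of that standard encoding, not PRformer's PRE module; the function name and shapes are illustrative.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Standard sinusoidal encoding added to inputs so self-attention can see order."""
    pos = np.arange(seq_len)[:, None]                            # positions, shape (seq_len, 1)
    i = np.arange(d_model)[None, :]                              # dimensions, shape (1, d_model)
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)   # per-dimension angle rates
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])                        # even dimensions: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])                        # odd dimensions: cosine
    return pe

# Usage: an encoding for 365 daily time steps and a 64-dimensional model.
pe = sinusoidal_positional_encoding(365, 64)                     # shape (365, 64)
```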
arXiv Detail & Related papers (2024-08-20T01:56:07Z) - A Parsimonious Setup for Streamflow Forecasting using CNN-LSTM [0.0]
We extend the application of CNN-LSTMs to time series settings, leveraging lagged streamflow data to predict streamflow.
Our results show a substantial improvement in predictive performance in 21 out of 32 HUC8 basins in Nebraska.
arXiv Detail & Related papers (2024-04-11T17:10:57Z) - Probing the limit of hydrologic predictability with the Transformer
network [7.326504492614808]
We show that a vanilla Transformer architecture is not competitive against LSTM on the widely benchmarked CAMELS dataset.
A recurrence-free variant of the Transformer obtains mixed results against LSTM, matching its Kling-Gupta efficiency (KGE) along with other metrics.
While the Transformer results are not higher than current state-of-the-art, we still learned some valuable lessons.
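For reference, the Kling-Gupta efficiency mentioned above combines correlation, a variability ratio, and a bias ratio into a single score, with 1 indicating a perfect fit. Below is a minimal NumPy sketch of the standard Gupta et al. (2009) formulation; the function and variable names are illustrative.

```python
import numpy as np

def kge(sim: np.ndarray, obs: np.ndarray) -> float:
    """Kling-Gupta efficiency (Gupta et al., 2009); 1 indicates a perfect fit."""
    r = np.corrcoef(sim, obs)[0, 1]        # linear correlation between simulation and observation
    alpha = np.std(sim) / np.std(obs)      # variability ratio
    beta = np.mean(sim) / np.mean(obs)     # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Usage with hypothetical simulated and observed daily streamflow series.
print(kge(np.array([1.0, 2.0, 3.0, 4.0]), np.array([1.1, 1.9, 3.2, 3.8])))
```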
arXiv Detail & Related papers (2023-06-21T17:06:54Z) - CARD: Channel Aligned Robust Blend Transformer for Time Series
Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of channel-independent (CI) Transformers in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture both temporal correlations among signals and dynamical dependence among multiple variables over time.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z) - Towards Long-Term Time-Series Forecasting: Feature, Pattern, and
Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted for their high prediction capacity, enabled by the self-attention mechanism, though at considerable computational cost.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z) - Predicting the temporal dynamics of turbulent channels through deep
learning [0.0]
We aim to assess the capability of neural networks to reproduce the temporal evolution of a minimal turbulent channel flow.
Long short-term memory (LSTM) networks and a Koopman-based framework (KNF) are trained to predict the temporal dynamics of the minimal-channel-flow modes.
arXiv Detail & Related papers (2022-03-02T09:31:03Z) - Learning Bounded Context-Free-Grammar via LSTM and the
Transformer:Difference and Explanations [51.77000472945441]
Long Short-Term Memory (LSTM) and Transformers are two popular neural architectures used for natural language processing tasks.
In practice, it is often observed that Transformer models have better representation power than LSTM.
We study such practical differences between LSTM and Transformer and propose an explanation based on their latent space decomposition patterns.
arXiv Detail & Related papers (2021-12-16T19:56:44Z) - GMFlow: Learning Optical Flow via Global Matching [124.57850500778277]
We propose a GMFlow framework for learning optical flow estimation.
It consists of three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for global feature matching, and a self-attention layer for flow propagation.
Our new framework outperforms 32-iteration RAFT on the challenging Sintel benchmark.
arXiv Detail & Related papers (2021-11-26T18:59:56Z) - Machine Learning for Postprocessing Ensemble Streamflow Forecasts [0.0]
We integrate dynamical modeling with machine learning to demonstrate the enhanced quality of streamflow forecasts at short- to medium-range lead times (1-7 days).
We employ a Long Short-Term Memory (LSTM) neural network to correct forecast biases in raw ensemble streamflow forecasts obtained from dynamical modeling.
The verification results show that the LSTM can improve streamflow forecasts relative to climatological, temporal persistence, deterministic, and raw ensemble forecasts.
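As a rough illustration of this kind of postprocessing (a sketch of the general setup, not the paper's exact configuration), an LSTM can map a window of raw forecasts and recent observations to bias-corrected streamflow; all names and shapes below are hypothetical.

```python
import torch
import torch.nn as nn

class LSTMPostprocessor(nn.Module):
    """Maps raw ensemble-mean forecasts plus recent observations to corrected flow."""
    def __init__(self, n_inputs: int = 2, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_inputs, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):          # x: (batch, lead_time, [raw_forecast, recent_observation])
        h, _ = self.lstm(x)
        return self.out(h)         # bias-corrected streamflow at each lead time

# Usage with hypothetical shapes: 16 forecasts, 7 daily lead times, 2 input features.
corrected = LSTMPostprocessor()(torch.randn(16, 7, 2))   # -> (16, 7, 1)
```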
arXiv Detail & Related papers (2021-06-15T18:46:30Z) - Wake Word Detection with Streaming Transformers [72.66551640048405]
Our experiments on the Mobvoi wake word dataset demonstrate that the proposed Transformer model outperforms the baseline convolutional network by 25% on average in false rejection rate at the same false alarm rate.
arXiv Detail & Related papers (2021-02-08T19:14:32Z)