Parallel Spatio-Temporal Attention-Based TCN for Multivariate Time
Series Prediction
- URL: http://arxiv.org/abs/2203.00971v1
- Date: Wed, 2 Mar 2022 09:27:56 GMT
- Title: Parallel Spatio-Temporal Attention-Based TCN for Multivariate Time
Series Prediction
- Authors: Fan Jin, Ke Zhang, Yipan Huang, Yifei Zhu, Baiping Chen
- Abstract summary: Recurrent neural networks with attention, used to help extend the prediction window, are the current state of the art for this task.
We argue that their vanishing gradients, short memories, and serial architecture make RNNs fundamentally unsuited to long-horizon forecasting with complex data.
We propose a framework, called PSTA-TCN, that combines a parallel spatio-temporal attention mechanism, which extracts dynamic internal correlations, with stacked TCN backbones.
- Score: 4.211344046281808
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As industrial systems become more complex and monitoring sensors for
everything from surveillance to our health become more ubiquitous, multivariate
time series prediction is taking an important place in the smooth-running of
our society. Recurrent neural networks with attention, used to help extend the
prediction window, are the current state of the art for this task. However, we
argue that their vanishing gradients, short memories, and serial architecture
make RNNs fundamentally unsuited to long-horizon forecasting with complex data.
Temporal convolutional networks (TCNs) do not suffer from gradient problems and
they support parallel calculations, making them a more appropriate choice.
Additionally, they have longer memories than RNNs, albeit with some instability
and efficiency problems. Hence, we propose a framework, called PSTA-TCN, that
combines a parallel spatio-temporal attention mechanism to extract dynamic
internal correlations with stacked TCN backbones to extract features from
different window sizes. The framework makes full use of parallel calculations to
dramatically reduce training times, while substantially increasing accuracy
with stable prediction windows up to 13 times longer than the status quo.
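For orientation, here is a minimal sketch of the general idea: a stack of dilated causal convolutions (the TCN backbone) combined with two attention branches computed in parallel, one over the variable (spatial) axis and one over the time axis. All module choices, sizes, and the additive fusion below are illustrative assumptions, not the authors' exact architecture.

```python
# Illustrative sketch only: a TCN backbone with parallel spatial and temporal
# attention branches. Layer sizes, the attention form, and the additive fusion
# are assumptions made for this example, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """1D convolution that only looks at past time steps (left/causal padding)."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                      # x: (batch, channels, time)
        x = F.pad(x, (self.pad, 0))            # pad only on the left (the past)
        return self.conv(x)

class ParallelSTAttention(nn.Module):
    """Spatial attention (over variables) and temporal attention (over steps),
    computed in parallel and fused by addition."""
    def __init__(self, n_vars, n_steps):
        super().__init__()
        self.spatial = nn.Linear(n_steps, n_steps)   # scores along the variable axis
        self.temporal = nn.Linear(n_vars, n_vars)    # scores along the time axis

    def forward(self, x):                      # x: (batch, n_vars, n_steps)
        a_sp = torch.softmax(self.spatial(x), dim=1)          # weight variables
        a_tm = torch.softmax(self.temporal(x.transpose(1, 2)), dim=1)
        a_tm = a_tm.transpose(1, 2)                            # weight time steps
        return x * a_sp + x * a_tm              # two parallel branches, fused

class PSTATCNSketch(nn.Module):
    def __init__(self, n_vars, n_steps, levels=3):
        super().__init__()
        self.attn = ParallelSTAttention(n_vars, n_steps)
        layers = []
        for i in range(levels):                 # exponentially growing dilation
            layers += [CausalConv1d(n_vars, dilation=2 ** i), nn.ReLU()]
        self.tcn = nn.Sequential(*layers)
        self.head = nn.Linear(n_steps, 1)       # one-step-ahead prediction

    def forward(self, x):                       # x: (batch, n_vars, n_steps)
        h = self.tcn(self.attn(x))
        return self.head(h)                     # (batch, n_vars, 1)

# Example: 8 sensors, 64 past steps, predict the next value per sensor.
y = PSTATCNSketch(n_vars=8, n_steps=64)(torch.randn(4, 8, 64))
print(y.shape)  # torch.Size([4, 8, 1])
```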
Related papers
- Parallelized Spatiotemporal Binding [47.67393266882402]
We introduce Parallelizable Spatiotemporal Binder or PSB, the first temporally-parallelizable slot learning architecture for sequential inputs.
Unlike conventional RNN-based approaches, PSB produces object-centric representations, known as slots, for all time-steps in parallel.
Compared to the state-of-the-art, our architecture demonstrates stable training on longer sequences, achieves parallelization that results in a 60% increase in training speed, and yields performance that is on par with or better on unsupervised 2D and 3D object-centric scene decomposition and understanding.
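A toy sketch of the temporally-parallel idea, assuming nothing about PSB's internals beyond the abstract: time is folded into the batch axis so that learned slot queries cross-attend to every frame at once, with no recurrence across steps. The single cross-attention step and all sizes are illustrative assumptions.

```python
# Toy sketch of temporally-parallel slot extraction: time is folded into the
# batch axis so every frame's slots are computed at once, with no recurrence.
# The single cross-attention step and all sizes are illustrative assumptions.
import torch
import torch.nn as nn

class ParallelSlotSketch(nn.Module):
    def __init__(self, n_slots=4, feat_dim=32):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(n_slots, feat_dim))   # learned queries
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)

    def forward(self, frames):                   # frames: (B, T, N_tokens, D)
        B, T, N, D = frames.shape
        flat = frames.reshape(B * T, N, D)       # fold time into the batch axis
        q = self.slots.unsqueeze(0).expand(B * T, -1, -1)
        slots, _ = self.attn(q, flat, flat)      # slots attend to frame tokens
        return slots.reshape(B, T, -1, D)        # (B, T, n_slots, D)

slots = ParallelSlotSketch()(torch.randn(2, 10, 49, 32))
print(slots.shape)  # torch.Size([2, 10, 4, 32])
```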
arXiv Detail & Related papers (2024-02-26T23:16:34Z)
- Time-Parameterized Convolutional Neural Networks for Irregularly Sampled Time Series [26.77596449192451]
Irregularly sampled time series are ubiquitous in several application domains, leading to sparse, not fully-observed and non-aligned observations.
Standard sequential neural networks, such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs), assume regular spacing between observation times, which poses significant challenges for irregular time series modeling.
We parameterize convolutional layers by employing time-explicit kernels that account for irregular observation times.
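A minimal sketch of one way a kernel can be an explicit function of observation times (the Gaussian time function below is an assumption, not the paper's parameterization): each neighbouring observation is weighted by a function of its time gap to the query point, so irregular spacing directly changes the weights.

```python
# Minimal sketch of a time-parameterized convolution over irregular samples:
# the weight on each neighbouring observation is an explicit function of its
# time gap to the query point. The Gaussian time function and the bandwidth
# are illustrative assumptions, not the paper's kernel.
import numpy as np

def time_param_conv(t_obs, x_obs, t_query, bandwidth=0.5):
    """t_obs, x_obs: irregular observation times/values; t_query: output times."""
    out = np.empty_like(t_query, dtype=float)
    for i, tq in enumerate(t_query):
        gaps = tq - t_obs                         # signed time differences
        w = np.exp(-(gaps / bandwidth) ** 2)      # weight derived from the time gap
        w[gaps < 0] = 0.0                         # causal: ignore future observations
        out[i] = (w @ x_obs) / (w.sum() + 1e-8)   # normalized weighted sum
    return out

t_obs = np.array([0.0, 0.3, 1.1, 1.2, 2.7])       # irregular sampling times
x_obs = np.sin(t_obs)
print(time_param_conv(t_obs, x_obs, t_query=np.array([1.0, 2.0, 3.0])))
```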
arXiv Detail & Related papers (2023-08-06T21:10:30Z)
- A Distance Correlation-Based Approach to Characterize the Effectiveness of Recurrent Neural Networks for Time Series Forecasting [1.9950682531209158]
We provide an approach to link time series characteristics with RNN components via the versatile metric of distance correlation.
We empirically show that the RNN activation layers learn the lag structures of time series well.
We also show that the activation layers cannot adequately model moving average and heteroskedastic time series processes.
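Distance correlation itself is a standard statistic (Székely et al.); a compact NumPy version is sketched below so the kind of comparison made in the paper, lagged inputs versus network activations, can be reproduced in spirit. The random-walk example is purely illustrative.

```python
# Distance correlation: a standard dependence measure between two samples,
# sketched here in NumPy. The paper uses it to relate RNN activations to
# lagged inputs; the toy random-walk pairing below is only illustrative.
import numpy as np

def _centered_dist(z):
    z = np.asarray(z, dtype=float).reshape(len(z), -1)
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)   # pairwise distances
    return d - d.mean(0) - d.mean(1)[:, None] + d.mean()         # double centering

def distance_correlation(x, y):
    a, b = _centered_dist(x), _centered_dist(y)
    dcov2 = (a * b).mean()                       # squared distance covariance
    dvar_x, dvar_y = (a * a).mean(), (b * b).mean()
    return np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y + 1e-12))

# Toy check: a series is strongly dependent on its own first lag.
rng = np.random.default_rng(0)
s = np.cumsum(rng.standard_normal(500))                       # random-walk series
print(distance_correlation(s[1:], s[:-1]))                    # close to 1
print(distance_correlation(s[1:], rng.standard_normal(499)))  # near 0
```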
arXiv Detail & Related papers (2023-07-28T22:32:08Z)
- Resurrecting Recurrent Neural Networks for Long Sequences [45.800920421868625]
Recurrent Neural Networks (RNNs) offer fast inference on long sequences but are hard to optimize and slow to train.
Deep state-space models (SSMs) have recently been shown to perform remarkably well on long sequence modeling tasks.
We show that careful design of deep RNNs using standard signal propagation arguments can recover the impressive performance of deep SSMs on long-range reasoning tasks.
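The key ingredient is a linear recurrence with per-channel decay rates initialized close to 1 so that information persists over long horizons; a simplified real-valued sketch follows. The paper's Linear Recurrent Unit uses complex eigenvalues and additional normalization, which are omitted here.

```python
# Simplified sketch of the linear diagonal recurrence behind "resurrected"
# deep RNNs: h_t = lambda * h_{t-1} + B x_t, with per-channel decays lambda
# initialized near 1 so signals survive long sequences. The real-valued
# diagonal form is a simplification of the paper's complex-valued LRU.
import torch
import torch.nn as nn

class LinearDiagonalRNN(nn.Module):
    def __init__(self, d_in, d_hidden):
        super().__init__()
        # lambda = exp(-exp(nu)) keeps 0 < lambda < 1 and close to 1 at init
        self.nu = nn.Parameter(torch.log(torch.rand(d_hidden) * 0.1 + 1e-3))
        self.inp = nn.Linear(d_in, d_hidden, bias=False)
        self.out = nn.Linear(d_hidden, d_in)

    def forward(self, x):                          # x: (batch, time, d_in)
        lam = torch.exp(-torch.exp(self.nu))       # per-channel decay in (0, 1)
        u = self.inp(x)
        h = torch.zeros(x.shape[0], lam.shape[0], device=x.device)
        ys = []
        for t in range(x.shape[1]):                # linear scan (parallelizable in principle)
            h = lam * h + u[:, t]
            ys.append(self.out(h))
        return torch.stack(ys, dim=1)              # (batch, time, d_in)

y = LinearDiagonalRNN(d_in=3, d_hidden=16)(torch.randn(2, 100, 3))
print(y.shape)  # torch.Size([2, 100, 3])
```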
arXiv Detail & Related papers (2023-03-11T08:53:11Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
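snnTorch is the package named in the abstract; assuming it is installed (pip install snntorch), the snippet below is a minimal CPU example of simulating a layer of leaky integrate-and-fire neurons with it. It shows only the basic API and none of the IPU-specific optimizations the paper is about.

```python
# Minimal snnTorch example: simulate a layer of leaky integrate-and-fire
# neurons over a few time steps on CPU. This only illustrates the package's
# basic usage; the paper's IPU-specific optimizations are not shown here.
import torch
import snntorch as snn

lif = snn.Leaky(beta=0.9)          # membrane decay factor
mem = lif.init_leaky()             # initial membrane potential
spikes = []
for step in range(20):
    cur = torch.rand(8)            # toy input current for 8 neurons
    spk, mem = lif(cur, mem)       # one simulation step: spikes + new membrane state
    spikes.append(spk)
print(torch.stack(spikes).sum())   # total spike count over the window
```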
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Space-Time Graph Neural Networks with Stochastic Graph Perturbations [100.31591011966603]
Space-time graph neural networks (ST-GNNs) learn efficient graph representations of time-varying data.
In this paper we revisit the properties of ST-GNNs and prove that they are stable to graph perturbations.
Our analysis suggests that ST-GNNs are suitable for transfer learning on time-varying graphs.
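A toy experiment in the spirit of that stability claim, with the filter and perturbation model chosen only for illustration: apply the same polynomial graph filter to a random graph and to a randomly edge-dropped copy, then compare the outputs.

```python
# Toy illustration of stability to graph perturbations: run one fixed
# polynomial graph filter on an adjacency matrix and on a randomly
# edge-dropped copy, then compare outputs. The filter and perturbation
# model are assumptions made only to illustrate the kind of claim analysed.
import numpy as np

rng = np.random.default_rng(1)
n = 50
A = (rng.random((n, n)) < 0.1).astype(float)
A = np.triu(A, 1)
A = A + A.T                                         # undirected random graph

def normalize(adj):
    deg = adj.sum(1) + 1e-8
    return adj / deg[:, None]                       # row-normalized shift operator

def graph_filter(adj, x, taps=(0.5, 0.3, 0.2)):
    s, out, z = normalize(adj), np.zeros_like(x), x
    for h in taps:                                  # polynomial graph filter
        out, z = out + h * z, s @ z
    return out

x = rng.standard_normal((n, 4))                     # node signals
drop = rng.random((n, n)) < 0.05                    # random edge perturbation
A_pert = A * ~(drop | drop.T)
diff = np.linalg.norm(graph_filter(A, x) - graph_filter(A_pert, x))
print(diff / np.linalg.norm(graph_filter(A, x)))    # relative output change
```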
arXiv Detail & Related papers (2022-10-28T16:59:51Z)
- Scalable Spatiotemporal Graph Neural Networks [14.415967477487692]
Graph neural networks (GNNs) are often the core component of the forecasting architecture.
In most spatiotemporal GNNs, the computational complexity scales up to a quadratic factor with the length of the sequence times the number of links in the graph.
We propose a scalable architecture that exploits an efficient encoding of both temporal and spatial dynamics.
arXiv Detail & Related papers (2022-09-14T09:47:38Z)
- Space-Time Graph Neural Networks [104.55175325870195]
We introduce space-time graph neural network (ST-GNN) to jointly process the underlying space-time topology of time-varying network data.
Our analysis shows that small variations in the network topology and time evolution of a system do not significantly affect the performance of ST-GNNs.
arXiv Detail & Related papers (2021-10-06T16:08:44Z)
- Multi-Graph Convolutional-Recurrent Neural Network (MGC-RNN) for Short-Term Forecasting of Transit Passenger Flow [7.9132565523269625]
Short-term forecasting of passenger flow is critical for transit management and crowd regulation.
An innovative deep learning approach, the Multi-Graph Convolutional-Recurrent Neural Network (MGC-RNN), is proposed to forecast passenger flow in urban rail transit systems.
arXiv Detail & Related papers (2021-07-28T08:41:12Z)
- Spatio-Temporal Graph Scattering Transform [54.52797775999124]
Graph neural networks may be impractical in some real-world scenarios due to a lack of sufficient high-quality training data.
We put forth a novel, mathematically designed framework to analyze spatio-temporal data.
arXiv Detail & Related papers (2020-12-06T19:49:55Z)
- Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior, and yield superior expressivity within the family of neural ordinary differential equations.
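One common way to write the liquid time-constant dynamics is dx/dt = -(1/tau + f(x, I)) x + f(x, I) A with a learned gate f; the forward-Euler sketch below integrates that ODE, with the gate network, sizes, and step counts chosen purely for illustration.

```python
# Forward-Euler sketch of a liquid time-constant cell. The state x follows
# dx/dt = -(1/tau + f(x, I)) * x + f(x, I) * A, where f is a small learned
# gate; the gate network, sizes, and step counts are illustrative choices.
import torch
import torch.nn as nn

class LTCCellSketch(nn.Module):
    def __init__(self, d_in, d_hidden, unfolds=6, dt=0.1):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(d_in + d_hidden, d_hidden), nn.Sigmoid())
        self.tau = nn.Parameter(torch.ones(d_hidden))     # base time constants
        self.A = nn.Parameter(torch.zeros(d_hidden))      # bias/reversal term
        self.unfolds, self.dt = unfolds, dt

    def forward(self, inp, x):                 # inp: (B, d_in), x: (B, d_hidden)
        for _ in range(self.unfolds):          # fixed-step Euler solver
            gate = self.f(torch.cat([inp, x], dim=-1))
            dx = -(1.0 / self.tau.abs().clamp(min=1e-3) + gate) * x + gate * self.A
            x = x + self.dt * dx
        return x

cell = LTCCellSketch(d_in=2, d_hidden=8)
x = torch.zeros(4, 8)
for t in range(50):                            # run the cell over a toy sequence
    x = cell(torch.randn(4, 2), x)
print(x.shape)  # torch.Size([4, 8])
```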
arXiv Detail & Related papers (2020-06-08T09:53:35Z)