Convolutional State Space Models for Long-Range Spatiotemporal Modeling
- URL: http://arxiv.org/abs/2310.19694v1
- Date: Mon, 30 Oct 2023 16:11:06 GMT
- Title: Convolutional State Space Models for Long-Range Spatiotemporal Modeling
- Authors: Jimmy T.H. Smith, Shalini De Mello, Jan Kautz, Scott W. Linderman,
Wonmin Byeon
- Abstract summary: ConvS5 is an efficient ConvSSM variant for long-range spatiotemporal modeling.
It significantly outperforms Transformers and ConvLSTM on a long-horizon Moving-MNIST experiment while training 3X faster than ConvLSTM and generating samples 400X faster than Transformers.
- Score: 65.0993000439043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effectively modeling long spatiotemporal sequences is challenging due to the
need to model complex spatial correlations and long-range temporal dependencies
simultaneously. ConvLSTMs attempt to address this by updating tensor-valued
states with recurrent neural networks, but their sequential computation makes
them slow to train. In contrast, Transformers can process an entire
spatiotemporal sequence, compressed into tokens, in parallel. However, the cost
of attention scales quadratically in length, limiting their scalability to
longer sequences. Here, we address the challenges of prior methods and
introduce convolutional state space models (ConvSSM) that combine the tensor
modeling ideas of ConvLSTM with the long sequence modeling approaches of state
space methods such as S4 and S5. First, we demonstrate how parallel scans can
be applied to convolutional recurrences to achieve subquadratic parallelization
and fast autoregressive generation. We then establish an equivalence between
the dynamics of ConvSSMs and SSMs, which motivates parameterization and
initialization strategies for modeling long-range dependencies. The result is
ConvS5, an efficient ConvSSM variant for long-range spatiotemporal modeling.
ConvS5 significantly outperforms Transformers and ConvLSTM on a long horizon
Moving-MNIST experiment while training 3X faster than ConvLSTM and generating
samples 400X faster than Transformers. In addition, ConvS5 matches or exceeds
the performance of state-of-the-art methods on challenging DMLab, Minecraft and
Habitat prediction benchmarks and enables new directions for modeling long
spatiotemporal sequences.
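The parallel-scan idea in the abstract can be illustrated with a minimal sketch: a linear recurrence x_t = a_t * x_{t-1} + b_t admits an associative combine operator, so all prefix states can be computed with any parallel scan primitive (e.g. a Blelloch scan) in logarithmic depth. The NumPy sketch below is an assumption-laden simplification, not the paper's implementation: it uses an elementwise (diagonal) state transition, whereas ConvS5 replaces the multiplication by a convolution; the scan itself is run sequentially here purely to check the operator's correctness.

```python
import numpy as np

def combine(e1, e2):
    """Associative operator for the linear recurrence x_t = a_t * x_{t-1} + b_t.

    Composing (a1, b1) then (a2, b2) yields the single affine map
    x -> a2*a1*x + a2*b1 + b2, which is what makes a parallel scan applicable.
    """
    a1, b1 = e1
    a2, b2 = e2
    return (a2 * a1, a2 * b1 + b2)

def prefix_scan(elems):
    """Inclusive prefix scan with `combine`. Run sequentially here for clarity;
    associativity means a parallel scan (e.g. jax.lax.associative_scan)
    computes the same prefixes in O(log T) depth."""
    out = [elems[0]]
    for e in elems[1:]:
        out.append(combine(out[-1], e))
    return out

rng = np.random.default_rng(0)
T, d = 16, 4
a = rng.uniform(0.5, 0.9, size=(T, d))   # elementwise (diagonal) transition -- simplifying assumption
b = rng.normal(size=(T, d))              # input contribution B u_t at each step

# Reference: the plain sequential recurrence from x_0 = 0
x = np.zeros(d)
seq = []
for t in range(T):
    x = a[t] * x + b[t]
    seq.append(x.copy())

# Scan-based computation of the same states
scan = prefix_scan(list(zip(a, b)))
states = [s[1] for s in scan]   # second component of each prefix is x_t
assert all(np.allclose(s, r) for s, r in zip(seq, states))
```

The second component of each scanned pair equals the recurrent state, so the recurrence can be evaluated by any parallel scan over `combine`, which is the source of the subquadratic parallelization the abstract describes.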
Related papers
- Latent Space Energy-based Neural ODEs [73.01344439786524]
This paper introduces a novel family of deep dynamical models designed to represent continuous-time sequence data.
We train the model using maximum likelihood estimation with Markov chain Monte Carlo.
Experiments on oscillating systems, videos and real-world state sequences (MuJoCo) illustrate that ODEs with the learnable energy-based prior outperform existing counterparts.
arXiv Detail & Related papers (2024-09-05T18:14:22Z)
- Machine-Learned Closure of URANS for Stably Stratified Turbulence: Connecting Physical Timescales & Data Hyperparameters of Deep Time-Series Models [0.0]
We develop time-series machine learning (ML) methods for closure modeling of the Unsteady Reynolds Averaged Navier Stokes equations.
We consider decaying SST, which is homogeneous and stably stratified by a uniform density gradient.
We find that the ratio of the timescales of the minimum information required by the ML models to accurately capture the dynamics of the SST corresponds to the Reynolds number of the flow.
arXiv Detail & Related papers (2024-04-24T18:58:00Z)
- LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory [63.41820940103348]
Self-attention mechanism's computational cost limits its practicality for long sequences.
We propose a new method called LongVQ to compress the global abstraction as a length-fixed codebook.
LongVQ effectively maintains dynamic global and local patterns, which helps address the lack of long-range dependency modeling.
arXiv Detail & Related papers (2024-04-17T08:26:34Z)
- Deep Latent State Space Models for Time-Series Generation [68.45746489575032]
We propose LS4, a generative model for sequences with latent variables evolving according to a state space ODE.
Inspired by recent deep state space models (S4), we achieve speedups by leveraging a convolutional representation of LS4.
We show that LS4 significantly outperforms previous continuous-time generative models in terms of marginal distribution, classification, and prediction scores on real-world datasets.
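The "convolutional representation" that S4-style models (and LS4) exploit can be sketched concretely: a linear SSM x_t = A x_{t-1} + B u_t, y_t = C x_t produces outputs equal to a causal convolution of the input with the kernel K_k = C A^k B. The NumPy check below uses small random matrices with an assumed stable diagonal A; the specific shapes and values are illustrative only, not drawn from the LS4 paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 3, 32
A = np.diag(rng.uniform(0.3, 0.8, n))   # stable diagonal state matrix (assumption for the sketch)
B = rng.normal(size=(n, 1))
C = rng.normal(size=(1, n))
u = rng.normal(size=T)

# Convolution kernel: K_k = C A^k B
K = np.array([(C @ np.linalg.matrix_power(A, k) @ B).item() for k in range(T)])

# Recurrent computation: x_t = A x_{t-1} + B u_t, y_t = C x_t
x = np.zeros((n, 1))
y_rec = []
for t in range(T):
    x = A @ x + B * u[t]
    y_rec.append((C @ x).item())

# Equivalent causal convolution: y_t = sum_k K_k u_{t-k}
y_conv = [sum(K[k] * u[t - k] for k in range(t + 1)) for t in range(T)]
assert np.allclose(y_rec, y_conv)
```

Because the whole output sequence is one convolution with a precomputable kernel, it can be evaluated in parallel (e.g. via FFT), which is the speedup mechanism the summary refers to.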
arXiv Detail & Related papers (2022-12-24T15:17:42Z)
- Liquid Structural State-Space Models [106.74783377913433]
Liquid-S4 achieves an average performance of 87.32% on the Long-Range Arena benchmark.
On the full raw Speech Commands recognition dataset, Liquid-S4 achieves 96.78% accuracy with a 30% reduction in parameter count compared to S4.
arXiv Detail & Related papers (2022-09-26T18:37:13Z)
- Traversing Time with Multi-Resolution Gaussian Process State-Space Models [17.42262122708566]
We propose a novel Gaussian process state-space architecture composed of multiple components, each trained on a different resolution, to model effects on different timescales.
We benchmark our novel method on semi-synthetic data and on an engine modeling task.
In both experiments, our approach compares favorably against its state-of-the-art alternatives that operate on a single time-scale only.
arXiv Detail & Related papers (2021-12-06T18:39:27Z)
- Alternating ConvLSTM: Learning Force Propagation with Alternate State Updates [29.011464047344614]
We introduce the alternating convolutional Long Short-Term Memory (Alt-ConvLSTM) that models the force propagation mechanisms in a deformable object with near-uniform material properties.
We demonstrate how this novel scheme imitates the alternate updates of the first and second-order terms in the forward method of numerical PDE solvers.
We validate our Alt-ConvLSTM on human soft tissue simulation with thousands of particles and consistent body pose changes.
arXiv Detail & Related papers (2020-06-14T06:43:33Z)
- Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.