OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive
Learning
- URL: http://arxiv.org/abs/2306.11249v2
- Date: Wed, 18 Oct 2023 00:02:52 GMT
- Title: OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive
Learning
- Authors: Cheng Tan, Siyuan Li, Zhangyang Gao, Wenfei Guan, Zedong Wang, Zicheng
Liu, Lirong Wu, Stan Z. Li
- Abstract summary: We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models.
We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectory, human motion, driving scenes, traffic flow, and weather forecasting.
We find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models.
- Score: 67.07363529640784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatio-temporal predictive learning is a learning paradigm that enables
models to learn spatial and temporal patterns by predicting future frames from
given past frames in an unsupervised manner. Despite remarkable progress in
recent years, a lack of systematic understanding persists due to the diverse
settings, complex implementation, and difficult reproducibility. Without
standardization, comparisons can be unfair and insights inconclusive. To
address this dilemma, we propose OpenSTL, a comprehensive benchmark for
spatio-temporal predictive learning that categorizes prevalent approaches into
recurrent-based and recurrent-free models. OpenSTL provides a modular and
extensible framework implementing various state-of-the-art methods. We conduct
standard evaluations on datasets across various domains, including synthetic
moving object trajectory, human motion, driving scenes, traffic flow and
weather forecasting. Based on our observations, we provide a detailed analysis
of how model architecture and dataset properties affect spatio-temporal
predictive learning performance. Surprisingly, we find that recurrent-free
models achieve a better balance between efficiency and performance than recurrent
models. Thus, we further extend the common MetaFormers to boost recurrent-free
spatio-temporal predictive learning. We open-source the code and models at
https://github.com/chengtan9907/OpenSTL.
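To make the recurrent-based versus recurrent-free distinction concrete, below is a minimal PyTorch sketch of both families. It is an illustrative assumption, not OpenSTL's actual API: the class names (ConvGRUCell, RecurrentPredictor, RecurrentFreePredictor) and layer choices are hypothetical stand-ins for the ConvLSTM/PredRNN-style and SimVP/MetaFormer-style models the benchmark covers.

```python
# Illustrative sketch only (not OpenSTL's API): contrasts the two model families
# the benchmark categorizes. A recurrent-based predictor rolls a ConvRNN cell
# forward frame by frame; a recurrent-free predictor encodes the whole input clip
# at once by folding time into the channel dimension.
import torch
import torch.nn as nn


class ConvGRUCell(nn.Module):
    """Simple convolutional GRU cell (hypothetical stand-in for ConvLSTM/PredRNN cells)."""

    def __init__(self, channels: int, hidden: int, kernel: int = 3):
        super().__init__()
        pad = kernel // 2
        self.gates = nn.Conv2d(channels + hidden, 2 * hidden, kernel, padding=pad)
        self.cand = nn.Conv2d(channels + hidden, hidden, kernel, padding=pad)

    def forward(self, x, h):
        zr = torch.sigmoid(self.gates(torch.cat([x, h], dim=1)))
        z, r = zr.chunk(2, dim=1)
        n = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * n


class RecurrentPredictor(nn.Module):
    """Recurrent-based: unrolls one step per frame, then decodes future frames autoregressively."""

    def __init__(self, channels=1, hidden=32, horizon=10):
        super().__init__()
        self.cell = ConvGRUCell(channels, hidden)
        self.decoder = nn.Conv2d(hidden, channels, 1)
        self.hidden, self.horizon = hidden, horizon

    def forward(self, frames):                 # frames: (B, T_in, C, H, W)
        b, t, c, h, w = frames.shape
        state = frames.new_zeros(b, self.hidden, h, w)
        for i in range(t):                     # consume the observed frames
            state = self.cell(frames[:, i], state)
        outputs, last = [], frames[:, -1]
        for _ in range(self.horizon):          # roll out the future frame by frame
            state = self.cell(last, state)
            last = self.decoder(state)
            outputs.append(last)
        return torch.stack(outputs, dim=1)


class RecurrentFreePredictor(nn.Module):
    """Recurrent-free: folds time into channels and predicts all future frames in one pass."""

    def __init__(self, channels=1, t_in=10, t_out=10, width=64):
        super().__init__()
        self.t_out, self.c = t_out, channels
        self.net = nn.Sequential(
            nn.Conv2d(t_in * channels, width, 3, padding=1), nn.GELU(),
            nn.Conv2d(width, width, 3, padding=1), nn.GELU(),
            nn.Conv2d(width, t_out * channels, 3, padding=1),
        )

    def forward(self, frames):                 # frames: (B, T_in, C, H, W)
        b, t, c, h, w = frames.shape
        out = self.net(frames.reshape(b, t * c, h, w))
        return out.reshape(b, self.t_out, c, h, w)


if __name__ == "__main__":
    clip = torch.randn(2, 10, 1, 32, 32)       # e.g. a Moving MNIST-sized input clip
    print(RecurrentPredictor()(clip).shape)     # torch.Size([2, 10, 1, 32, 32])
    print(RecurrentFreePredictor()(clip).shape) # torch.Size([2, 10, 1, 32, 32])
```

The recurrent model pays a sequential per-frame cost at both training and inference time, whereas the recurrent-free model handles the whole clip in a single feed-forward pass; this is the efficiency-versus-performance balance the abstract highlights and the motivation for extending MetaFormer-style blocks in the recurrent-free setting.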
Related papers
- STLight: a Fully Convolutional Approach for Efficient Predictive Learning by Spatio-Temporal joint Processing [6.872340834265972]
We propose STLight, a novel method for spatio-temporal learning that relies solely on channel-wise and depth-wise convolutions as learnable layers.
STLight overcomes the limitations of traditional convolutional approaches by rearranging spatial and temporal dimensions together.
Our architecture achieves state-of-the-art performance on STL benchmarks across datasets and settings, while significantly improving computational efficiency in terms of parameters and FLOPs.
arXiv Detail & Related papers (2024-11-15T13:53:19Z)
- Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting [16.782154479264126]
Predicting spatio-temporal traffic flow presents challenges due to complex interactions between spatial and temporal factors.
Existing approaches address these dimensions in isolation, neglecting their critical interdependencies.
In this paper, we introduce the Adaptive Spatio-Temporal Unitized Cell (ASTUC), a unified framework designed to capture both spatial and temporal dependencies.
arXiv Detail & Related papers (2024-11-14T07:34:31Z)
- Multi-Modality Spatio-Temporal Forecasting via Self-Supervised Learning [11.19088022423885]
We propose a novel multi-modality spatio-temporal (MoST) learning framework via self-supervised learning, namely MoSSL.
Results on two real-world MoST datasets verify the superiority of our approach compared with the state-of-the-art baselines.
arXiv Detail & Related papers (2024-05-06T08:24:06Z)
- Revisiting the Temporal Modeling in Spatio-Temporal Predictive Learning under A Unified View [73.73667848619343]
We introduce USTEP (Unified Spatio-TEmporal Predictive learning), an innovative framework that reconciles recurrent-based and recurrent-free methods by integrating both micro-temporal and macro-temporal scales.
arXiv Detail & Related papers (2023-10-09T16:17:42Z)
- Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning [59.26623999209235]
We present DiST, which disentangles the learning of spatial and temporal aspects of videos.
The disentangled learning in DiST is highly efficient because it avoids back-propagation through the massive pre-trained parameters.
Extensive experiments on five benchmarks show that DiST outperforms existing state-of-the-art methods by convincing margins.
arXiv Detail & Related papers (2023-09-14T17:58:33Z)
- Leaping Into Memories: Space-Time Deep Feature Synthesis [93.10032043225362]
We propose LEAPS, an architecture-independent method for synthesizing videos from internal models.
We quantitatively and qualitatively evaluate the applicability of LEAPS by inverting a range of convolutional and attention-based architectures pre-trained on Kinetics-400.
arXiv Detail & Related papers (2023-03-17T12:55:22Z)
- TempSAL -- Uncovering Temporal Information for Deep Saliency Prediction [64.63645677568384]
We introduce a novel saliency prediction model that learns to output saliency maps in sequential time intervals.
Our approach locally modulates the saliency predictions by combining the learned temporal maps.
Our code will be publicly available on GitHub.
arXiv Detail & Related papers (2023-01-05T22:10:16Z)
- Generative Temporal Difference Learning for Infinite-Horizon Prediction [101.59882753763888]
We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon (a brief formulation is sketched after this list).
We discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors.
arXiv Detail & Related papers (2020-10-27T17:54:12Z)
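The $\gamma$-model entry above is terse; as a hedged sketch of the idea it refers to (reconstructed from that paper's standard formulation, with notation chosen here rather than taken from the summary), the "infinite probabilistic horizon" means the model approximates the discounted state-occupancy distribution rather than a single next step, and training bootstraps a TD-style target:

```latex
% Discounted state-occupancy distribution the gamma-model approximates
% (hedged reconstruction; notation is ours, not the summary's):
\mu_\theta(s_e \mid s_t, a_t) \approx
  (1-\gamma) \sum_{\Delta=1}^{\infty} \gamma^{\Delta-1}\, p(s_{t+\Delta} = s_e \mid s_t, a_t)

% TD-style bootstrapped target: one real environment step mixed with a model sample,
% which is where the training-time vs. testing-time compounding-error tradeoff arises.
p_{\mathrm{target}}(s_e \mid s, a) =
  (1-\gamma)\, p(s_e \mid s, a)
  + \gamma\, \mathbb{E}_{s' \sim p(\cdot \mid s, a),\; a' \sim \pi(\cdot \mid s')}
      \big[ \mu_\theta(s_e \mid s', a') \big]
```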