A Comprehensive Study on Temporal Modeling for Online Action Detection
- URL: http://arxiv.org/abs/2001.07501v1
- Date: Tue, 21 Jan 2020 13:12:58 GMT
- Title: A Comprehensive Study on Temporal Modeling for Online Action Detection
- Authors: Wen Wang, Xiaojiang Peng, Yu Qiao, Jian Cheng
- Abstract summary: Online action detection (OAD) is a practical yet challenging task, which has attracted increasing attention in recent years.
This paper aims to provide a comprehensive study on temporal modeling for OAD including four meta types of temporal modeling methods.
We present several hybrid temporal modeling methods, which outperform the recent state-of-the-art methods with sizable margins on THUMOS-14 and TVSeries.
- Score: 50.558313106389335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online action detection (OAD) is a practical yet challenging task, which has
attracted increasing attention in recent years. A typical OAD system mainly
consists of three modules: a frame-level feature extractor which is usually
based on pre-trained deep Convolutional Neural Networks (CNNs), a temporal
modeling module, and an action classifier. Among them, the temporal modeling
module is crucial, as it aggregates discriminative information from historical
and current features. Though many temporal modeling methods have been developed
for OAD and other topics, their effects have not been fairly investigated on
OAD. This paper aims to provide a comprehensive study on temporal modeling
for OAD including four meta types of temporal modeling methods, i.e., temporal
pooling, temporal convolution, recurrent neural networks, and temporal
attention, and uncover some good practices to produce a state-of-the-art OAD
system. Many of them are explored in OAD for the first time, and extensively
evaluated with various hyperparameters. Furthermore, based on our
comprehensive study, we present several hybrid temporal modeling methods, which
outperform the recent state-of-the-art methods with sizable margins on
THUMOS-14 and TVSeries.
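The three-module pipeline described in the abstract (per-frame features, a temporal modeling module, and a classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sliding-window size, the use of the current frame as the attention query, and the linear classifier are all assumptions made for illustration, and only two of the four meta types (temporal pooling and temporal attention) are shown. Frame features are taken as already extracted.

```python
import math


def temporal_avg_pool(window):
    """Temporal pooling: average the feature vectors in the window."""
    length, dim = len(window), len(window[0])
    return [sum(frame[d] for frame in window) / length for d in range(dim)]


def temporal_attention(window):
    """Temporal attention (assumed form): weight every frame in the window by
    the softmax of its scaled dot product with the current (last) frame."""
    query = window[-1]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, frame)) / scale for frame in window]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(query)
    pooled = [sum(w * frame[d] for w, frame in zip(weights, window)) for d in range(dim)]
    return pooled, weights


def classify(feature, class_weights):
    """Hypothetical linear classifier: index of the highest-scoring class."""
    scores = [sum(w * x for w, x in zip(row, feature)) for row in class_weights]
    return max(range(len(scores)), key=scores.__getitem__)


def online_detect(frame_features, class_weights, window_size=4, use_attention=False):
    """Online setting: the prediction at time t only sees frames up to t."""
    predictions = []
    for t in range(len(frame_features)):
        window = frame_features[max(0, t - window_size + 1): t + 1]
        if use_attention:
            aggregated, _ = temporal_attention(window)
        else:
            aggregated = temporal_avg_pool(window)
        predictions.append(classify(aggregated, class_weights))
    return predictions


if __name__ == "__main__":
    # Toy 2-D features: the first frames resemble class 0, the last class 1.
    frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    weights = [[1.0, 0.0], [0.0, 1.0]]
    print(online_detect(frames, weights, window_size=2))
    print(online_detect(frames, weights, window_size=2, use_attention=True))
```

The key constraint the sketch preserves is causality: each prediction uses only historical and current features, never future frames. Temporal convolution and recurrent networks would slot into `online_detect` in the same place as the two aggregators shown here.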
Related papers
- Joint Selective State Space Model and Detrending for Robust Time Series Anomaly Detection [25.60381244912307]
Deep learning-based sequence models are extensively employed in Time Series Anomaly Detection tasks.
The ability of TSAD is limited by two key challenges: (i) the ability to model long-range dependency and (ii) the generalization issue in the presence of non-stationary data.
arXiv Detail & Related papers (2024-05-30T08:31:18Z)
- Multi-Modality Spatio-Temporal Forecasting via Self-Supervised Learning [11.19088022423885]
We propose a novel MoST learning framework via Self-Supervised Learning, namely MoSSL.
Results on two real-world MoST datasets verify the superiority of our approach compared with the state-of-the-art baselines.
arXiv Detail & Related papers (2024-05-06T08:24:06Z)
- Revisiting the Temporal Modeling in Spatio-Temporal Predictive Learning under A Unified View [73.73667848619343]
We introduce USTEP (Unified Spatio-TEmporal Predictive learning), an innovative framework that reconciles the recurrent-based and recurrent-free methods by integrating both micro-temporal and macro-temporal scales.
arXiv Detail & Related papers (2023-10-09T16:17:42Z)
- A Neural PDE Solver with Temporal Stencil Modeling [44.97241931708181]
Recent Machine Learning (ML) models have shown new promises in capturing important dynamics in high-resolution signals.
This study shows that significant information is often lost in the low-resolution down-sampled features.
We propose a new approach, which combines the strengths of advanced time-series sequence modeling and state-of-the-art neural PDE solvers.
arXiv Detail & Related papers (2023-02-16T06:13:01Z)
- A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)
- Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted the attention of the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z)
- Temporal Relevance Analysis for Video Action Models [70.39411261685963]
We first propose a new approach to quantify the temporal relationships between frames captured by CNN-based action models.
We then conduct comprehensive experiments and in-depth analysis to provide a better understanding of how temporal modeling is affected.
arXiv Detail & Related papers (2022-04-25T19:06:48Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.