Dynamic Spatial-temporal Hypergraph Convolutional Network for
Skeleton-based Action Recognition
- URL: http://arxiv.org/abs/2302.08689v1
- Date: Fri, 17 Feb 2023 04:42:19 GMT
- Title: Dynamic Spatial-temporal Hypergraph Convolutional Network for
Skeleton-based Action Recognition
- Authors: Shengqin Wang, Yongji Zhang, Hong Qi, Minghao Zhao, Yu Jiang
- Abstract summary: Skeleton-based action recognition relies on the extraction of spatial-temporal topological information.
This paper proposes a dynamic spatial-temporal hypergraph convolutional network (DST-HCN) to capture spatial-temporal information for skeleton-based action recognition.
- Score: 4.738525281379023
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Skeleton-based action recognition relies on the extraction of
spatial-temporal topological information. Hypergraphs can establish prior
unnatural dependencies for the skeleton. However, the existing methods only
focus on the construction of spatial topology and ignore the time-point
dependence. This paper proposes a dynamic spatial-temporal hypergraph
convolutional network (DST-HCN) to capture spatial-temporal information for
skeleton-based action recognition. DST-HCN introduces a time-point hypergraph
(TPH) to learn relationships at time points. With multiple spatial static
hypergraphs and dynamic TPH, our network can learn more complete
spatial-temporal features. In addition, we use the high-order information
fusion module (HIF) to fuse spatial-temporal information synchronously.
Extensive experiments on NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets show
that our model achieves state-of-the-art performance, especially compared with hypergraph
methods.
Related papers
- TPP-Gaze: Modelling Gaze Dynamics in Space and Time with Neural Temporal Point Processes [63.95928298690001]
We present TPP-Gaze, a novel and principled approach to model scanpath dynamics based on Neural Temporal Point Process (TPP).
Our results show the overall superior performance of the proposed model compared to state-of-the-art approaches.
arXiv Detail & Related papers (2024-10-30T19:22:38Z)
- Multi-Scale Spatial-Temporal Self-Attention Graph Convolutional Networks for Skeleton-based Action Recognition [0.0]
In this paper, we propose a self-attention GCN hybrid model, Multi-Scale Spatial-Temporal self-attention (MSST)-GCN.
We utilize a spatial self-attention module with adaptive topology to understand intra-frame interactions among different body parts, and a temporal self-attention module to examine correlations between frames of a node.
arXiv Detail & Related papers (2024-04-03T10:25:45Z)
- Exploiting Spatial-temporal Data for Sleep Stage Classification via Hypergraph Learning [16.802013781690402]
We propose a dynamic learning framework, STHL, which introduces hypergraphs to encode spatial-temporal data for sleep stage classification.
Our proposed STHL outperforms the state-of-the-art models in sleep stage classification tasks.
arXiv Detail & Related papers (2023-09-05T11:01:30Z)
- Spatio-Temporal Branching for Motion Prediction using Motion Increments [55.68088298632865]
Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications.
Traditional methods rely on hand-crafted features and machine learning techniques.
We propose a novel spatio-temporal branching network using incremental information for HMP.
arXiv Detail & Related papers (2023-08-02T12:04:28Z)
- An Adaptive Federated Relevance Framework for Spatial Temporal Graph Learning [14.353798949041698]
We propose an adaptive federated relevance framework, namely FedRel, for spatial-temporal graph learning.
The core Dynamic Inter-Intra Graph (DIIG) module in the framework is able to use these features to generate the spatial-temporal graphs.
To improve the model generalization ability and performance while preserving the local data privacy, we also design a relevance-driven federated learning module.
arXiv Detail & Related papers (2022-06-07T16:12:17Z)
- Skeleton-based Action Recognition via Temporal-Channel Aggregation [5.620303498964992]
We propose Temporal-Channel Aggregation Graph Convolutional Networks (TCA-GCN) to learn spatial and temporal topologies.
In addition, we extract multi-scale skeletal temporal modeling and fuse them with priori skeletal knowledge with an attention mechanism.
arXiv Detail & Related papers (2022-05-31T16:28:30Z)
- Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and physical connections of the human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z)
- Temporal Graph Modeling for Skeleton-based Action Recognition [25.788239844759246]
We propose a Temporal Enhanced Graph Convolutional Network (TE-GCN) to capture complex temporal dynamics.
The constructed temporal relation graph explicitly builds connections between semantically related temporal features.
Experiments are performed on two widely used large-scale datasets.
arXiv Detail & Related papers (2020-12-16T09:02:47Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method outperforms state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- On the spatial attention in Spatio-Temporal Graph Convolutional Networks for skeleton-based human action recognition [97.14064057840089]
Graph convolutional networks (GCNs) show promising performance in skeleton-based human action recognition by modeling a sequence of skeletons as a graph.
Most of the recently proposed GCN-based methods improve the performance by learning the graph structure at each layer of the network.
arXiv Detail & Related papers (2020-11-07T19:03:04Z)
- Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition [79.33539539956186]
We propose a simple method to disentangle multi-scale graph convolutions and a unified spatial-temporal graph convolutional operator named G3D.
By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model outperforms previous state-of-the-art methods on three large-scale datasets.
arXiv Detail & Related papers (2020-03-31T11:28:25Z)