GCN-DevLSTM: Path Development for Skeleton-Based Action Recognition
- URL: http://arxiv.org/abs/2403.15212v2
- Date: Sun, 26 May 2024 19:21:58 GMT
- Title: GCN-DevLSTM: Path Development for Skeleton-Based Action Recognition
- Authors: Lei Jiang, Weixin Yang, Xin Zhang, Hao Ni,
- Abstract summary: Skeleton-based action recognition in videos is an important but challenging task in computer vision.
We propose a principled and parsimonious representation for sequential data by leveraging the Lie group structure.
Our proposed GCN-DevLSTM network consistently improves the strong GCN baseline models and achieves SOTA results with superior robustness in SAR tasks.
- Score: 10.562869805151411
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Skeleton-based action recognition (SAR) in videos is an important but challenging task in computer vision. The recent state-of-the-art (SOTA) models for SAR are primarily based on graph convolutional neural networks (GCNs), which are powerful in extracting the spatial information of skeleton data. However, it is yet clear that such GCN-based models can effectively capture the temporal dynamics of human action sequences. To this end, we propose the G-Dev layer, which exploits the path development -- a principled and parsimonious representation for sequential data by leveraging the Lie group structure. By integrating the G-Dev layer, the hybrid G-DevLSTM module enhances the traditional LSTM to reduce the time dimension while retaining high-frequency information. It can be conveniently applied to any temporal graph data, complementing existing advanced GCN-based models. Our empirical studies on the NTU60, NTU120 and Chalearn2013 datasets demonstrate that our proposed GCN-DevLSTM network consistently improves the strong GCN baseline models and achieves SOTA results with superior robustness in SAR tasks. The code is available at https://github.com/DeepIntoStreams/GCN-DevLSTM.
Related papers
- DTFormer: A Transformer-Based Method for Discrete-Time Dynamic Graph Representation Learning [38.53424185696828]
The representation learning of Discrete-Time Dynamic Graphs (DTDGs) has been extensively applied to model the dynamics of temporally changing entities and their evolving connections.
This paper introduces a novel representation learning method DTFormer for DTDGs, pivoting from the traditional GNN+RNN framework to a Transformer-based architecture.
arXiv Detail & Related papers (2024-07-26T05:46:23Z) - A Generative Self-Supervised Framework using Functional Connectivity in
fMRI Data [15.211387244155725]
Deep neural networks trained on Functional Connectivity (FC) networks extracted from functional Magnetic Resonance Imaging (fMRI) data have gained popularity.
Recent research on the application of Graph Neural Network (GNN) to FC suggests that exploiting the time-varying properties of the FC could significantly improve the accuracy and interpretability of the model prediction.
High cost of acquiring high-quality fMRI data and corresponding labels poses a hurdle to their application in real-world settings.
We propose a generative SSL approach that is tailored to effectively harnesstemporal information within dynamic FC.
arXiv Detail & Related papers (2023-12-04T16:14:43Z) - DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action
Recognition [77.87404524458809]
We propose a new framework for skeleton-based action recognition, namely Dynamic Group Spatio-Temporal GCN (DG-STGCN)
It consists of two modules, DG-GCN and DG-TCN, respectively, for spatial and temporal modeling.
DG-STGCN consistently outperforms state-of-the-art methods, often by a notable margin.
arXiv Detail & Related papers (2022-10-12T03:17:37Z) - Pose-Guided Graph Convolutional Networks for Skeleton-Based Action
Recognition [32.07659338674024]
Graph convolutional networks (GCNs) can model the human body skeletons as spatial and temporal graphs.
In this work, we propose pose-guided GCN (PG-GCN), a multi-modal framework for high-performance human action recognition.
The core idea of this module is to utilize a trainable graph to aggregate features from the skeleton stream with that of the pose stream, which leads to a network with more robust feature representation ability.
arXiv Detail & Related papers (2022-10-10T02:08:49Z) - Multi-Scale Semantics-Guided Neural Networks for Efficient
Skeleton-Based Human Action Recognition [140.18376685167857]
A simple yet effective multi-scale semantics-guided neural network is proposed for skeleton-based action recognition.
MS-SGN achieves the state-of-the-art performance on the NTU60, NTU120, and SYSU datasets.
arXiv Detail & Related papers (2021-11-07T03:50:50Z) - PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive
Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - Spatio-Temporal Inception Graph Convolutional Networks for
Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z) - Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text
Generation [56.73834525802723]
Lightweight Dynamic Graph Convolutional Networks (LDGCNs) are proposed.
LDGCNs capture richer non-local interactions by synthesizing higher order information from the input graphs.
We develop two novel parameter saving strategies based on the group graph convolutions and weight tied convolutions to reduce memory usage and model complexity.
arXiv Detail & Related papers (2020-10-09T06:03:46Z) - Data-Driven Learning of Geometric Scattering Networks [74.3283600072357]
We propose a new graph neural network (GNN) module based on relaxations of recently proposed geometric scattering transforms.
Our learnable geometric scattering (LEGS) module enables adaptive tuning of the wavelets to encourage band-pass features to emerge in learned representations.
arXiv Detail & Related papers (2020-10-06T01:20:27Z) - Feedback Graph Convolutional Network for Skeleton-based Action
Recognition [38.782491442635205]
We propose a novel network, named Feedback Graph Convolutional Network (FGCN)
This is the first work that introduces the feedback mechanism into GCNs and action recognition.
It achieves the state-of-the-art performance on three datasets.
arXiv Detail & Related papers (2020-03-17T07:20:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.