Richly Activated Graph Convolutional Network for Robust Skeleton-based Action Recognition
- URL: http://arxiv.org/abs/2008.03791v2
- Date: Thu, 26 Nov 2020 02:07:30 GMT
- Title: Richly Activated Graph Convolutional Network for Robust Skeleton-based Action Recognition
- Authors: Yi-Fan Song, Zhang Zhang, Caifeng Shan, Liang Wang
- Abstract summary: A multi-stream graph convolutional network (GCN) is proposed to explore sufficient discriminative features spread over all skeleton joints.
The RA-GCN achieves comparable performance on the standard NTU RGB+D 60 and 120 datasets.
- Score: 22.90127409366107
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current methods for skeleton-based human action recognition usually work with
complete skeletons. However, in real scenarios, it is inevitable to capture
incomplete or noisy skeletons, which could significantly deteriorate the
performance of current methods when some informative joints are occluded or
disturbed. To improve the robustness of action recognition models, a
multi-stream graph convolutional network (GCN) is proposed to explore
sufficient discriminative features spreading over all skeleton joints, so that
the distributed redundant representation reduces the sensitivity of the action
models to non-standard skeletons. Concretely, the backbone GCN is extended by a
series of ordered streams, which are responsible for learning discriminative
features from the joints less activated by preceding streams. Here, the
activation degrees of skeleton joints of each GCN stream are measured by the
class activation maps (CAM), and only the information from the unactivated
joints will be passed to the next stream, by which rich features over all
active joints are obtained. Thus, the proposed method is termed richly
activated GCN (RA-GCN). Compared to the state-of-the-art (SOTA) methods, the
RA-GCN achieves comparable performance on the standard NTU RGB+D 60 and 120
datasets. More crucially, on the synthetic occlusion and jittering datasets,
the performance deterioration due to the occluded and disturbed joints can be
significantly alleviated by utilizing the proposed RA-GCN.
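The abstract above describes the mechanism only at a high level. The following minimal PyTorch-style sketch illustrates one way the CAM-driven masking between streams could look: each stream sees only the joints that earlier streams did not activate strongly, so discriminative features are forced to spread over all joints. The GCNStream backbone, tensor shapes, fixed activation threshold, and logit summation are placeholder assumptions for illustration, not the authors' released RA-GCN implementation.

```python
# Minimal sketch (PyTorch assumed) of the CAM-driven multi-stream idea from the abstract.
# The GCNStream backbone, shapes, and the fixed threshold are placeholder assumptions.
import torch
import torch.nn as nn


class GCNStream(nn.Module):
    """Stand-in for one ST-GCN-style stream: (N, C, T, V) -> logits + per-joint features."""

    def __init__(self, in_channels=3, feat_channels=64, num_classes=60):
        super().__init__()
        self.embed = nn.Conv2d(in_channels, feat_channels, kernel_size=1)
        self.classifier = nn.Linear(feat_channels, num_classes)

    def forward(self, x):                      # x: (N, C, T, V)
        feat = torch.relu(self.embed(x))       # (N, F, T, V)
        pooled = feat.mean(dim=(2, 3))         # global average pool -> (N, F)
        return self.classifier(pooled), feat

    def joint_cam(self, feat, class_idx):
        """Class-activation-style map per joint: weight features by classifier weights."""
        w = self.classifier.weight[class_idx]          # (N, F)
        cam = torch.einsum('nf,nftv->ntv', w, feat)    # (N, T, V)
        return cam.mean(dim=1)                         # average over time -> (N, V)


class MultiStreamRAGCNSketch(nn.Module):
    def __init__(self, num_streams=3, num_classes=60, threshold=0.5):
        super().__init__()
        self.streams = nn.ModuleList(
            [GCNStream(num_classes=num_classes) for _ in range(num_streams)])
        self.threshold = threshold  # assumed fixed cut-off on normalized CAM values

    def forward(self, x):                      # x: (N, C, T, V)
        # Start with all joints visible; mask is (N, V).
        mask = torch.ones(x.size(0), x.size(3), device=x.device)
        logits_sum = 0
        for stream in self.streams:
            masked = x * mask[:, None, None, :]        # zero out already-activated joints
            logits, feat = stream(masked)
            logits_sum = logits_sum + logits
            # Mark the joints this stream relied on; pass only the rest onward.
            cam = stream.joint_cam(feat, logits.argmax(dim=1))          # (N, V)
            cam = (cam - cam.amin(dim=1, keepdim=True)) / \
                  (cam.amax(dim=1, keepdim=True) - cam.amin(dim=1, keepdim=True) + 1e-6)
            mask = mask * (cam < self.threshold).float()
        return logits_sum


# Usage on a dummy batch: 8 skeleton clips, 3 coordinates, 30 frames, 25 joints (NTU layout).
if __name__ == "__main__":
    model = MultiStreamRAGCNSketch()
    clips = torch.randn(8, 3, 30, 25)
    print(model(clips).shape)  # torch.Size([8, 60])
```

In this sketch the streams' logits are simply summed for the final prediction and a fixed threshold decides which joints count as activated; both choices are placeholders that a faithful reproduction would replace with the paper's actual fusion and masking scheme.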
Related papers
- DA-Flow: Dual Attention Normalizing Flow for Skeleton-based Video Anomaly Detection [52.74152717667157]
We propose a lightweight module called Dual Attention Module (DAM) for capturing cross-dimension interaction relationships in spatio-temporal skeletal data.
It employs the frame attention mechanism to identify the most significant frames and the skeleton attention mechanism to capture broader relationships across fixed partitions with minimal parameters and FLOPs.
arXiv Detail & Related papers (2024-06-05T06:18:03Z)
- Multi-Dimensional Refinement Graph Convolutional Network with Robust Decouple Loss for Fine-Grained Skeleton-Based Action Recognition [19.031036881780107]
We propose a flexible attention block called Channel-Variable Spatial-Temporal Attention (CVSTA) to enhance the discriminative power of spatial-temporal joints.
Based on CVSTA, we construct a Multi-Dimensional Refinement Graph Convolutional Network (MDR-GCN), which can improve the discrimination among channel-, joint- and frame-level features.
Furthermore, we propose a Robust Decouple Loss (RDL), which significantly boosts the effect of the CVSTA and reduces the impact of noise.
arXiv Detail & Related papers (2023-06-27T09:23:36Z)
- Overcoming Topology Agnosticism: Enhancing Skeleton-Based Action Recognition through Redefined Skeletal Topology Awareness [24.83836008577395]
Graph Convolutional Networks (GCNs) have long defined the state-of-the-art in skeleton-based action recognition.
They tend to optimize the adjacency matrix jointly with the model weights.
This process causes a gradual decay of bone connectivity data, culminating in a model indifferent to the very topology it sought to map.
We propose an innovative pathway that encodes bone connectivity by harnessing the power of graph distances (see the sketch after this list).
arXiv Detail & Related papers (2023-05-19T06:40:12Z)
- DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition [77.87404524458809]
We propose a new framework for skeleton-based action recognition, namely Dynamic Group Spatio-Temporal GCN (DG-STGCN).
It consists of two modules, DG-GCN and DG-TCN, respectively, for spatial and temporal modeling.
DG-STGCN consistently outperforms state-of-the-art methods, often by a notable margin.
arXiv Detail & Related papers (2022-10-12T03:17:37Z)
- Pose-Guided Graph Convolutional Networks for Skeleton-Based Action Recognition [32.07659338674024]
Graph convolutional networks (GCNs) can model the human body skeletons as spatial and temporal graphs.
In this work, we propose pose-guided GCN (PG-GCN), a multi-modal framework for high-performance human action recognition.
The core idea of this module is to utilize a trainable graph to aggregate features from the skeleton stream with those of the pose stream, which leads to a network with more robust feature representation ability.
arXiv Detail & Related papers (2022-10-10T02:08:49Z)
- Skeleton-based Action Recognition via Adaptive Cross-Form Learning [75.92422282666767]
Skeleton-based action recognition aims to project skeleton sequences to action categories, where sequences are derived from multiple forms of pre-detected points.
Existing methods tend to improve GCNs by leveraging multi-form skeletons due to their complementary cues.
We present Adaptive Cross-Form Learning (ACFL), which empowers well-designed GCNs to generate complementary representation from single-form skeletons.
arXiv Detail & Related papers (2022-06-30T07:40:03Z)
- SpatioTemporal Focus for Skeleton-based Action Recognition [66.8571926307011]
Graph convolutional networks (GCNs) are widely adopted in skeleton-based action recognition.
We argue that the performance of recently proposed skeleton-based action recognition methods is limited by several factors.
Inspired by recent attention mechanisms, we propose a multi-grain contextual focus module, termed MCF, to capture the action-associated relation information.
arXiv Detail & Related papers (2022-03-31T02:45:24Z)
- Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition [65.78703941973183]
We propose a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder.
Specifically, the CD-JBF-GCN can explore the motion transmission between the joint stream and the bone stream.
The pose prediction based auto-encoder in the self-supervised training stage allows the network to learn motion representation from unlabeled data.
arXiv Detail & Related papers (2022-02-08T16:03:15Z)
- JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition [47.47099206295254]
We propose a novel framework for employing human pose skeleton and joint-centered light-weight information jointly in a two-stream graph convolutional network.
Compared to the pure skeleton-based baseline, this hybrid scheme effectively boosts performance, while keeping the computational and memory overheads low.
arXiv Detail & Related papers (2020-11-16T08:39:22Z)
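As a concrete illustration of one mechanism mentioned above, the "Overcoming Topology Agnosticism" entry encodes bone connectivity through graph distances. The sketch below computes all-pairs hop distances on a toy skeleton with breadth-first search and one-hot encodes them into distance buckets that a GCN could consume. The five-joint bone list, the bucket count, and the bucketing scheme are illustrative assumptions, not that paper's actual formulation.

```python
# Hedged sketch: encode bone connectivity via pairwise graph (hop) distances.
# The 5-joint toy skeleton, BFS distance computation, and one-hot distance
# buckets are assumptions made for illustration only.
from collections import deque

import torch

# Toy skeleton: joint index pairs describing bones (a short chain with one branch).
BONES = [(0, 1), (1, 2), (2, 3), (1, 4)]
NUM_JOINTS = 5


def hop_distances(bones, num_joints):
    """All-pairs shortest hop distances on the skeleton graph via BFS from each joint."""
    adj = [[] for _ in range(num_joints)]
    for a, b in bones:
        adj[a].append(b)
        adj[b].append(a)
    dist = torch.full((num_joints, num_joints), float('inf'))
    for src in range(num_joints):
        dist[src, src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[src, v] == float('inf'):
                    dist[src, v] = dist[src, u] + 1
                    queue.append(v)
    return dist


def distance_buckets(dist, max_hops=3):
    """One-hot encode each joint pair by (clipped) hop distance -> (max_hops+1, V, V)."""
    clipped = dist.clamp(max=max_hops).long()  # distances beyond max_hops share a bucket
    return torch.nn.functional.one_hot(clipped, max_hops + 1).permute(2, 0, 1).float()


if __name__ == "__main__":
    d = hop_distances(BONES, NUM_JOINTS)
    print(d)          # e.g. d[0, 3] == 3 hops along the chain
    enc = distance_buckets(d)
    print(enc.shape)  # torch.Size([4, 5, 5]); each slice is one distance bucket
```

A GCN built on such an encoding could, for instance, learn a separate mixing weight per distance bucket so that joints at different skeletal distances contribute through distinct relations; whether the cited work does exactly this cannot be determined from the summary above.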
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.