Overcoming Topology Agnosticism: Enhancing Skeleton-Based Action
Recognition through Redefined Skeletal Topology Awareness
- URL: http://arxiv.org/abs/2305.11468v3
- Date: Mon, 4 Mar 2024 13:29:18 GMT
- Title: Overcoming Topology Agnosticism: Enhancing Skeleton-Based Action
Recognition through Redefined Skeletal Topology Awareness
- Authors: Yuxuan Zhou, Zhi-Qi Cheng, Jun-Yan He, Bin Luo, Yifeng Geng, Xuansong
Xie
- Abstract summary: Graph Convolutional Networks (GCNs) have long defined the state-of-the-art in skeleton-based action recognition.
They tend to optimize the adjacency matrix jointly with the model weights.
This process causes a gradual decay of bone connectivity data, culminating in a model indifferent to the very topology it sought to map.
We propose an innovative pathway that encodes bone connectivity by harnessing the power of graph distances.
- Score: 24.83836008577395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Convolutional Networks (GCNs) have long defined the state-of-the-art in
skeleton-based action recognition, leveraging their ability to unravel the
complex dynamics of human joint topology through the graph's adjacency matrix.
However, an inherent flaw has come to light in these cutting-edge models: they
tend to optimize the adjacency matrix jointly with the model weights. This
process, while seemingly efficient, causes a gradual decay of bone connectivity
data, culminating in a model indifferent to the very topology it sought to map.
As a remedy, we propose a threefold strategy: (1) We forge an innovative
pathway that encodes bone connectivity by harnessing the power of graph
distances. This approach preserves the vital topological nuances often lost in
conventional GCNs. (2) We highlight an oft-overlooked feature - the temporal
mean of a skeletal sequence, which, despite its modest guise, carries highly
action-specific information. (3) Our investigation revealed strong variations
in joint-to-joint relationships across different actions. This finding exposes
the limitations of a single adjacency matrix in capturing the variations of
relational configurations emblematic of human movement, which we remedy by
proposing an efficient refinement to Graph Convolutions (GC) - the BlockGC.
This evolution slashes parameters by a substantial margin (above 40%), while
elevating performance beyond original GCNs. Our full model, the BlockGCN,
establishes new standards in skeleton-based action recognition for small model
sizes. Its high accuracy, notably on the large-scale NTU RGB+D 120 dataset,
stands as compelling proof of the efficacy of BlockGCN.
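
The abstract names three concrete ingredients: a graph-distance (hop) encoding of bone connectivity, the temporal mean of a skeletal sequence, and a parameter-efficient BlockGC. The sketch below is a minimal illustration of these ideas, not the authors' implementation; the joint count, feature sizes, and the grouped-weight layout used for the BlockGC-style layer are hypothetical assumptions.

```python
# Minimal sketch (not the authors' code) of the three ideas named in the
# abstract: hop-distance encoding of bone connectivity, the temporal mean of
# a skeletal sequence, and a grouped ("block") graph convolution that cuts
# parameters. All shapes and the exact BlockGC layout are illustrative.
import numpy as np

def hop_distance_matrix(adjacency: np.ndarray) -> np.ndarray:
    """Shortest-path (hop) distance between every pair of joints via BFS."""
    n = adjacency.shape[0]
    dist = np.full((n, n), np.inf)
    for start in range(n):
        dist[start, start] = 0
        frontier, d = [start], 0
        while frontier:
            d += 1
            nxt = []
            for u in frontier:
                for v in np.flatnonzero(adjacency[u]):
                    if dist[start, v] == np.inf:
                        dist[start, v] = d
                        nxt.append(v)
            frontier = nxt
    return dist

def temporal_mean(sequence: np.ndarray) -> np.ndarray:
    """Average pose over time: sequence of shape (T, V, C) -> (V, C)."""
    return sequence.mean(axis=0)

def block_graph_conv(x: np.ndarray, adjacency: np.ndarray,
                     weights: np.ndarray) -> np.ndarray:
    """Grouped graph convolution: the channels are split into G groups and
    each group gets its own small weight matrix, so parameters scale with
    G * (C/G)^2 instead of C^2 -- one plausible way to realise a
    BlockGC-style saving; the paper's exact formulation may differ.
    x: (V, C), adjacency: (V, V), weights: (G, C//G, C//G)."""
    groups = np.split(x, weights.shape[0], axis=-1)          # G arrays of (V, C/G)
    out = [adjacency @ g @ w for g, w in zip(groups, weights)]
    return np.concatenate(out, axis=-1)

# Toy usage: a 5-joint chain skeleton, 16 frames, 8 channels, 4 groups.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1
seq = np.random.randn(16, 5, 8)
print(hop_distance_matrix(A))                 # (5, 5) hop distances
print(temporal_mean(seq).shape)               # (5, 8)
W = np.random.randn(4, 2, 2) * 0.1
print(block_graph_conv(seq[0], A, W).shape)   # (5, 8)
```
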
Related papers
- Topological Symmetry Enhanced Graph Convolution for Skeleton-Based Action Recognition [11.05325139231301]
Skeleton-based action recognition has achieved remarkable performance with the development of graph convolutional networks (GCNs).
We propose a novel Topological Symmetry Enhanced Graph Convolution (TSE-GC) to enable distinct topology learning across different channel partitions.
We also construct a Multi-Branch Deformable Temporal Convolution (MBDTC) for skeleton-based action recognition.
arXiv Detail & Related papers (2024-11-19T15:23:59Z)
- Hypergraph Transformer for Skeleton-based Action Recognition [21.763844802116857]
Skeleton-based action recognition aims to recognize human actions given human joint coordinates with skeletal interconnections.
Previous works successfully adopted Graph Convolutional Networks (GCNs) to model joint co-occurrences.
We propose a new self-attention (SA) mechanism on hypergraph, termed Hypergraph Self-Attention (HyperSA), to incorporate intrinsic higher-order relations into the model.
arXiv Detail & Related papers (2022-11-17T15:36:48Z)
- DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition [77.87404524458809]
We propose a new framework for skeleton-based action recognition, namely the Dynamic Group Spatio-Temporal GCN (DG-STGCN).
It consists of two modules, DG-GCN and DG-TCN, for spatial and temporal modeling, respectively.
DG-STGCN consistently outperforms state-of-the-art methods, often by a notable margin.
arXiv Detail & Related papers (2022-10-12T03:17:37Z)
- SpatioTemporal Focus for Skeleton-based Action Recognition [66.8571926307011]
Graph convolutional networks (GCNs) are widely adopted in skeleton-based action recognition.
We argue that the performance of recently proposed skeleton-based action recognition methods is limited by the following factors.
Inspired by the recent attention mechanism, we propose a multi-grain contextual focus module, termed MCF, to capture action-associated relation information.
arXiv Detail & Related papers (2022-03-31T02:45:24Z)
- Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition [49.163326827954656]
We propose a novel multi-granular spatio-temporal graph network for skeleton-based action classification.
We develop a dual-head graph network consisting of two interleaved branches, which enables us to extract features at two spatio-temporal resolutions.
We conduct extensive experiments on three large-scale datasets.
arXiv Detail & Related papers (2021-08-10T09:25:07Z)
- Multi Scale Temporal Graph Networks For Skeleton-based Action Recognition [5.970574258839858]
Graph convolutional networks (GCNs) can effectively capture the features of related nodes and improve the performance of the model.
Existing methods based on GCNs have two problems. First, the consistency of temporal and spatial features is ignored because features are extracted node by node and frame by frame.
We propose a novel model called Temporal Graph Networks (TGN) for action recognition.
arXiv Detail & Related papers (2020-12-05T08:08:25Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition [22.90127409366107]
We propose an efficient but strong baseline based on Graph Convolutional Networks (GCNs).
Inspired by the success of the ResNet architecture in Convolutional Neural Networks (CNNs), a ResGCN module is introduced into the GCN baseline.
A PartAtt block is proposed to discover the most essential body parts over a whole action sequence.
arXiv Detail & Related papers (2020-10-20T02:56:58Z)
- Structure-Aware Human-Action Generation [126.05874420893092]
Graph convolutional networks (GCNs) are a promising way to leverage structure information to learn structural representations.
We propose a variant of GCNs to leverage the powerful self-attention mechanism to adaptively sparsify a complete action graph in the temporal space.
Our method could dynamically attend to important past frames and construct a sparse graph to apply in the GCN framework, well-capturing the structure information in action sequences.
arXiv Detail & Related papers (2020-07-04T00:18:27Z)
- Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition [79.33539539956186]
We propose a simple method to disentangle multi-scale graph convolutions and a unified spatial-temporal graph convolutional operator named G3D.
By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model outperforms previous state-of-the-art methods on three large-scale datasets.
arXiv Detail & Related papers (2020-03-31T11:28:25Z)
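
Several of the papers listed above, like the main abstract, revolve around how the skeleton topology is learned. As a point of contrast with the fixed hop-distance encoding sketched earlier, the following is a minimal, hypothetical illustration of the common "learnable adjacency" pattern that the abstract critiques, in which the bone graph is stored as a trainable tensor and optimized jointly with the layer weights; the class name, shapes, and initialization are assumptions, not any listed paper's code.

```python
# Illustrative sketch (not any listed paper's code) of the pattern the
# abstract critiques: the skeleton adjacency is a learnable tensor updated
# together with the layer weights, so the original bone connectivity can
# drift away during training. Shapes and names are assumptions.
import torch
import torch.nn as nn

class LearnableAdjacencyGC(nn.Module):
    def __init__(self, num_joints: int, in_channels: int, out_channels: int,
                 bone_adjacency: torch.Tensor):
        super().__init__()
        assert bone_adjacency.shape == (num_joints, num_joints)
        # Initialized from the physical bone graph, but trained like any weight.
        self.adjacency = nn.Parameter(bone_adjacency.clone().float())
        self.proj = nn.Linear(in_channels, out_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_joints, in_channels) -> (batch, num_joints, out_channels)
        return self.proj(self.adjacency @ x)

# Toy usage: a 5-joint chain skeleton.
A = torch.zeros(5, 5)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
layer = LearnableAdjacencyGC(5, 8, 16, A)
out = layer(torch.randn(2, 5, 8))
print(out.shape)  # torch.Size([2, 5, 16])
```
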
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.