Skeleton-based Action Recognition via Adaptive Cross-Form Learning
- URL: http://arxiv.org/abs/2206.15085v1
- Date: Thu, 30 Jun 2022 07:40:03 GMT
- Title: Skeleton-based Action Recognition via Adaptive Cross-Form Learning
- Authors: Xuanhan Wang, Yan Dai, Lianli Gao, Jingkuan Song
- Abstract summary: Skeleton-based action recognition aims to project skeleton sequences to action categories, where sequences are derived from multiple forms of pre-detected points.
Existing methods tend to improve GCNs by leveraging multi-form skeletons due to their complementary cues.
We present Adaptive Cross-Form Learning (ACFL), which empowers well-designed GCNs to generate complementary representation from single-form skeletons.
- Score: 75.92422282666767
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Skeleton-based action recognition aims to project skeleton sequences to
action categories, where skeleton sequences are derived from multiple forms of
pre-detected points. Compared with earlier methods that focus on exploring
single-form skeletons via Graph Convolutional Networks (GCNs), existing methods
tend to improve GCNs by leveraging multi-form skeletons due to their
complementary cues. However, these methods (either adapting the structure of GCNs
or ensembling models) require the co-existence of all forms of skeletons during
both training and inference stages, while a typical situation in real life is
the existence of only partial forms for inference. To tackle this issue, we
present Adaptive Cross-Form Learning (ACFL), which empowers well-designed GCNs
to generate complementary representation from single-form skeletons without
changing model capacity. Specifically, each GCN model in ACFL not only learns
action representation from the single-form skeletons, but also adaptively
mimics useful representations derived from other forms of skeletons. In this
way, each GCN can learn how to strengthen what has been learned, thus
exploiting model potential and facilitating action recognition as well.
Extensive experiments conducted on three challenging benchmarks, i.e.,
NTU-RGB+D 120, NTU-RGB+D 60 and UAV-Human, demonstrate the effectiveness and
generalizability of the proposed method. Specifically, ACFL significantly
improves various GCN models (i.e., CTR-GCN, MS-G3D, and Shift-GCN), achieving a
new record for skeleton-based action recognition.
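The cross-form idea in the abstract (a single-form model that also mimics useful representations from another form) can be illustrated with a small loss function. This is only a sketch under stated assumptions, not the paper's formulation: the function name `cross_form_loss`, the weight `alpha`, the use of KL divergence as the mimicry term, and the "teacher is correct" gate as a stand-in for "adaptive" selection are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def cross_form_loss(student_logits, teacher_logits, labels, alpha=1.0):
    """Cross-entropy on the student's own form plus a gated mimicry term.

    student_logits: (N, C) logits from the model fed one skeleton form.
    teacher_logits: (N, C) logits from a model fed another form.
    labels:         (N,) integer class labels.
    alpha:          hypothetical weight of the mimicry term.
    """
    eps = 1e-12
    n = len(labels)
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)

    # Standard supervised term on the student's own form.
    ce = -np.log(p_s[np.arange(n), labels] + eps)

    # Assumed gate: mimic the other form only where it predicts correctly.
    gate = (p_t.argmax(axis=-1) == labels).astype(float)

    # KL(teacher || student) as an illustrative mimicry term.
    kl = (p_t * (np.log(p_t + eps) - np.log(p_s + eps))).sum(axis=-1)

    return float((ce + alpha * gate * kl).mean())
```

In this sketch, only the student's logits are needed at inference time, which mirrors the abstract's point that a single form should suffice once training is done. With `alpha=0.0`, or when the other form misclassifies every sample, the loss reduces to plain cross-entropy.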
Related papers
- Hierarchical Skeleton Meta-Prototype Contrastive Learning with Hard Skeleton Mining for Unsupervised Person Re-Identification [70.90142717649785]
This paper proposes a generic unsupervised Hierarchical skeleton Meta-Prototype Contrastive learning (Hi-MPC) approach with Hard Skeleton Mining (HSM) for person re-ID with unlabeled 3D skeletons.
By converting original prototypes into meta-prototypes with multiple homogeneous transformations, we induce the model to learn the inherent consistency of prototypes to capture more effective skeleton features for person re-ID.
arXiv Detail & Related papers (2023-07-24T16:18:22Z)
- SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training [110.55093254677638]
We propose an efficient skeleton sequence learning framework, named Skeleton Sequence Learning (SSL).
In this paper, we build an asymmetric graph-based encoder-decoder pre-training architecture named SkeletonMAE.
Our SSL generalizes well across different datasets and outperforms the state-of-the-art self-supervised skeleton-based action recognition methods.
arXiv Detail & Related papers (2023-07-17T13:33:11Z)
- Overcoming Topology Agnosticism: Enhancing Skeleton-Based Action Recognition through Redefined Skeletal Topology Awareness [24.83836008577395]
Graph Convolutional Networks (GCNs) have long defined the state-of-the-art in skeleton-based action recognition.
They tend to optimize the adjacency matrix jointly with the model weights.
This process causes a gradual decay of bone connectivity data, culminating in a model indifferent to the very topology it sought to map.
We propose an innovative pathway that encodes bone connectivity by harnessing the power of graph distances.
arXiv Detail & Related papers (2023-05-19T06:40:12Z)
- Graph Contrastive Learning for Skeleton-based Action Recognition [85.86820157810213]
We propose a graph contrastive learning framework for skeleton-based action recognition.
SkeletonGCL associates graph learning across sequences by enforcing graphs to be class-discriminative.
SkeletonGCL establishes a new training paradigm, and it can be seamlessly incorporated into current graph convolutional networks.
arXiv Detail & Related papers (2023-01-26T02:09:16Z)
- DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition [77.87404524458809]
We propose a new framework for skeleton-based action recognition, namely Dynamic Group Spatio-Temporal GCN (DG-STGCN).
It consists of two modules, DG-GCN and DG-TCN, for spatial and temporal modeling, respectively.
DG-STGCN consistently outperforms state-of-the-art methods, often by a notable margin.
arXiv Detail & Related papers (2022-10-12T03:17:37Z)
- Pose-Guided Graph Convolutional Networks for Skeleton-Based Action Recognition [32.07659338674024]
Graph convolutional networks (GCNs) can model the human body skeletons as spatial and temporal graphs.
In this work, we propose pose-guided GCN (PG-GCN), a multi-modal framework for high-performance human action recognition.
The core idea of this module is to use a trainable graph to aggregate features from the skeleton stream with those of the pose stream, which leads to a network with more robust feature representation ability.
arXiv Detail & Related papers (2022-10-10T02:08:49Z)
- SimMC: Simple Masked Contrastive Learning of Skeleton Representations for Unsupervised Person Re-Identification [63.903237777588316]
We present a generic Simple Masked Contrastive learning (SimMC) framework to learn effective representations from unlabeled 3D skeletons for person re-ID.
Specifically, to fully exploit skeleton features within each skeleton sequence, we first devise a masked prototype contrastive learning (MPC) scheme.
Then, we propose the masked intra-sequence contrastive learning (MIC) to capture intra-sequence pattern consistency between subsequences.
arXiv Detail & Related papers (2022-04-21T00:19:38Z)
- Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition [22.90127409366107]
We propose an efficient but strong baseline based on Graph Convolutional Network (GCN).
Inspired by the success of the ResNet architecture in Convolutional Neural Network (CNN), a ResGCN module is introduced in GCN.
A PartAtt block is proposed to discover the most essential body parts over a whole action sequence.
arXiv Detail & Related papers (2020-10-20T02:56:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.