Skeleton-Contrastive 3D Action Representation Learning
- URL: http://arxiv.org/abs/2108.03656v1
- Date: Sun, 8 Aug 2021 14:44:59 GMT
- Title: Skeleton-Contrastive 3D Action Representation Learning
- Authors: Fida Mohammad Thoker, Hazel Doughty, Cees G.M. Snoek
- Abstract summary: This paper strives for self-supervised learning of a feature space suitable for skeleton-based action recognition.
Our approach achieves state-of-the-art performance for self-supervised learning from skeleton data on the challenging PKU and NTU datasets.
- Score: 35.06361753065124
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper strives for self-supervised learning of a feature space suitable
for skeleton-based action recognition. Our proposal is built upon learning
invariances to input skeleton representations and various skeleton
augmentations via a noise contrastive estimation. In particular, we propose
inter-skeleton contrastive learning, which learns from multiple different input
skeleton representations in a cross-contrastive manner. In addition, we
contribute several skeleton-specific spatial and temporal augmentations which
further encourage the model to learn the spatio-temporal dynamics of skeleton
data. By learning similarities between different skeleton representations as
well as augmented views of the same sequence, the network is encouraged to
learn higher-level semantics of the skeleton data than when only using the
augmented views. Our approach achieves state-of-the-art performance for
self-supervised learning from skeleton data on the challenging PKU and NTU
datasets with multiple downstream tasks, including action recognition, action
retrieval and semi-supervised learning. Code is available at
https://github.com/fmthoker/skeleton-contrast.
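To make the cross-contrastive idea in the abstract concrete, here is a minimal sketch of a noise contrastive (InfoNCE-style) loss applied across two skeleton representations of the same sequence. It is an illustration under stated assumptions, not the authors' implementation: the function names (`info_nce`, `cross_contrastive_loss`), the temperature value, and the use of explicit negative lists are all hypothetical choices made for this example, and embeddings are plain Python lists rather than encoder outputs.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query: negative log-softmax of the positive pair.

    The temperature default is an assumption for illustration only.
    """
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(positive)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


def cross_contrastive_loss(emb_a, emb_b, negs_a, negs_b, temperature=0.07):
    """Hypothetical cross-representation loss.

    emb_a and emb_b are embeddings of the SAME sequence under two different
    skeleton representations. Each one serves as the query against the other
    representation's positive and that representation's negatives.
    """
    loss_ab = info_nce(emb_a, emb_b, negs_b, temperature)
    loss_ba = info_nce(emb_b, emb_a, negs_a, temperature)
    return loss_ab + loss_ba


if __name__ == "__main__":
    # Toy 2-D embeddings; real embeddings would come from trained encoders.
    loss = cross_contrastive_loss(
        emb_a=[1.0, 0.0],
        emb_b=[0.9, 0.1],
        negs_a=[[0.0, 1.0]],
        negs_b=[[-1.0, 0.0]],
    )
    print(f"cross-contrastive loss: {loss:.6f}")
```

The loss is small when the two representations of a sequence agree with each other and disagree with the negatives, and grows when the positive pair is less similar than a negative. This is the sense in which agreement across representations, rather than only across augmented views, is rewarded.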
Related papers
- Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning [20.34477942813382]
Skeleton-based action representation learning aims to interpret and understand human behaviors by encoding the skeleton sequences.
We introduce a novel skeleton-based training framework based on Cross-modal Contrastive learning.
Our method outperforms the previous methods and achieves state-of-the-art results.
arXiv Detail & Related papers (2024-05-31T03:40:15Z)
- SkeleTR: Towrads Skeleton-based Action Recognition in the Wild [86.03082891242698]
SkeleTR is a new framework for skeleton-based action recognition.
It first models the intra-person skeleton dynamics for each skeleton sequence with graph convolutions.
It then uses stacked Transformer encoders to capture person interactions that are important for action recognition in general scenarios.
arXiv Detail & Related papers (2023-09-20T16:22:33Z)
- SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training [110.55093254677638]
We propose an efficient skeleton sequence learning framework, named Skeleton Sequence Learning (SSL).
In this paper, we build an asymmetric graph-based encoder-decoder pre-training architecture named SkeletonMAE.
Our SSL generalizes well across different datasets and outperforms the state-of-the-art self-supervised skeleton-based action recognition methods.
arXiv Detail & Related papers (2023-07-17T13:33:11Z)
- One-Shot Action Recognition via Multi-Scale Spatial-Temporal Skeleton Matching [77.6989219290789]
One-shot skeleton action recognition aims to learn a skeleton action recognition model with a single training sample.
This paper presents a novel one-shot skeleton action recognition technique that handles skeleton action recognition via multi-scale spatial-temporal feature matching.
arXiv Detail & Related papers (2023-07-14T11:52:10Z)
- Contrastive Self-Supervised Learning for Skeleton Representations [2.528877542605869]
We use a contrastive self-supervised learning method, SimCLR, to learn representations that capture the semantics of skeleton point clouds.
To pre-train the representations, we normalise six existing datasets to obtain more than 40 million skeleton frames.
We evaluate the quality of the learned representations with three downstream tasks: skeleton reconstruction, motion prediction, and activity classification.
arXiv Detail & Related papers (2022-11-10T02:45:36Z)
- Skeleton Prototype Contrastive Learning with Multi-Level Graph Relation Modeling for Unsupervised Person Re-Identification [63.903237777588316]
Person re-identification (re-ID) via 3D skeletons is an important emerging topic with many merits.
Existing solutions rarely explore valuable body-component relations in skeletal structure or motion.
This paper proposes a generic unsupervised Prototype Contrastive learning paradigm with Multi-level Graph Relation learning.
arXiv Detail & Related papers (2022-08-25T00:59:32Z)
- Contrastive Learning from Spatio-Temporal Mixed Skeleton Sequences for Self-Supervised Skeleton-Based Action Recognition [21.546894064451898]
We show that directly extending contrastive pairs based on normal augmentations brings limited returns in terms of performance.
We propose SkeleMixCLR: a contrastive learning framework with a spatio-temporal skeleton mixing augmentation (SkeleMix) to complement current contrastive learning approaches.
arXiv Detail & Related papers (2022-07-07T03:18:09Z)
- SimMC: Simple Masked Contrastive Learning of Skeleton Representations for Unsupervised Person Re-Identification [63.903237777588316]
We present a generic Simple Masked Contrastive learning (SimMC) framework to learn effective representations from unlabeled 3D skeletons for person re-ID.
Specifically, to fully exploit skeleton features within each skeleton sequence, we first devise a masked prototype contrastive learning (MPC) scheme.
Then, we propose the masked intra-sequence contrastive learning (MIC) to capture intra-sequence pattern consistency between subsequences.
arXiv Detail & Related papers (2022-04-21T00:19:38Z)
- A Self-Supervised Gait Encoding Approach with Locality-Awareness for 3D Skeleton Based Person Re-Identification [65.18004601366066]
Person re-identification (Re-ID) via gait features within 3D skeleton sequences is a newly-emerging topic with several advantages.
This paper proposes a self-supervised gait encoding approach that can leverage unlabeled skeleton data to learn gait representations for person Re-ID.
arXiv Detail & Related papers (2020-09-05T16:06:04Z)