Self-Supervised 3D Action Representation Learning with Skeleton Cloud
Colorization
- URL: http://arxiv.org/abs/2304.08799v3
- Date: Mon, 16 Oct 2023 08:41:20 GMT
- Title: Self-Supervised 3D Action Representation Learning with Skeleton Cloud
Colorization
- Authors: Siyuan Yang, Jun Liu, Shijian Lu, Er Meng Hwa, Yongjian Hu, Alex C.
Kot
- Abstract summary: 3D Skeleton-based human action recognition has attracted increasing attention in recent years.
Most of the existing work focuses on supervised learning which requires a large number of labeled action sequences.
In this paper, we address self-supervised 3D action representation learning for skeleton-based action recognition.
- Score: 75.0912240667375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D Skeleton-based human action recognition has attracted increasing attention
in recent years. Most of the existing work focuses on supervised learning which
requires a large number of labeled action sequences that are often expensive
and time-consuming to annotate. In this paper, we address self-supervised 3D
action representation learning for skeleton-based action recognition. We
investigate self-supervised representation learning and design a novel skeleton
cloud colorization technique that is capable of learning spatial and temporal
skeleton representations from unlabeled skeleton sequence data. We represent a
skeleton action sequence as a 3D skeleton cloud and colorize each point in the
cloud according to its temporal and spatial orders in the original
(unannotated) skeleton sequence. Leveraging the colorized skeleton point cloud,
we design an auto-encoder framework that can learn spatial-temporal features
from the artificial color labels of skeleton joints effectively. Specifically,
we design a two-stream pretraining network that leverages fine-grained and
coarse-grained colorization to learn multi-scale spatial-temporal features. In
addition, we design a Masked Skeleton Cloud Repainting task that can pretrain
the designed auto-encoder framework to learn informative representations. We
evaluate our skeleton cloud colorization approach with linear classifiers
trained under different configurations, including unsupervised,
semi-supervised, fully-supervised, and transfer learning settings. Extensive
experiments on NTU RGB+D, NTU RGB+D 120, PKU-MMD, NW-UCLA, and UWA3D datasets
show that the proposed method outperforms existing unsupervised and
semi-supervised 3D action recognition methods by large margins and achieves
competitive performance in supervised 3D action recognition as well.
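
The colorization step described in the abstract (stack a skeleton sequence into a point cloud, then label each point by its temporal and spatial order) can be illustrated with a minimal sketch. The abstract does not give the exact color mapping, so the linear ramp to [0, 1] used here is an assumption, and the function name `colorize_skeleton_cloud` and the two-component label layout are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Label = Tuple[float, float]


def colorize_skeleton_cloud(
    sequence: Sequence[Sequence[Point]],
) -> Tuple[List[Point], List[Label]]:
    """Flatten a skeleton sequence into a point cloud with order-based labels.

    `sequence` holds T frames, each with J (x, y, z) joints.
    Returns T*J points and, for each point, a (temporal, spatial) label.
    Assumption: both labels are simple linear ramps in [0, 1]; the paper's
    actual color scheme may differ.
    """
    num_frames = len(sequence)
    if num_frames == 0:
        raise ValueError("sequence must contain at least one frame")
    num_joints = len(sequence[0])
    if num_joints == 0 or any(len(frame) != num_joints for frame in sequence):
        raise ValueError("every frame must have the same non-zero joint count")

    points: List[Point] = []
    labels: List[Label] = []
    for t, frame in enumerate(sequence):
        # Temporal label: position of the frame within the sequence.
        temporal = t / (num_frames - 1) if num_frames > 1 else 0.0
        for j, joint in enumerate(frame):
            # Spatial label: position of the joint in the skeleton's joint order.
            spatial = j / (num_joints - 1) if num_joints > 1 else 0.0
            points.append((joint[0], joint[1], joint[2]))
            labels.append((temporal, spatial))
    return points, labels
```

In a pretraining setup like the one the abstract outlines, an auto-encoder would receive only `points` and be trained to predict `labels`, so the supervision signal comes from the sequence's own ordering rather than from action annotations.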
Related papers
- Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
An SPCV re-organizes a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where the pixel values correspond to the 3D coordinates of points.
arXiv Detail & Related papers (2024-03-02T08:18:57Z)
- Skeleton2vec: A Self-supervised Learning Framework with Contextualized Target Representations for Skeleton Sequence [56.092059713922744]
We show that using high-level contextualized features as prediction targets can achieve superior performance.
Specifically, we propose Skeleton2vec, a simple and efficient self-supervised 3D action representation learning framework.
Our proposed Skeleton2vec outperforms previous methods and achieves state-of-the-art results.
arXiv Detail & Related papers (2024-01-01T12:08:35Z)
- Contrastive Self-Supervised Learning for Skeleton Representations [2.528877542605869]
We use a contrastive self-supervised learning method, SimCLR, to learn representations that capture the semantics of skeleton point clouds.
To pre-train the representations, we normalise six existing datasets to obtain more than 40 million skeleton frames.
We evaluate the quality of the learned representations with three downstream tasks: skeleton reconstruction, motion prediction, and activity classification.
arXiv Detail & Related papers (2022-11-10T02:45:36Z)
- Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds [96.9027094562957]
We introduce a spatio-temporal representation learning (STRL) framework, capable of learning from unlabeled 3D point clouds.
Inspired by how infants learn from visual data in the wild, we explore rich cues derived from the 3D data.
STRL takes two temporally related frames from a 3D point cloud sequence as input, transforms them with spatial data augmentation, and learns an invariant representation in a self-supervised manner.
arXiv Detail & Related papers (2021-09-01T04:17:11Z)
- Skeleton-Contrastive 3D Action Representation Learning [35.06361753065124]
This paper strives for self-supervised learning of a feature space suitable for skeleton-based action recognition.
Our approach achieves state-of-the-art performance for self-supervised learning from skeleton data on the challenging PKU and NTU datasets.
arXiv Detail & Related papers (2021-08-08T14:44:59Z)
- Skeleton Cloud Colorization for Unsupervised 3D Action Representation Learning [65.88887113157627]
Skeleton-based human action recognition has attracted increasing attention in recent years.
We design a novel skeleton cloud colorization technique that is capable of learning skeleton representations from unlabeled skeleton sequence data.
We show that the proposed method outperforms existing unsupervised and semi-supervised 3D action recognition methods by large margins.
arXiv Detail & Related papers (2021-08-04T10:55:39Z)
- A Self-Supervised Gait Encoding Approach with Locality-Awareness for 3D Skeleton Based Person Re-Identification [65.18004601366066]
Person re-identification (Re-ID) via gait features within 3D skeleton sequences is a newly-emerging topic with several advantages.
This paper proposes a self-supervised gait encoding approach that can leverage unlabeled skeleton data to learn gait representations for person Re-ID.
arXiv Detail & Related papers (2020-09-05T16:06:04Z)