Heterogeneous Skeleton-Based Action Representation Learning
- URL: http://arxiv.org/abs/2506.03481v1
- Date: Wed, 04 Jun 2025 01:39:13 GMT
- Title: Heterogeneous Skeleton-Based Action Representation Learning
- Authors: Hongsong Wang, Xiaoyan Ma, Jidong Kuang, Jie Gui
- Abstract summary: This work addresses the challenge of heterogeneous skeleton-based action representation learning. The proposed framework comprises two primary components: heterogeneous skeleton processing and unified representation learning. Our approach can be applied to action recognition in robots with different humanoid structures.
- Score: 12.690541950946358
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Skeleton-based human action recognition has received widespread attention in recent years due to its diverse range of application scenarios. Because human skeletons come from different sources, skeleton data naturally exhibit heterogeneity. Previous works, however, overlook this heterogeneity and construct models tailored solely to homogeneous skeletons. This work addresses the challenge of heterogeneous skeleton-based action representation learning, specifically focusing on processing skeleton data that vary in joint dimensions and topological structures. The proposed framework comprises two primary components: heterogeneous skeleton processing and unified representation learning. The former first converts two-dimensional skeleton data into three-dimensional skeletons via an auxiliary network, and then constructs a prompted unified skeleton using skeleton-specific prompts. We also design an additional modality named semantic motion encoding to harness the semantic information within skeletons. The latter learns a unified action representation using a shared backbone network that processes the different heterogeneous skeletons. Extensive experiments on the NTU-60, NTU-120, and PKU-MMD II datasets demonstrate the effectiveness of our method in various tasks of action understanding. Our approach can be applied to action recognition in robots with different humanoid structures.
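The two-stage pipeline described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the zero-depth lifting stands in for the auxiliary 2D-to-3D network, the random prompt tables stand in for learned skeleton-specific prompts, and the mean-pool-plus-linear function stands in for the shared backbone. All names (`UNIFIED_JOINTS`, `lift_to_3d`, `build_unified_skeleton`, `shared_backbone`) and the joint counts are assumptions chosen for illustration.

```python
import numpy as np

UNIFIED_JOINTS = 25  # assumed joint count of the unified skeleton


def lift_to_3d(skeleton_2d):
    """Stand-in for the auxiliary 2D-to-3D network: appends a zero depth channel."""
    frames, joints, _ = skeleton_2d.shape
    depth = np.zeros((frames, joints, 1), dtype=skeleton_2d.dtype)
    return np.concatenate([skeleton_2d, depth], axis=-1)


def build_unified_skeleton(skeleton, prompt):
    """Map a (frames, joints, 2 or 3) skeleton to (frames, UNIFIED_JOINTS, 3).

    Unused joint slots are filled with rows of a skeleton-specific prompt
    of shape (UNIFIED_JOINTS, 3); in a real model these would be learned.
    """
    if skeleton.shape[-1] == 2:
        skeleton = lift_to_3d(skeleton)
    frames, joints, _ = skeleton.shape
    missing = UNIFIED_JOINTS - joints
    if missing < 0:
        raise ValueError("skeleton has more joints than the unified layout")
    # Repeat the same prompt rows in every frame.
    pad = np.broadcast_to(prompt[:missing], (frames, missing, 3))
    return np.concatenate([skeleton, pad.astype(skeleton.dtype)], axis=1)


def shared_backbone(unified, weights):
    """Stand-in backbone: average over time, then apply one linear projection."""
    pooled = unified.reshape(unified.shape[0], -1).mean(axis=0)
    return pooled @ weights


rng = np.random.default_rng(0)
weights = rng.normal(size=(UNIFIED_JOINTS * 3, 64))   # shared across skeleton types
prompt_2d = rng.normal(size=(UNIFIED_JOINTS, 3))      # prompt for a 17-joint 2D source
prompt_3d = rng.normal(size=(UNIFIED_JOINTS, 3))      # prompt for a 25-joint 3D source

clip_2d = rng.normal(size=(30, 17, 2))  # 30 frames, 17 joints, (x, y)
clip_3d = rng.normal(size=(30, 25, 3))  # 30 frames, 25 joints, (x, y, z)

feat_2d = shared_backbone(build_unified_skeleton(clip_2d, prompt_2d), weights)
feat_3d = shared_backbone(build_unified_skeleton(clip_3d, prompt_3d), weights)
print(feat_2d.shape, feat_3d.shape)
```

The point of the sketch is only that skeletons with different joint counts and dimensions end up in one tensor layout, so a single set of backbone weights can produce features of the same size for both.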
Related papers
- Motif Guided Graph Transformer with Combinatorial Skeleton Prototype Learning for Skeleton-Based Person Re-Identification [60.939250172443586]
Person re-identification (re-ID) via 3D skeleton data is a challenging task with significant value in many scenarios. Existing skeleton-based methods typically assume virtual motion relations between all joints, and adopt average joint or sequence representations for learning. This paper presents a generic Motif guided graph transformer with Combinatorial skeleton prototype learning (MoCos). MoCos exploits structure-specific and gait-related body relations as well as features of skeleton graphs to learn effective skeleton representations for person re-ID.
arXiv Detail & Related papers (2024-12-12T08:13:29Z)
- Hierarchical Skeleton Meta-Prototype Contrastive Learning with Hard Skeleton Mining for Unsupervised Person Re-Identification [70.90142717649785]
This paper proposes a generic unsupervised Hierarchical skeleton Meta-Prototype Contrastive learning (Hi-MPC) approach with Hard Skeleton Mining (HSM) for person re-ID with unlabeled 3D skeletons.
By converting original prototypes into meta-prototypes with multiple homogeneous transformations, we induce the model to learn the inherent consistency of prototypes to capture more effective skeleton features for person re-ID.
arXiv Detail & Related papers (2023-07-24T16:18:22Z)
- SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training [110.55093254677638]
We propose an efficient skeleton sequence learning framework, named Skeleton Sequence Learning (SSL).
In this paper, we build an asymmetric graph-based encoder-decoder pre-training architecture named SkeletonMAE.
Our SSL generalizes well across different datasets and outperforms the state-of-the-art self-supervised skeleton-based action recognition methods.
arXiv Detail & Related papers (2023-07-17T13:33:11Z)
- Skeleton Prototype Contrastive Learning with Multi-Level Graph Relation Modeling for Unsupervised Person Re-Identification [63.903237777588316]
Person re-identification (re-ID) via 3D skeletons is an important emerging topic with many merits.
Existing solutions rarely explore valuable body-component relations in skeletal structure or motion.
This paper proposes a generic unsupervised Prototype Contrastive learning paradigm with Multi-level Graph Relation learning.
arXiv Detail & Related papers (2022-08-25T00:59:32Z)
- Skeleton-Contrastive 3D Action Representation Learning [35.06361753065124]
This paper strives for self-supervised learning of a feature space suitable for skeleton-based action recognition.
Our approach achieves state-of-the-art performance for self-supervised learning from skeleton data on the challenging PKU and NTU datasets.
arXiv Detail & Related papers (2021-08-08T14:44:59Z)
- Skeleton-Aware Networks for Deep Motion Retargeting [83.65593033474384]
We introduce a novel deep learning framework for data-driven motion retargeting between skeletons.
Our approach learns how to retarget without requiring any explicit pairing between the motions in the training set.
arXiv Detail & Related papers (2020-05-12T12:51:40Z)
- What and Where: Modeling Skeletons from Semantic and Spatial Perspectives for Action Recognition [46.836815779215456]
We propose to model skeletons from a novel spatial perspective, from which the model takes the spatial location as prior knowledge to group human joints.
From the semantic perspective, we propose a Transformer-like network that is adept at modeling joint correlations.
From the spatial perspective, we transform the skeleton data into the sparse format for efficient feature extraction.
arXiv Detail & Related papers (2020-04-07T10:53:45Z)