3D Skeleton-Based Action Recognition: A Review
- URL: http://arxiv.org/abs/2506.00915v1
- Date: Sun, 01 Jun 2025 09:04:12 GMT
- Title: 3D Skeleton-Based Action Recognition: A Review
- Authors: Mengyuan Liu, Hong Liu, Qianshuo Hu, Bin Ren, Junsong Yuan, Jiaying Lin, Jiajun Wen,
- Abstract summary: 3D skeleton-based action recognition has become a prominent topic in the field of computer vision. Previous reviews have predominantly adopted a model-oriented perspective, often neglecting the fundamental steps involved in skeleton-based action recognition. This review aims to address these limitations by presenting a comprehensive, task-oriented framework for understanding skeleton-based action recognition.
- Score: 60.0580120274659
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the inherent advantages of skeleton representation, 3D skeleton-based action recognition has become a prominent topic in the field of computer vision. However, previous reviews have predominantly adopted a model-oriented perspective, often neglecting the fundamental steps involved in skeleton-based action recognition. This oversight tends to ignore key components of skeleton-based action recognition beyond model design and has hindered deeper, more intrinsic understanding of the task. To bridge this gap, our review aims to address these limitations by presenting a comprehensive, task-oriented framework for understanding skeleton-based action recognition. We begin by decomposing the task into a series of sub-tasks, placing particular emphasis on preprocessing steps such as modality derivation and data augmentation. The subsequent discussion delves into critical sub-tasks, including feature extraction and spatio-temporal modeling techniques. Beyond foundational action recognition networks, recently advanced frameworks such as hybrid architectures, Mamba models, large language models (LLMs), and generative models have also been highlighted. Finally, a comprehensive overview of public 3D skeleton datasets is presented, accompanied by an analysis of state-of-the-art algorithms evaluated on these benchmarks. By integrating task-oriented discussions, comprehensive examinations of sub-tasks, and an emphasis on the latest advancements, our review provides a fundamental and accessible structured roadmap for understanding and advancing the field of 3D skeleton-based action recognition.
Related papers
- Enhancing Spatial Reasoning in Multimodal Large Language Models through Reasoning-based Segmentation [50.81551581148339]
We introduce Relevant Reasoning (R$^2$S), a reasoning-based segmentation framework.
We also introduce 3D ReasonSeg, a reasoning-based segmentation dataset.
Both experiments demonstrate that R$^2$S and 3D ReasonSeg effectively endow 3D point cloud perception with stronger spatial reasoning capabilities.
arXiv Detail & Related papers (2025-06-29T06:58:08Z)
- Collaborative Learning for 3D Hand-Object Reconstruction and Compositional Action Recognition from Egocentric RGB Videos Using Superquadrics [31.819336585007104]
We propose to leverage superquadrics as an alternative 3D object representation to bounding boxes.
We demonstrate their effectiveness on both template-free object reconstruction and action recognition tasks.
We also study the compositionality of actions by considering a more challenging task where the training combinations of verbs and nouns do not overlap with the testing split.
arXiv Detail & Related papers (2025-01-13T07:26:05Z)
- On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey [82.49623756124357]
Zero-shot image recognition (ZSIR) aims to recognize and reason in unseen domains by learning generalized knowledge from limited data.
This paper thoroughly investigates recent advances in element-wise ZSIR and provides a basis for its future development.
arXiv Detail & Related papers (2024-08-09T05:49:21Z)
- Self-Supervised Skeleton-Based Action Representation Learning: A Benchmark and Beyond [19.074841631219233]
Self-supervised learning (SSL) has been proven effective for skeleton-based action understanding.
In this paper, we conduct a comprehensive survey on self-supervised skeleton-based action representation learning.
arXiv Detail & Related papers (2024-06-05T06:21:54Z)
- Recognizing Identities From Human Skeletons: A Survey on 3D Skeleton Based Person Re-Identification [60.939250172443586]
Person re-identification via 3D skeletons is an important emerging research area that attracts increasing attention within the pattern recognition community.
We provide a comprehensive review and analysis of recent SRID advances.
A thorough evaluation of state-of-the-art SRID methods is conducted over various types of benchmarks and protocols to compare their effectiveness and efficiency.
arXiv Detail & Related papers (2024-01-27T04:52:24Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- ANUBIS: Review and Benchmark Skeleton-Based Action Recognition Methods with a New Dataset [26.581495230711198]
We present a review in the form of a taxonomy on existing works of skeleton-based action recognition.
To promote fairer and more comprehensive evaluation, we collect ANUBIS, a large-scale human skeleton dataset.
arXiv Detail & Related papers (2022-05-04T14:03:43Z)
- Revisiting spatio-temporal layouts for compositional action recognition [63.04778884595353]
We take an object-centric approach to action recognition.
The main focus of this paper is compositional/few-shot action recognition.
We demonstrate how to improve the performance of appearance-based models by fusion with layout-based models.
arXiv Detail & Related papers (2021-11-02T23:04:39Z)
- A Survey on 3D Skeleton-Based Action Recognition Using Learning Method [20.865811389226234]
3D skeleton-based action recognition, owing to the latent advantages of skeleton, has been an active topic in computer vision.
This survey first highlights the necessity of action recognition and the significance of 3D skeleton data.
It then gives a comprehensive introduction to mainstream action recognition techniques based on Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and Graph Convolutional Networks (GCNs).
arXiv Detail & Related papers (2020-02-14T08:12:12Z)