A Survey on 3D Skeleton-Based Action Recognition Using Learning Method
- URL: http://arxiv.org/abs/2002.05907v1
- Date: Fri, 14 Feb 2020 08:12:12 GMT
- Title: A Survey on 3D Skeleton-Based Action Recognition Using Learning Method
- Authors: Bin Ren, Mengyuan Liu, Runwei Ding, Hong Liu
- Abstract summary: 3D skeleton-based action recognition, owing to the latent advantages of skeleton, has been an active topic in computer vision.
This survey first highlights the necessity of action recognition and the significance of 3D skeleton data.
It then gives a comprehensive introduction to mainstream Recurrent Neural Network (RNN)-based, Convolutional Neural Network (CNN)-based and Graph Convolutional Network (GCN)-based action recognition techniques.
- Score: 20.865811389226234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D skeleton-based action recognition, owing to the latent advantages of
skeleton, has been an active topic in computer vision. As a consequence, many
impressive works, based on both conventional handcrafted features and learned
features, have been produced over the years. However, previous surveys
of action recognition mostly focus on methods dominated by video or RGB data,
and the few existing reviews related to skeleton data mainly
cover the representation of skeleton data or the performance of some classic
techniques on a certain dataset. Besides, although deep learning methods have been
applied to this field for years, no existing work offers an
introduction or review from the perspective of deep learning architectures. To
break those limitations, this survey first highlights the necessity of action
recognition and the significance of 3D skeleton data. Then a comprehensive
introduction to mainstream Recurrent Neural Network (RNN)-based, Convolutional Neural
Network (CNN)-based and Graph Convolutional Network (GCN)-based
action recognition techniques is given in a data-driven manner. Finally,
we briefly discuss the largest 3D skeleton dataset, NTU-RGB+D, and its
new edition, NTU-RGB+D 120, along with several top-ranked
algorithms on those two datasets. To the best of our knowledge, this is the first
work to give an overall discussion of deep learning-based action
recognition using 3D skeleton data.
Related papers
- Improving Video Violence Recognition with Human Interaction Learning on 3D Skeleton Point Clouds [88.87985219999764]
We develop a method for video violence recognition from a new perspective of skeleton points.
We first formulate 3D skeleton point clouds from human sequences extracted from videos.
We then perform interaction learning on these 3D skeleton point clouds.
arXiv Detail & Related papers (2023-08-26T12:55:18Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition [65.78703941973183]
We propose a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder.
Specifically, the CD-JBF-GC can explore the motion transmission between the joint stream and the bone stream.
The pose prediction based auto-encoder in the self-supervised training stage allows the network to learn motion representation from unlabeled data.
arXiv Detail & Related papers (2022-02-08T16:03:15Z)
- Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition [49.163326827954656]
We propose a novel multi-granular spatio-temporal graph network for skeleton-based action classification.
We develop a dual-head graph network consisting of two interleaved branches, which enables us to extract at least two spatio-temporal resolutions.
We conduct extensive experiments on three large-scale datasets.
arXiv Detail & Related papers (2021-08-10T09:25:07Z)
- UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition [11.81043814295441]
We introduce UNIK, a novel skeleton-based action recognition method that is able to generalize across datasets.
To study the cross-domain generalizability of action recognition in real-world videos, we re-evaluate state-of-the-art approaches as well as the proposed UNIK.
Results show that the proposed UNIK, with pre-training on Posetics, generalizes well and outperforms state-of-the-art when transferred onto four target action classification datasets.
arXiv Detail & Related papers (2021-07-19T02:00:28Z)
- Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- KShapeNet: Riemannian network on Kendall shape space for Skeleton based Action Recognition [7.183483982542308]
We propose a geometry-aware deep learning approach for skeleton-based action recognition.
Skeletons are first modeled as trajectories on Kendall's shape space and then mapped to the linear tangent space.
The resulting structured data are then fed to a deep learning architecture, which includes a layer that optimizes over rigid and non-rigid transformations.
arXiv Detail & Related papers (2020-11-24T10:14:07Z)
- Unifying Graph Embedding Features with Graph Convolutional Networks for Skeleton-based Action Recognition [18.001693718043292]
We propose a novel framework, which unifies 15 graph embedding features into the graph convolutional network for human action recognition.
Our model is validated by three large-scale datasets, namely NTU-RGB+D, Kinetics and SYSU-3D.
arXiv Detail & Related papers (2020-03-06T02:31:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.