Skeleton-based Human Action Recognition via Convolutional Neural
Networks (CNN)
- URL: http://arxiv.org/abs/2301.13360v1
- Date: Tue, 31 Jan 2023 01:26:17 GMT
- Title: Skeleton-based Human Action Recognition via Convolutional Neural
Networks (CNN)
- Authors: Ayman Ali, Ekkasit Pinyoanuntapong, Pu Wang, Mohsen Dorodchi
- Abstract summary: Most state-of-the-art contributions in skeleton-based action recognition incorporate a Graph Neural Network (GCN) architecture for representing the human body and extracting features.
Our research demonstrates that Convolutional Neural Networks (CNNs) can attain comparable results to GCN, provided that the proper training techniques, augmentations, and augmentations are applied.
- Score: 4.598337780022892
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, there has been a remarkable increase in the interest towards
skeleton-based action recognition within the research community, owing to its
various advantageous features, including computational efficiency,
representative features, and illumination invariance. Despite this, researchers
continue to explore and investigate the most optimal way to represent human
actions through skeleton representation and the extracted features. As a
result, the growth and availability of human action recognition datasets have
risen substantially. In addition, deep learning-based algorithms have gained
widespread popularity due to the remarkable advancements in various computer
vision tasks. Most state-of-the-art contributions in skeleton-based action
recognition incorporate a Graph Neural Network (GCN) architecture for
representing the human body and extracting features. Our research demonstrates
that Convolutional Neural Networks (CNNs) can attain comparable results to GCN,
provided that the proper training techniques, augmentations, and optimizers are
applied. Our approach has been rigorously validated, and we have achieved a
score of 95% on the NTU-60 dataset
Related papers
- AutoGCN -- Towards Generic Human Activity Recognition with Neural
Architecture Search [0.16385815610837165]
This paper introduces AutoGCN, a generic Neural Architecture Search (NAS) algorithm for Human Activity Recognition (HAR) using Graph Convolution Networks (GCNs)
We conduct extensive experiments on two large-scale datasets focused on skeleton-based action recognition to assess the proposed algorithm's performance.
arXiv Detail & Related papers (2024-02-02T11:07:27Z) - Language Knowledge-Assisted Representation Learning for Skeleton-Based
Action Recognition [71.35205097460124]
How humans understand and recognize the actions of others is a complex neuroscientific problem.
LA-GCN proposes a graph convolution network using large-scale language models (LLM) knowledge assistance.
arXiv Detail & Related papers (2023-05-21T08:29:16Z) - Neural Architecture Search Using Genetic Algorithm for Facial Expression
Recognition [2.7504274245107303]
We propose a genetic algorithm that uses an ingenious encoding-decoding mechanism that allows to automatically evolve CNNs on FER tasks.
The proposed algorithm achieves the best-known results on the CK+ and FERG datasets as well as competitive results on the JAFFE dataset.
arXiv Detail & Related papers (2023-04-12T16:36:07Z) - Pose-Guided Graph Convolutional Networks for Skeleton-Based Action
Recognition [32.07659338674024]
Graph convolutional networks (GCNs) can model the human body skeletons as spatial and temporal graphs.
In this work, we propose pose-guided GCN (PG-GCN), a multi-modal framework for high-performance human action recognition.
The core idea of this module is to utilize a trainable graph to aggregate features from the skeleton stream with that of the pose stream, which leads to a network with more robust feature representation ability.
arXiv Detail & Related papers (2022-10-10T02:08:49Z) - Joint-bone Fusion Graph Convolutional Network for Semi-supervised
Skeleton Action Recognition [65.78703941973183]
We propose a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder.
Specifically, the CD-JBF-GC can explore the motion transmission between the joint stream and the bone stream.
The pose prediction based auto-encoder in the self-supervised training stage allows the network to learn motion representation from unlabeled data.
arXiv Detail & Related papers (2022-02-08T16:03:15Z) - Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based
Action Recognition [49.163326827954656]
We propose a novel multi-granular-temporal graph network for skeleton-based action classification.
We develop a dual-head graph network consisting of two inter-leaved branches, which enables us to extract at least two-temporal resolutions.
We conduct extensive experiments on three large-scale datasets.
arXiv Detail & Related papers (2021-08-10T09:25:07Z) - UNIK: A Unified Framework for Real-world Skeleton-based Action
Recognition [11.81043814295441]
We introduce UNIK, a novel skeleton-based action recognition method that is able to generalize across datasets.
To study the cross-domain generalizability of action recognition in real-world videos, we re-evaluate state-of-the-art approaches as well as the proposed UNIK.
Results show that the proposed UNIK, with pre-training on Posetics, generalizes well and outperforms state-of-the-art when transferred onto four target action classification datasets.
arXiv Detail & Related papers (2021-07-19T02:00:28Z) - Joint Learning of Neural Transfer and Architecture Adaptation for Image
Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored for each domain task along with weight finetuning benefits in both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z) - Progressive Spatio-Temporal Graph Convolutional Network for
Skeleton-Based Human Action Recognition [97.14064057840089]
We propose a method to automatically find a compact and problem-specific network for graph convolutional networks in a progressive manner.
Experimental results on two datasets for skeleton-based human action recognition indicate that the proposed method has competitive or even better classification performance.
arXiv Detail & Related papers (2020-11-11T09:57:49Z) - Temporal Attention-Augmented Graph Convolutional Network for Efficient
Skeleton-Based Human Action Recognition [97.14064057840089]
Graphal networks (GCNs) have been very successful in modeling non-Euclidean data structures.
Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action.
We propose a temporal attention module (TAM) for increasing the efficiency in skeleton-based action recognition.
arXiv Detail & Related papers (2020-10-23T08:01:55Z) - Unifying Graph Embedding Features with Graph Convolutional Networks for
Skeleton-based Action Recognition [18.001693718043292]
We propose a novel framework, which unifies 15 graph embedding features into the graph convolutional network for human action recognition.
Our model is validated by three large-scale datasets, namely NTU-RGB+D, Kinetics and SYSU-3D.
arXiv Detail & Related papers (2020-03-06T02:31:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.