CNN+RNN Depth and Skeleton based Dynamic Hand Gesture Recognition
- URL: http://arxiv.org/abs/2007.11983v1
- Date: Wed, 22 Jul 2020 10:25:19 GMT
- Title: CNN+RNN Depth and Skeleton based Dynamic Hand Gesture Recognition
- Authors: Kenneth Lai and Svetlana N. Yanushkevich
- Abstract summary: We propose to combine the power of two deep learning techniques, the convolutional neural networks (CNN) and the recurrent neural networks (RNN).
An overall accuracy of 85.46% is achieved on the Dynamic Hand Gesture-14/28 dataset.
- Score: 3.0255457622022486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human activity and gesture recognition is an important component of the rapidly
growing domain of ambient intelligence, in particular in assisted living and
smart homes. In this paper, we propose to combine the power of two deep
learning techniques, the convolutional neural networks (CNN) and the recurrent
neural networks (RNN), for automated hand gesture recognition using both depth
and skeleton data. Each of these types of data can be used separately to train
neural networks to recognize hand gestures. While RNNs were previously reported
to perform well in recognizing sequences of movement for each skeleton joint
given the skeleton information only, this study aims at utilizing depth data
and applying CNNs to extract important spatial information from the depth images.
Together, the tandem CNN+RNN is capable of recognizing a sequence of gestures
more accurately. In addition, various types of fusion are studied to combine both
the skeleton and depth information in order to extract temporal-spatial
information. An overall accuracy of 85.46% is achieved on the Dynamic Hand
Gesture-14/28 dataset.
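As a rough illustration of the fusion idea in the abstract, the sketch below combines the class scores of a depth stream and a skeleton stream by weighted averaging of their softmax probabilities (score-level, or "late", fusion). The abstract does not say which fusion schemes were studied, so this scheme, the function names (`softmax`, `late_fusion`, `predict`), the `depth_weight` parameter, and the toy logits are all assumptions made for illustration, not the authors' method.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(depth_logits, skeleton_logits, depth_weight=0.5):
    """Weighted average of the two streams' class probabilities.

    depth_weight is a hypothetical mixing parameter in [0, 1];
    the skeleton stream receives the remaining weight.
    """
    if len(depth_logits) != len(skeleton_logits):
        raise ValueError("both streams must score the same classes")
    if not 0.0 <= depth_weight <= 1.0:
        raise ValueError("depth_weight must lie in [0, 1]")
    p_depth = softmax(depth_logits)
    p_skeleton = softmax(skeleton_logits)
    return [
        depth_weight * d + (1.0 - depth_weight) * s
        for d, s in zip(p_depth, p_skeleton)
    ]


def predict(probabilities):
    """Return the index of the most probable class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Made-up logits for a 3-class toy problem (not data from the paper).
depth_logits = [2.0, 0.5, 0.1]      # e.g. output of a CNN on depth images
skeleton_logits = [0.2, 0.3, 3.0]   # e.g. output of an RNN on joint sequences

fused = late_fusion(depth_logits, skeleton_logits, depth_weight=0.5)
label = predict(fused)
```

With equal weights the more confident skeleton stream decides the toy example; setting `depth_weight=1.0` reduces the result to the depth stream alone. Fusing intermediate features instead of final scores would need a different design.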
Related papers
- Spiking representation learning for associative memories [0.0]
We introduce a novel artificial spiking neural network (SNN) that performs unsupervised representation learning and associative memory operations.
The architecture of our model derives from the neocortical columnar organization and combines feedforward projections for learning hidden representations and recurrent projections for forming associative memories.
arXiv Detail & Related papers (2024-06-05T08:30:11Z)
- CNN2GNN: How to Bridge CNN with GNN [59.42117676779735]
We propose a novel CNN2GNN framework to unify CNN and GNN together via distillation.
The performance of the distilled "boosted" two-layer GNN on Mini-ImageNet is much higher than that of CNNs containing dozens of layers, such as ResNet152.
arXiv Detail & Related papers (2024-04-23T08:19:08Z)
- Dynamic Gesture Recognition [0.0]
It is possible to use machine learning to classify images and/or videos instead of the traditional computer vision algorithms.
The aim of this project is to build a symbiosis between a convolutional neural network (CNN) and a recurrent neural network (RNN).
arXiv Detail & Related papers (2021-09-20T09:45:29Z)
- HAN: An Efficient Hierarchical Self-Attention Network for Skeleton-Based Gesture Recognition [73.64451471862613]
We propose an efficient hierarchical self-attention network (HAN) for skeleton-based gesture recognition.
A joint self-attention module is used to capture spatial features of the fingers, and a finger self-attention module is designed to aggregate features of the whole hand.
Experiments show that our method achieves competitive results on three gesture recognition datasets with much lower computational complexity.
arXiv Detail & Related papers (2021-06-25T02:15:53Z)
- A Study On the Effects of Pre-processing On Spatio-temporal Action Recognition Using Spiking Neural Networks Trained with STDP [0.0]
It is important to study the behavior of SNNs trained with unsupervised learning methods on video classification tasks.
This paper presents methods of transposing temporal information into a static format, and then transforming the visual information into spikes using latency coding.
We show the effect of the similarity in the shape and speed of certain actions on action recognition with spiking neural networks.
arXiv Detail & Related papers (2021-05-31T07:07:48Z)
- Knowledge Distillation By Sparse Representation Matching [107.87219371697063]
We propose Sparse Representation Matching (SRM) to transfer intermediate knowledge from one Convolutional Network (CNN) to another by utilizing sparse representation.
We formulate SRM as a neural processing block, which can be efficiently optimized using gradient descent and integrated into any CNN in a plug-and-play manner.
Our experiments demonstrate that SRM is robust to architectural differences between the teacher and student networks, and outperforms other KD techniques across several datasets.
arXiv Detail & Related papers (2021-03-31T11:47:47Z)
- A Two-stream Neural Network for Pose-based Hand Gesture Recognition [23.50938160992517]
Pose-based hand gesture recognition has been widely studied in recent years.
This paper proposes a two-stream neural network with one stream being a self-attention based graph convolutional network (SAGCN).
The residual-connection enhanced Bi-IndRNN extends an IndRNN with the capability of bidirectional processing for temporal modelling.
arXiv Detail & Related papers (2021-01-22T03:22:26Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Human Activity Recognition using Multi-Head CNN followed by LSTM [1.8830374973687412]
This study presents a novel method to recognize human physical activities using CNN followed by LSTM.
By using the proposed method, we achieve state-of-the-art accuracy, which is comparable to traditional machine learning algorithms and other deep neural network algorithms.
arXiv Detail & Related papers (2020-02-21T14:29:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.