A Non-Anatomical Graph Structure for isolated hand gesture separation in
continuous gesture sequences
- URL: http://arxiv.org/abs/2207.07619v1
- Date: Fri, 15 Jul 2022 17:28:52 GMT
- Title: A Non-Anatomical Graph Structure for isolated hand gesture separation in
continuous gesture sequences
- Authors: Razieh Rastgoo, Kourosh Kiani, and Sergio Escalera
- Abstract summary: We propose a GCN model and combine it with stacked Bi-LSTM and Attention modules to propagate the temporal information in the video stream.
Considering the breakthroughs of GCN models for the skeleton modality, we propose a two-layer GCN model to enrich the 3D hand skeleton features.
- Score: 42.20687552354674
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Continuous Hand Gesture Recognition (CHGR) has been extensively studied by
researchers in the last few decades. Recently, a model was presented to deal
with the challenge of detecting the boundaries of isolated gestures in a
continuous gesture video [17]. To enhance the model performance and also
replace the handcrafted feature extractor in the model presented in [17], we
propose a GCN model and combine it with stacked Bi-LSTM and Attention
modules to propagate the temporal information in the video stream. Considering the
breakthroughs of GCN models for the skeleton modality, we propose a two-layer GCN
model to enrich the 3D hand skeleton features. Finally, the class
probabilities of each isolated gesture are fed to the post-processing module
borrowed from [17]. Furthermore, we replace the anatomical graph structure with
some non-anatomical graph structures. Due to the lack of a large dataset
that includes both continuous gesture sequences and the corresponding isolated
gestures, three public datasets, Dynamic Hand Gesture Recognition (DHGR),
RKS-PERSIANSIGN, and ASLVID, are used for evaluation. Experimental results show
the superiority of the proposed model in detecting isolated gesture
boundaries in continuous gesture sequences.
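The pipeline described in the abstract (a two-layer GCN over 3D hand skeleton joints per frame, followed by stacked Bi-LSTM and attention modules that produce class probabilities) can be sketched roughly as follows. This is a minimal PyTorch sketch under several assumptions: 21 hand joints, a simple chain graph standing in for one of the non-anatomical graph structures, and illustrative layer sizes; the class names (`TwoLayerGCN`, `GestureBoundaryModel`) and all hyperparameters are hypothetical, not taken from the paper.

```python
import torch
import torch.nn as nn

class TwoLayerGCN(nn.Module):
    """Two graph-convolution layers over a fixed joint adjacency matrix."""
    def __init__(self, adj, in_dim=3, hid_dim=64, out_dim=128):
        super().__init__()
        # Normalized adjacency with self-loops: D^{-1/2} (A + I) D^{-1/2}
        a = adj + torch.eye(adj.size(0))
        d = a.sum(dim=1).pow(-0.5)
        self.register_buffer("a_norm", d.unsqueeze(1) * a * d.unsqueeze(0))
        self.fc1 = nn.Linear(in_dim, hid_dim)
        self.fc2 = nn.Linear(hid_dim, out_dim)

    def forward(self, x):                       # x: (batch, joints, in_dim)
        x = torch.relu(self.a_norm @ self.fc1(x))
        return torch.relu(self.a_norm @ self.fc2(x))

class GestureBoundaryModel(nn.Module):
    """Per-frame GCN -> stacked Bi-LSTM -> attention pooling -> class scores."""
    def __init__(self, adj, num_classes, joints=21, feat=128):
        super().__init__()
        self.gcn = TwoLayerGCN(adj, out_dim=feat)
        self.bilstm = nn.LSTM(joints * feat, 128, num_layers=2,
                              bidirectional=True, batch_first=True)
        self.attn = nn.Linear(256, 1)           # additive attention over time
        self.head = nn.Linear(256, num_classes)

    def forward(self, x):                       # x: (batch, time, joints, 3)
        b, t, j, c = x.shape
        feats = self.gcn(x.reshape(b * t, j, c)).reshape(b, t, -1)
        h, _ = self.bilstm(feats)               # (b, t, 256)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over frames
        return self.head((w * h).sum(dim=1))    # (b, num_classes)

# A chain over 21 joints as a stand-in non-anatomical graph structure.
adj = torch.diag(torch.ones(20), 1) + torch.diag(torch.ones(20), -1)
model = GestureBoundaryModel(adj, num_classes=10)
logits = model(torch.randn(2, 30, 21, 3))      # 2 clips, 30 frames, 21 joints
print(logits.shape)                            # torch.Size([2, 10])
```

In the paper, the resulting per-gesture class probabilities would then go to the post-processing module borrowed from [17] to resolve gesture boundaries; that step is dataset-specific and is omitted here.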
Related papers
- Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection [7.127829790714167]
Skeleton-based video anomaly detection (SVAD) is a crucial task in computer vision.
This paper introduces a novel, practical and lightweight framework, namely the Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection (GiCiSAD).
Experiments on four widely used skeleton-based video datasets show that GiCiSAD outperforms existing methods with significantly fewer training parameters.
arXiv Detail & Related papers (2024-03-18T18:42:32Z)
- 3D Hand Reconstruction via Aggregating Intra and Inter Graphs Guided by Prior Knowledge for Hand-Object Interaction Scenario [8.364378460776832]
We propose a 3D hand reconstruction network combining the benefits of model-based and model-free approaches to balance accuracy and physical plausibility for hand-object interaction scenario.
Firstly, we present a novel module that regresses MANO pose parameters directly from 2D joints, which avoids the highly nonlinear mapping from abstract image features.
arXiv Detail & Related papers (2024-03-04T05:11:26Z)
- Unsupervised 3D Pose Estimation with Non-Rigid Structure-from-Motion Modeling [83.76377808476039]
We propose a new modeling method for human pose deformations and design an accompanying diffusion-based motion prior.
Inspired by the field of non-rigid structure-from-motion, we divide the task of reconstructing 3D human skeletons in motion into the estimation of a 3D reference skeleton.
A mixed spatial-temporal NRSfMformer is used to simultaneously estimate the 3D reference skeleton and the per-frame skeleton deformation from the 2D observation sequence.
arXiv Detail & Related papers (2023-08-18T16:41:57Z)
- Diverse 3D Hand Gesture Prediction from Body Dynamics by Bilateral Hand Disentanglement [42.98335775548796]
We introduce a novel bilateral hand disentanglement based two-stage 3D hand generation method.
In the first stage, we intend to generate natural hand gestures by two hand-disentanglement branches.
The second stage is built upon the insight that 3D hand predictions should be non-deterministic.
arXiv Detail & Related papers (2023-03-03T08:08:04Z)
- DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition [77.87404524458809]
We propose a new framework for skeleton-based action recognition, namely Dynamic Group Spatio-Temporal GCN (DG-STGCN)
It consists of two modules, DG-GCN and DG-TCN, respectively, for spatial and temporal modeling.
DG-STGCN consistently outperforms state-of-the-art methods, often by a notable margin.
arXiv Detail & Related papers (2022-10-12T03:17:37Z)
- Pose-Guided Graph Convolutional Networks for Skeleton-Based Action Recognition [32.07659338674024]
Graph convolutional networks (GCNs) can model the human body skeletons as spatial and temporal graphs.
In this work, we propose pose-guided GCN (PG-GCN), a multi-modal framework for high-performance human action recognition.
The core idea of this module is to utilize a trainable graph to aggregate features from the skeleton stream with those of the pose stream, which leads to a network with more robust feature representation ability.
arXiv Detail & Related papers (2022-10-10T02:08:49Z)
- Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition [65.78703941973183]
We propose a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder.
Specifically, the CD-JBF-GCN can explore the motion transmission between the joint stream and the bone stream.
The pose prediction based auto-encoder in the self-supervised training stage allows the network to learn motion representation from unlabeled data.
arXiv Detail & Related papers (2022-02-08T16:03:15Z)
- HAN: An Efficient Hierarchical Self-Attention Network for Skeleton-Based Gesture Recognition [73.64451471862613]
We propose an efficient hierarchical self-attention network (HAN) for skeleton-based gesture recognition.
The joint self-attention module is used to capture spatial features of the fingers, while the finger self-attention module is designed to aggregate features of the whole hand.
Experiments show that our method achieves competitive results on three gesture recognition datasets with much lower computational complexity.
arXiv Detail & Related papers (2021-06-25T02:15:53Z)
- Sequential convolutional network for behavioral pattern extraction in gait recognition [0.7874708385247353]
We propose a sequential convolutional network (SCN) to learn the walking pattern of individuals.
In SCN, behavioral information extractors (BIE) are constructed to comprehend intermediate feature maps in time series.
A multi-frame aggregator in SCN performs feature integration over a sequence of uncertain length, via a mobile 3D convolutional layer.
arXiv Detail & Related papers (2021-04-23T08:44:10Z)
- Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition [79.33539539956186]
We propose a simple method to disentangle multi-scale graph convolutions and a unified spatial-temporal graph convolutional operator named G3D.
By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model outperforms previous state-of-the-art methods on three large-scale datasets.
arXiv Detail & Related papers (2020-03-31T11:28:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.