Related papers: A Novel Hand Gesture Detection and Recognition system based on ensemble-based Convolutional Neural Network

A Novel Hand Gesture Detection and Recognition system based on ensemble-based Convolutional Neural Network

URL: http://arxiv.org/abs/2202.12519v1
Date: Fri, 25 Feb 2022 06:46:58 GMT
Title: A Novel Hand Gesture Detection and Recognition system based on ensemble-based Convolutional Neural Network
Authors: Abir Sen, Tapas Kumar Mishra, Ratnakar Dash
Abstract summary: Detection of hand portion has become a challenging task in computer vision and pattern recognition communities. Deep learning algorithm like convolutional neural network (CNN) architecture has become a very popular choice for classification tasks. In this paper, an ensemble of CNN-based approaches is presented to overcome some problems like high variance during prediction, overfitting problem and also prediction errors.
Score: 3.5665681694253903
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Nowadays, hand gesture recognition has become an alternative for human-machine interaction. It has covered a large area of applications like 3D game technology, sign language interpreting, VR (virtual reality) environment, and robotics. But detection of the hand portion has become a challenging task in computer vision and pattern recognition communities. Deep learning algorithm like convolutional neural network (CNN) architecture has become a very popular choice for classification tasks, but CNN architectures suffer from some problems like high variance during prediction, overfitting problem and also prediction errors. To overcome these problems, an ensemble of CNN-based approaches is presented in this paper. Firstly, the gesture portion is detected by using the background separation method based on binary thresholding. After that, the contour portion is extracted, and the hand region is segmented. Then, the images have been resized and fed into three individual CNN models to train them in parallel. In the last part, the output scores of CNN models are averaged to construct an optimal ensemble model for the final prediction. Two publicly available datasets (labeled as Dataset-1 and Dataset-2) containing infrared images and one self-constructed dataset have been used to validate the proposed system. Experimental results are compared with the existing state-of-the-art approaches, and it is observed that our proposed ensemble model outperforms other existing proposed methods.

Related papers

An evaluation of CNN models and data augmentation techniques in hierarchical localization of mobile robots [0.0]
This work presents an evaluation of CNN models and data augmentation to carry out the hierarchical localization of a mobile robot. In this sense, an ablation study of different state-of-the-art CNN models used as backbone is presented. A variety of data augmentation visual effects are proposed for addressing the visual localization of the robot.
arXiv Detail & Related papers (2024-07-15T10:20:00Z)
A Domain Decomposition-Based CNN-DNN Architecture for Model Parallel Training Applied to Image Recognition Problems [0.0]
A novel CNN-DNN architecture is proposed that naturally supports a model parallel training strategy. The proposed approach can significantly accelerate the required training time compared to the global model. Results show that the proposed approach can also help to improve the accuracy of the underlying classification problem.
arXiv Detail & Related papers (2023-02-13T18:06:59Z)
Decoupled Mixup for Generalized Visual Recognition [71.13734761715472]
We propose a novel "Decoupled-Mixup" method to train CNN models for visual recognition. Our method decouples each image into discriminative and noise-prone regions, and then heterogeneously combines these regions to train CNN models. Experiment results show the high generalization performance of our method on testing data that are composed of unseen contexts.
arXiv Detail & Related papers (2022-10-26T15:21:39Z)
Lost Vibration Test Data Recovery Using Convolutional Neural Network: A Case Study [0.0]
This paper proposes a CNN algorithm for the Alamosa Canyon Bridge as a real structure. Three different CNN models were considered to predict one and two malfunctioned sensors. The accuracy of the model was increased by adding a convolutional layer.
arXiv Detail & Related papers (2022-04-11T23:24:03Z)
Keypoint Message Passing for Video-based Person Re-Identification [106.41022426556776]
Video-based person re-identification (re-ID) is an important technique in visual surveillance systems which aims to match video snippets of people captured by different cameras. Existing methods are mostly based on convolutional neural networks (CNNs), whose building blocks either process local neighbor pixels at a time, or, when 3D convolutions are used to model temporal information, suffer from the misalignment problem caused by person movement. In this paper, we propose to overcome the limitations of normal convolutions with a human-oriented graph method. Specifically, features located at person joint keypoints are extracted and connected as a spatial-temporal graph
arXiv Detail & Related papers (2021-11-16T08:01:16Z)
Multi-Branch Deep Radial Basis Function Networks for Facial Emotion Recognition [80.35852245488043]
We propose a CNN based architecture enhanced with multiple branches formed by radial basis function (RBF) units. RBF units capture local patterns shared by similar instances using an intermediate representation. We show it is the incorporation of local information what makes the proposed model competitive.
arXiv Detail & Related papers (2021-09-07T21:05:56Z)
An Uncertainty-Driven GCN Refinement Strategy for Organ Segmentation [53.425900196763756]
We propose a segmentation refinement method based on uncertainty analysis and graph convolutional networks. We employ the uncertainty levels of the convolutional network in a particular input volume to formulate a semi-supervised graph learning problem. We show that our method outperforms the state-of-the-art CRF refinement method by improving the dice score by 1% for the pancreas and 2% for spleen.
arXiv Detail & Related papers (2020-12-06T18:55:07Z)
Application of Facial Recognition using Convolutional Neural Networks for Entry Access Control [0.0]
The paper focuses on solving the supervised classification problem of taking images of people as input and classifying the person in the image as one of the authors or not. Two approaches are proposed: (1) building and training a neural network called WoodNet from scratch and (2) leveraging transfer learning by utilizing a network pre-trained on the ImageNet database. The results are two models classifying the individuals in the dataset with high accuracy, achieving over 99% accuracy on held-out test data.
arXiv Detail & Related papers (2020-11-23T07:55:24Z)
AutoSpeech: Neural Architecture Search for Speaker Recognition [108.69505815793028]
We propose the first neural architecture search approach approach for the speaker recognition tasks, named as AutoSpeech. Our algorithm first identifies the optimal operation combination in a neural cell and then derives a CNN model by stacking the neural cell for multiple times. Results demonstrate that the derived CNN architectures significantly outperform current speaker recognition systems based on VGG-M, ResNet-18, and ResNet-34 back-bones, while enjoying lower model complexity.
arXiv Detail & Related papers (2020-05-07T02:53:47Z)
Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
Inferring Convolutional Neural Networks' accuracies from their architectural characterizations [0.0]
We study the relationships between a CNN's architecture and its performance. We show that the attributes can be predictive of the networks' performance in two specific computer vision-based physics problems. We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training.
arXiv Detail & Related papers (2020-01-07T16:41:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.