Novel Human Machine Interface via Robust Hand Gesture Recognition System using Channel Pruned YOLOv5s Model
- URL: http://arxiv.org/abs/2407.02585v1
- Date: Tue, 2 Jul 2024 18:10:20 GMT
- Title: Novel Human Machine Interface via Robust Hand Gesture Recognition System using Channel Pruned YOLOv5s Model
- Authors: Abir Sen, Tapas Kumar Mishra, Ratnakar Dash
- Abstract summary: This paper develops an efficient hand gesture detection and classification model using a channel-pruned YOLOv5s model.
Our proposed method paves the way for deploying a pruned YOLOv5s model for a real-time gesture-command-based HCI.
The average detection speed of our proposed system exceeds 60 frames per second (fps) in real time.
- Score: 4.0194015554916644
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hand gesture recognition (HGR) is a vital component in enhancing the human-computer interaction experience, particularly in multimedia applications such as virtual reality, gaming, smart home automation systems, etc. Users can control and navigate through these applications seamlessly when gestures are detected and recognized accurately. However, in real-time scenarios, the performance of a gesture recognition system can degrade in the presence of complex backgrounds, low-light illumination, occlusion, etc. Another challenge is building a fast and robust gesture-controlled human-computer interface (HCI) that operates in real time. The overall objective of this paper is to develop an efficient hand gesture detection and classification model using a channel-pruned YOLOv5-small model, and to use that model to build a gesture-controlled HCI with a quick response time (in ms) and a high detection speed (in fps). First, the YOLOv5s model is chosen for the gesture detection task. Next, the model is simplified using a channel-pruning algorithm. After that, the pruned model is fine-tuned to recover detection performance. We have compared our suggested scheme with other state-of-the-art works, and our model shows superior results in terms of mAP (mean average precision), precision (\%), recall (\%), F1-score (\%), inference time (in ms), and detection speed (in fps). Our proposed method paves the way for deploying a pruned YOLOv5s model in a real-time gesture-command-based HCI to control applications such as the VLC media player, Spotify player, etc., using correctly classified gesture commands in real-time scenarios. The average detection speed of our proposed system exceeds 60 frames per second (fps) in real time, which satisfies the requirements of real-time application control.
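The channel-pruning step can be illustrated with a minimal sketch of the common batch-norm-scale selection criterion used in network-slimming-style pruning. The paper does not publish its exact code, so the layer names, gamma values, and global-threshold policy below are illustrative assumptions, not the authors' implementation:

```python
# Sketch of magnitude-based channel pruning: rank every layer's channels by
# the absolute value of its BatchNorm scale factor (gamma) and keep only the
# channels above one global percentile threshold. All values are illustrative.

def select_channels(bn_gammas, prune_ratio):
    """Return, per layer, the indices of channels to keep.

    bn_gammas   -- {layer_name: [gamma for each output channel]}
    prune_ratio -- fraction of channels (globally) to remove, e.g. 0.5
    """
    # Pool all |gamma| values across layers to pick one global cutoff.
    all_scores = sorted(abs(g) for gammas in bn_gammas.values() for g in gammas)
    cutoff = all_scores[int(prune_ratio * len(all_scores))]

    kept = {}
    for layer, gammas in bn_gammas.items():
        keep = [i for i, g in enumerate(gammas) if abs(g) >= cutoff]
        # Never prune a layer away entirely: keep its strongest channel.
        if not keep:
            keep = [max(range(len(gammas)), key=lambda i: abs(gammas[i]))]
        kept[layer] = keep
    return kept

# Example: two hypothetical conv layers with per-channel BN scales.
gammas = {
    "backbone.conv1": [0.90, 0.02, 0.45, 0.01],
    "head.conv2":     [0.03, 0.60, 0.05, 0.70],
}
print(select_channels(gammas, prune_ratio=0.5))
# -> {'backbone.conv1': [0, 2], 'head.conv2': [1, 3]}
```

After channel selection, the surviving filters are copied into a smaller network and fine-tuned, which is what lets the pruned YOLOv5s trade a small accuracy drop for the reported gain in inference speed.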
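The gesture-command control loop can likewise be sketched as a simple dispatch from a classified label to an application action. The gesture names, actions, and confidence threshold here are hypothetical placeholders; the paper maps its own gesture classes to VLC and Spotify controls:

```python
# Illustrative mapping from a classified gesture label to a player command.
# Gesture names and actions are assumptions for the sketch, not the paper's.

GESTURE_ACTIONS = {
    "palm_open":   "play",
    "fist":        "pause",
    "swipe_left":  "previous_track",
    "swipe_right": "next_track",
}

def dispatch(label, confidence, threshold=0.8):
    """Fire a command only for confident detections; otherwise do nothing."""
    if confidence < threshold:
        return None  # drop uncertain detections to avoid spurious commands
    return GESTURE_ACTIONS.get(label)  # None for unrecognized gestures

print(dispatch("fist", 0.93))       # confident detection -> "pause"
print(dispatch("palm_open", 0.40))  # below threshold -> None
```

In a real HCI the returned action would be forwarded to the target application (e.g. via a media-player API or simulated keypress); gating on detector confidence is one simple way to keep a >60 fps detection stream from emitting noisy commands.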
Related papers
- Graspness Discovery in Clutters for Fast and Accurate Grasp Detection [57.81325062171676]
"graspness" is a quality based on geometry cues that distinguishes graspable areas in cluttered scenes.
We develop a neural network named cascaded graspness model to approximate the searching process.
Experiments on a large-scale benchmark, GraspNet-1Billion, show that our method outperforms previous arts by a large margin.
arXiv Detail & Related papers (2024-06-17T02:06:47Z) - Agile gesture recognition for capacitive sensing devices: adapting
on-the-job [55.40855017016652]
We demonstrate a hand gesture recognition system that uses signals from capacitive sensors embedded into the etee hand controller.
The controller generates real-time signals from each of the wearer's five fingers.
We use a machine learning technique to analyse the time series signals and identify three features that can represent 5 fingers within 500 ms.
arXiv Detail & Related papers (2023-05-12T17:24:02Z) - Hand gesture recognition using 802.11ad mmWave sensor in the mobile
device [2.5476515662939563]
We explore the feasibility of AI-assisted hand-gesture recognition using 802.11ad 60GHz (mmWave) technology in smartphones.
We built a prototype system where radar sensing and the communication waveform can coexist via time-division duplexing (TDD).
It can gather sensing data and predict gestures within 100 milliseconds.
arXiv Detail & Related papers (2022-11-14T03:36:17Z) - StreamYOLO: Real-time Object Detection for Streaming Perception [84.2559631820007]
We endow the models with the capacity of predicting the future, significantly improving the results for streaming perception.
We consider driving scenes with multiple velocities and propose Velocity-awared streaming AP (VsAP) to jointly evaluate accuracy.
Our simple method achieves the state-of-the-art performance on Argoverse-HD dataset and improves the sAP and VsAP by 4.7% and 8.2% respectively.
arXiv Detail & Related papers (2022-07-21T12:03:02Z) - Design of Human Machine Interface through vision-based low-cost Hand
Gesture Recognition system based on deep CNN [3.5665681694253903]
A human-computer interface (HCI) based on a real-time hand gesture recognition system is presented.
The system consists of six stages, including (1) hand detection, (2) gesture segmentation, (3) use of six pre-trained CNN models with transfer learning, (4) building an interactive human-machine interface, and (5) development of a gesture-controlled virtual mouse.
arXiv Detail & Related papers (2022-07-07T06:50:08Z) - Cross-modal Learning of Graph Representations using Radar Point Cloud
for Long-Range Gesture Recognition [6.9545038359818445]
We propose a novel architecture for a long-range (1m - 2m) gesture recognition solution.
We use a point cloud-based cross-learning approach from camera point cloud to 60-GHz FMCW radar point cloud.
In the experimental results section, we demonstrate our model's overall accuracy of 98.4% for five gestures and its generalization capability.
arXiv Detail & Related papers (2022-03-31T14:34:36Z) - Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z) - Slow-Fast Visual Tempo Learning for Video-based Action Recognition [78.3820439082979]
Action visual tempo characterizes the dynamics and the temporal scale of an action.
Previous methods capture the visual tempo either by sampling raw videos with multiple rates, or by hierarchically sampling backbone features.
We propose a Temporal Correlation Module (TCM) to extract action visual tempo from low-level backbone features at a single layer.
arXiv Detail & Related papers (2022-02-24T14:20:04Z) - Towards Domain-Independent and Real-Time Gesture Recognition Using
mmWave Signal [11.76969975145963]
DI-Gesture is a domain-independent and real-time mmWave gesture recognition system.
In real-time scenarios, the accuracy of DI-Gesture reaches over 97% with an average inference time of 2.87 ms.
arXiv Detail & Related papers (2021-11-11T13:28:28Z) - A Real-time Action Representation with Temporal Encoding and Deep
Compression [115.3739774920845]
We propose a new real-time convolutional architecture, called Temporal Convolutional 3D Network (T-C3D), for action representation.
T-C3D learns video action representations in a hierarchical multi-granularity manner while obtaining a high process speed.
Our method achieves clear improvements over state-of-the-art real-time methods on the UCF101 action recognition benchmark: 5.4% higher accuracy and 2x faster inference, with a model requiring less than 5 MB of storage.
arXiv Detail & Related papers (2020-06-17T06:30:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.