Visual Hand Gesture Recognition with Deep Learning: A Comprehensive Review of Methods, Datasets, Challenges and Future Research Directions
- URL: http://arxiv.org/abs/2507.04465v1
- Date: Sun, 06 Jul 2025 17:03:01 GMT
- Title: Visual Hand Gesture Recognition with Deep Learning: A Comprehensive Review of Methods, Datasets, Challenges and Future Research Directions
- Authors: Konstantinos Foteinos, Jorgen Cani, Manousos Linardakis, Panagiotis Radoglou-Grammatikis, Vasileios Argyriou, Panagiotis Sarigiannidis, Iraklis Varlamis, Georgios Th. Papadopoulos
- Abstract summary: Vision-based hand gesture recognition (VHGR) delivers a wide range of applications, such as sign language understanding and human-computer interaction using cameras. Despite the large volume of research works in the field, a structured and complete survey on VHGR is still missing. This review aims to constitute a useful guideline for researchers, helping them to choose the right strategy for delving into a certain VHGR task.
- Score: 5.983872847786255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid evolution of deep learning (DL) models and the ever-increasing size of available datasets have raised the interest of the research community in the always important field of vision-based hand gesture recognition (VHGR), and delivered a wide range of applications, such as sign language understanding and human-computer interaction using cameras. Despite the large volume of research works in the field, a structured and complete survey on VHGR is still missing, leaving researchers to navigate through hundreds of papers in order to find the right combination of data, model, and approach for each task. The current survey aims to fill this gap by presenting a comprehensive overview of this aspect of computer vision. With a systematic research methodology that identifies the state-of-the-art works and a structured presentation of the various methods, datasets, and evaluation metrics, this review aims to constitute a useful guideline for researchers, helping them to choose the right strategy for delving into a certain VHGR task. Starting with the methodology used for study selection, literature retrieval, and the analytical framing, the survey identifies and organizes key VHGR approaches using a taxonomy-based format in various dimensions such as input modality and application domain. The core of the survey provides an in-depth analysis of state-of-the-art techniques across three primary VHGR tasks: static gesture recognition, isolated dynamic gestures and continuous gesture recognition. For each task, the architectural trends and learning strategies are listed. Additionally, the study reviews commonly used datasets - emphasizing on annotation schemes - and evaluates standard performance metrics. It concludes by identifying major challenges in VHGR, including both general computer vision issues and domain-specific obstacles, and outlines promising directions for future research.
Related papers
- Object Recognition Datasets and Challenges: A Review [5.638005500131518]
We provide a detailed analysis of datasets in the highly investigated object recognition areas. We present an overview of the prominent object recognition benchmarks and competitions. All introduced datasets and challenges can be found online at.com/AbtinDjavadifar/ORDC.
arXiv Detail & Related papers (2025-07-30T03:56:37Z)
- Personalized Generation In Large Model Era: A Survey [90.7579254803302]
In the era of large models, content generation is gradually shifting to Personalized Generation (PGen). This paper presents the first comprehensive survey on PGen, investigating existing research in this rapidly growing field. By bridging PGen research across multiple modalities, this survey serves as a valuable resource for fostering knowledge sharing and interdisciplinary collaboration.
arXiv Detail & Related papers (2025-03-04T13:34:19Z)
- G-OSR: A Comprehensive Benchmark for Graph Open-Set Recognition [54.45837774534411]
We introduce G-OSR, a benchmark for evaluating Graph Open-Set Recognition (GOSR) methods at both the node and graph levels. Results offer critical insights into the generalizability and limitations of current GOSR methods.
arXiv Detail & Related papers (2025-03-01T13:02:47Z)
- Survey on Hand Gesture Recognition from Visual Input [2.1591725778863555]
Hand gesture recognition has become an important research area, driven by the growing demand for human-computer interaction. There are few surveys that comprehensively cover recent research developments, available solutions, and benchmark datasets. This survey addresses this gap by examining the latest advancements in hand gesture and 3D hand pose recognition from various types of camera input data.
arXiv Detail & Related papers (2025-01-21T09:23:22Z)
- A Methodological and Structural Review of Hand Gesture Recognition Across Diverse Data Modalities [1.6144710323800757]
Hand Gesture Recognition (HGR) systems enhance natural, efficient, and authentic human-computer interaction.
Despite significant progress, automatic and precise identification of hand gestures remains a considerable challenge in computer vision.
This paper provides a comprehensive review of HGR techniques and data modalities from 2014 to 2024, exploring advancements in sensor technology and computer vision.
arXiv Detail & Related papers (2024-08-10T04:40:01Z)
- From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models [98.41645229835493]
Data visualization in the form of charts plays a pivotal role in data analysis, offering critical insights and aiding in informed decision-making. Large foundation models, such as large language models, have revolutionized various natural language processing tasks. This survey paper serves as a comprehensive resource for researchers and practitioners in the fields of natural language processing, computer vision, and data analysis.
arXiv Detail & Related papers (2024-03-18T17:57:09Z)
- A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future [6.4105103117533755]
A taxonomy is first developed to organize different tasks and methodologies.
The proposed taxonomy is universal across different tasks, covering object detection, semantic/instance/panoptic segmentation, 3D and video understanding.
arXiv Detail & Related papers (2023-07-18T12:52:49Z)
- Scene Graph Generation: A Comprehensive Survey [35.80909746226258]
Scene graph has been the focus of research because of its powerful semantic representation and applications to scene understanding.
Scene Graph Generation (SGG) refers to the task of automatically mapping an image into a semantic structural scene graph.
We review 138 representative works that cover different input modalities, and systematically summarize existing methods of image-based SGG.
arXiv Detail & Related papers (2022-01-03T00:55:33Z)
- Fine-Grained Image Analysis with Deep Learning: A Survey [146.22351342315233]
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition.
This paper attempts to re-define and broaden the field of FGIA by consolidating two fundamental fine-grained research areas -- fine-grained image recognition and fine-grained image retrieval.
arXiv Detail & Related papers (2021-11-11T09:43:56Z)
- Deep Gait Recognition: A Survey [15.47582611826366]
Gait recognition is an appealing biometric modality which aims to identify individuals based on the way they walk.
Deep learning has reshaped the research landscape in this area since 2015 through the ability to automatically learn discriminative representations.
We present a comprehensive overview of breakthroughs and recent developments in gait recognition with deep learning.
arXiv Detail & Related papers (2021-02-18T18:49:28Z)
- A Survey on Heterogeneous Graph Embedding: Methods, Techniques, Applications and Sources [79.48829365560788]
Heterogeneous graphs (HGs) also known as heterogeneous information networks have become ubiquitous in real-world scenarios.
HG embedding aims to learn representations in a lower-dimension space while preserving the heterogeneous structures and semantics for downstream tasks.
arXiv Detail & Related papers (2020-11-30T15:03:47Z)
- Recent Progress in Appearance-based Action Recognition [73.6405863243707]
Action recognition is a task to identify various human actions in a video.
Recent appearance-based methods have achieved promising progress towards accurate action recognition.
arXiv Detail & Related papers (2020-11-25T10:18:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.