DiG-Net: Enhancing Quality of Life through Hyper-Range Dynamic Gesture Recognition in Assistive Robotics
- URL: http://arxiv.org/abs/2505.24786v1
- Date: Fri, 30 May 2025 16:47:44 GMT
- Title: DiG-Net: Enhancing Quality of Life through Hyper-Range Dynamic Gesture Recognition in Assistive Robotics
- Authors: Eran Bamani Beeri, Eden Nissinman, Avishai Sintov
- Abstract summary: We introduce a novel approach designed specifically for assistive robotics, enabling dynamic gesture recognition at extended distances of up to 30 meters. Our proposed Distance-aware Gesture Network (DiG-Net) effectively combines Depth-Conditioned Deformable Alignment (DADA) blocks with Spatio-Temporal Graph modules. By effectively interpreting gestures from considerable distances, DiG-Net significantly enhances the usability of assistive robots in home healthcare, industrial safety, and remote assistance scenarios.
- Score: 2.625826951636656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dynamic hand gestures play a pivotal role in assistive human-robot interaction (HRI), facilitating intuitive, non-verbal communication, particularly for individuals with mobility constraints or those operating robots remotely. Current gesture recognition methods are mostly limited to short-range interactions, reducing their utility in scenarios demanding robust assistive communication from afar. In this paper, we introduce a novel approach designed specifically for assistive robotics, enabling dynamic gesture recognition at extended distances of up to 30 meters, thereby significantly improving accessibility and quality of life. Our proposed Distance-aware Gesture Network (DiG-Net) effectively combines Depth-Conditioned Deformable Alignment (DADA) blocks with Spatio-Temporal Graph modules, enabling robust processing and classification of gesture sequences captured under challenging conditions, including significant physical attenuation, reduced resolution, and dynamic gesture variations commonly experienced in real-world assistive environments. We further introduce the Radiometric Spatio-Temporal Depth Attenuation Loss (RSTDAL), shown to enhance learning and strengthen model robustness across varying distances. Our model demonstrates significant performance improvement over state-of-the-art gesture recognition frameworks, achieving a recognition accuracy of 97.3% on a diverse dataset with challenging hyper-range gestures. By effectively interpreting gestures from considerable distances, DiG-Net significantly enhances the usability of assistive robots in home healthcare, industrial safety, and remote assistance scenarios, enabling seamless and intuitive interactions for users regardless of physical limitations.
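The abstract does not include an implementation, but the architecture it names can be outlined. Below is a minimal PyTorch-style sketch under stated assumptions: `DADABlock` is a stand-in for Depth-Conditioned Deformable Alignment, a transformer encoder stands in for the Spatio-Temporal Graph module, and `rstdal_loss` is one plausible form of a distance-weighted loss. None of these names, shapes, or details come from the authors' code.

```python
# Hypothetical sketch of the DiG-Net pipeline described in the abstract.
# Module names, shapes, and the loss form are assumptions, not the authors' code.
import torch
import torch.nn as nn

class DADABlock(nn.Module):
    """Stand-in for a Depth-Conditioned Deformable Alignment block:
    modulates per-frame features with a depth estimate before alignment."""
    def __init__(self, dim):
        super().__init__()
        self.depth_mlp = nn.Sequential(nn.Linear(1, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.align = nn.Linear(dim, dim)  # placeholder for deformable alignment

    def forward(self, feats, depth):
        # feats: (B, T, dim), depth: (B, 1) estimated subject distance in meters
        gate = torch.sigmoid(self.depth_mlp(depth)).unsqueeze(1)  # (B, 1, dim)
        return self.align(feats * gate)

class DiGNet(nn.Module):
    def __init__(self, dim=256, num_classes=16):
        super().__init__()
        self.dada = DADABlock(dim)
        # Stand-in for the spatio-temporal graph module: temporal self-attention.
        self.st_graph = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, feats, depth):
        x = self.dada(feats, depth)
        x = self.st_graph(x)
        return self.head(x.mean(dim=1))  # pool over time, classify the gesture

def rstdal_loss(logits, labels, depth, alpha=0.05):
    """Assumed form of a distance-aware loss: cross-entropy up-weighted with
    distance so that far-away (heavily attenuated) samples are not under-fit."""
    ce = nn.functional.cross_entropy(logits, labels, reduction="none")
    return ((1.0 + alpha * depth.squeeze(1)) * ce).mean()
```

The one idea the sketch tries to capture is that both the features and the training loss are conditioned on the estimated subject distance.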
Related papers
- Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis [51.95817740348585]
Human-X is a novel framework designed to enable immersive and physically plausible human interactions across diverse entities. Our method jointly predicts actions and reactions in real-time using an auto-regressive reaction diffusion planner. Our framework is validated in real-world applications, including a virtual reality interface for human-robot interaction.
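As an illustration of the auto-regressive prediction loop this summary describes, here is a hedged sketch that rolls out one reaction step per incoming frame. A GRU is used as a simple stand-in for the paper's reaction diffusion planner; `ReactionPlanner` and its pose dimensions are invented for the example.

```python
# Hypothetical auto-regressive reaction loop: at each step the planner predicts
# the partner's next pose conditioned on the motion generated so far.
# A GRU stands in for the diffusion planner; all names here are assumptions.
import torch
import torch.nn as nn

class ReactionPlanner(nn.Module):
    def __init__(self, pose_dim=63, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(pose_dim * 2, hidden, batch_first=True)
        self.out = nn.Linear(hidden, pose_dim)

    @torch.no_grad()
    def step(self, actor_pose, reactor_pose, state=None):
        # One auto-regressive step: next reactor pose from both current poses.
        x = torch.cat([actor_pose, reactor_pose], dim=-1).unsqueeze(1)
        h, state = self.rnn(x, state)
        return self.out(h[:, -1]), state

planner = ReactionPlanner()
actor = torch.zeros(1, 63)    # observed human pose (e.g., 21 joints x 3)
reactor = torch.zeros(1, 63)  # virtual partner pose, rolled out step by step
state = None
for _ in range(30):           # real-time loop: one step per incoming frame
    reactor, state = planner.step(actor, reactor, state)
```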
arXiv Detail & Related papers (2025-08-04T06:35:48Z)
- Recognizing Actions from Robotic View for Natural Human-Robot Interaction [52.00935005918032]
Natural Human-Robot Interaction (N-HRI) requires robots to recognize human actions at varying distances and states, regardless of whether the robot itself is in motion or stationary. Existing benchmarks fail to address the unique complexities of N-HRI due to limited data, modalities, task categories, and diversity of subjects and environments. We introduce (Action from Robotic View), a large-scale dataset for the perception-centric robotic views prevalent in mobile service robots.
arXiv Detail & Related papers (2025-07-30T09:48:34Z)
- Situationally-Aware Dynamics Learning [57.698553219660376]
We propose a novel framework for online learning of hidden state representations. Our approach explicitly models the influence of unobserved parameters on both transition dynamics and reward structures. Experiments in both simulation and the real world reveal significant improvements in data efficiency, policy performance, and the emergence of safer, adaptive navigation strategies.
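A minimal sketch of the stated idea, assuming invented names and shapes: a recurrent encoder infers a hidden context vector from recent transitions observed online, and both the dynamics and reward heads are conditioned on it.

```python
# Illustrative latent-conditioned dynamics/reward model; all names and shapes
# are assumptions made for this sketch, not the paper's architecture.
import torch
import torch.nn as nn

class ContextualDynamics(nn.Module):
    def __init__(self, state_dim=8, action_dim=2, ctx_dim=4, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(state_dim + action_dim, ctx_dim, batch_first=True)
        self.dynamics = nn.Sequential(
            nn.Linear(state_dim + action_dim + ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim))
        self.reward = nn.Sequential(
            nn.Linear(state_dim + action_dim + ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, history, state, action):
        # history: (B, T, state+action), the transitions observed online
        _, ctx = self.encoder(history)           # infer unobserved parameters
        z = torch.cat([state, action, ctx[-1]], dim=-1)
        return self.dynamics(z), self.reward(z)  # next state and reward
```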
arXiv Detail & Related papers (2025-05-26T06:40:11Z)
- Taccel: Scaling Up Vision-based Tactile Robotics via High-performance GPU Simulation [50.34179054785646]
We present Taccel, a high-performance simulation platform that integrates IPC and ABD to model robots, tactile sensors, and objects with both accuracy and unprecedented speed. Taccel provides precise physics simulation and realistic tactile signals while supporting flexible robot-sensor configurations through user-friendly APIs. These capabilities position Taccel as a powerful tool for scaling up tactile robotics research and development.
arXiv Detail & Related papers (2025-04-17T12:57:11Z)
- Online hand gesture recognition using Continual Graph Transformers [1.3927943269211591]
We propose a novel online recognition system designed for real-time skeleton sequence streaming. Our approach achieves state-of-the-art accuracy and significantly reduces false positive rates, making it a compelling solution for real-time applications. The proposed system can be seamlessly integrated into various domains, including human-robot collaboration and assistive technologies.
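To make the streaming setup concrete, here is an illustrative sketch of sliding-window online recognition with a confidence gate to suppress false positives. The buffer size, stand-in model, and threshold are assumptions, not the paper's continual graph transformer.

```python
# Illustrative online skeleton-stream recognition with a sliding window.
# The model, window size, and confidence gate are assumptions for this sketch.
from collections import deque
import torch
import torch.nn as nn

WINDOW = 32  # frames kept in the streaming buffer

model = nn.Sequential(  # stand-in for the continual graph transformer
    nn.Flatten(), nn.Linear(WINDOW * 21 * 3, 128), nn.ReLU(), nn.Linear(128, 14))
buffer = deque(maxlen=WINDOW)

def on_frame(skeleton):
    """skeleton: (21, 3) joint tensor for the current frame."""
    buffer.append(skeleton)
    if len(buffer) < WINDOW:
        return None  # not enough temporal context yet
    clip = torch.stack(list(buffer)).unsqueeze(0)  # (1, WINDOW, 21, 3)
    probs = model(clip).softmax(dim=-1)
    conf, cls = probs.max(dim=-1)
    # Report a gesture only above a confidence gate to keep false positives low.
    return int(cls) if float(conf) > 0.8 else None
```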
arXiv Detail & Related papers (2025-02-20T17:27:55Z)
- Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics [50.191655141020505]
This work advances model-based reinforcement learning by addressing the challenges of long-horizon prediction, error accumulation, and sim-to-real transfer. By providing a scalable and robust framework, the introduced methods pave the way for adaptive and efficient robotic systems in real-world applications.
arXiv Detail & Related papers (2025-01-17T10:39:09Z)
- Dynamic Gesture Recognition in Ultra-Range Distance for Effective Human-Robot Interaction [2.625826951636656]
This paper presents a novel approach for ultra-range gesture recognition, addressing Human-Robot Interaction (HRI) challenges over extended distances.
By leveraging human gestures in video data, we propose the Temporal-Spatiotemporal Fusion Network (TSFN) model, which overcomes the limitations of current methods.
With applications in service robots, search and rescue operations, and drone-based interactions, our approach enhances HRI in expansive environments.
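The summary does not specify TSFN's internals; the following is a speculative two-stream sketch of a "temporal plus spatiotemporal" fusion classifier, with all layer choices invented for illustration.

```python
# Speculative two-stream fusion classifier matching the TSFN description only
# at a high level; the real architecture is not given in this summary.
import torch
import torch.nn as nn

class TSFN(nn.Module):
    def __init__(self, num_classes=12):
        super().__init__()
        self.temporal = nn.GRU(input_size=512, hidden_size=128, batch_first=True)
        self.spatiotemporal = nn.Conv3d(3, 32, kernel_size=3, padding=1)
        self.head = nn.Linear(128 + 32, num_classes)

    def forward(self, frame_feats, clip):
        # frame_feats: (B, T, 512) per-frame features; clip: (B, 3, T, H, W) video
        _, h = self.temporal(frame_feats)                  # temporal stream
        s = self.spatiotemporal(clip).mean(dim=(2, 3, 4))  # spatiotemporal stream
        return self.head(torch.cat([h[-1], s], dim=-1))    # fuse and classify
```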
arXiv Detail & Related papers (2024-07-31T06:56:46Z)
- Recognition of Dynamic Hand Gestures in Long Distance using a Web-Camera for Robot Guidance [2.625826951636656]
We propose a model for recognizing dynamic gestures from a long distance of up to 20 meters.
The model integrates the SlowFast and Transformer architectures (SFT) to effectively process and classify complex gesture sequences captured in video frames.
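A rough sketch of what a SlowFast-plus-Transformer pipeline could look like: a slow pathway over sparse frames, a fast pathway over dense frames, and a transformer over the fused tokens. All kernel sizes and dimensions here are assumptions, not the paper's SFT model.

```python
# Hypothetical SlowFast + Transformer (SFT) sketch; details are assumptions.
import torch
import torch.nn as nn

class SFT(nn.Module):
    def __init__(self, dim=128, num_classes=10):
        super().__init__()
        self.slow = nn.Conv3d(3, dim, kernel_size=(1, 7, 7), stride=(1, 4, 4))
        self.fast = nn.Conv3d(3, dim, kernel_size=(5, 7, 7), stride=(2, 4, 4),
                              padding=(2, 0, 0))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, clip):
        # clip: (B, 3, T, H, W); slow path sees every 4th frame, fast path all.
        s = self.slow(clip[:, :, ::4]).flatten(2).transpose(1, 2)
        f = self.fast(clip).flatten(2).transpose(1, 2)
        tokens = torch.cat([s, f], dim=1)  # one token per space-time cell
        return self.head(self.encoder(tokens).mean(dim=1))
```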
arXiv Detail & Related papers (2024-06-18T09:17:28Z)
- Dynamic Hand Gesture-Featured Human Motor Adaptation in Tool Delivery using Voice Recognition [5.13619372598999]
This paper introduces an innovative human-robot collaborative framework.
It seamlessly integrates hand gesture and dynamic movement recognition, voice recognition, and a switchable control adaptation strategy.
Experimental results demonstrate superior performance in hand gesture recognition.
arXiv Detail & Related papers (2023-09-20T14:51:09Z)
- Agile gesture recognition for capacitive sensing devices: adapting on-the-job [55.40855017016652]
We demonstrate a hand gesture recognition system that uses signals from capacitive sensors embedded into the etee hand controller.
The controller generates real-time signals from each of the wearer's five fingers.
We use a machine learning technique to analyse the time series signals and identify three features that can represent the five fingers within 500 ms.
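As an illustration of this kind of lightweight featurization, here is a sketch that computes three summary statistics (mean, spread, trend) per finger over a 500 ms window; the actual features used in the paper are not given in this summary, and the sampling rate is an assumption.

```python
# Illustrative per-finger featurization over a 500 ms window of capacitive
# readings. The specific features and the 100 Hz rate are assumptions.
import numpy as np

def window_features(signals, rate_hz=100, window_ms=500):
    """signals: (n_samples, 5) capacitive readings, one column per finger."""
    n = int(rate_hz * window_ms / 1000)
    w = signals[-n:]                          # most recent 500 ms of data
    t = np.arange(len(w))
    feats = []
    for finger in w.T:                        # iterate over the 5 fingers
        slope = np.polyfit(t, finger, 1)[0]   # trend over the window
        feats.extend([finger.mean(), finger.std(), slope])
    return np.array(feats)                    # 5 fingers x 3 features = (15,)

window = np.random.rand(50, 5)  # fake 500 ms of data at 100 Hz
print(window_features(window).shape)  # (15,)
```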
arXiv Detail & Related papers (2023-05-12T17:24:02Z)
- Snapture -- A Novel Neural Architecture for Combined Static and Dynamic Hand Gesture Recognition [19.320551882950706]
We propose a novel hybrid hand gesture recognition system.
Our architecture enables learning both static and dynamic gestures.
Our work contributes both to gesture recognition research and machine learning applications for non-verbal communication with robots.
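A speculative sketch of one way to combine a static "snapshot" of the hand pose with the dynamic trajectory, reflecting the hybrid static/dynamic idea; the fusion scheme and names are invented, not Snapture's architecture.

```python
# Hypothetical hybrid static+dynamic gesture classifier; all details assumed.
import torch
import torch.nn as nn

class HybridGestureNet(nn.Module):
    def __init__(self, num_classes=9):
        super().__init__()
        self.static = nn.Sequential(  # encodes a snapshot of the hand pose
            nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.dynamic = nn.GRU(input_size=2, hidden_size=32, batch_first=True)
        self.head = nn.Linear(16 + 32, num_classes)

    def forward(self, snapshot, trajectory):
        # snapshot: (B, 3, H, W) peak-of-gesture frame
        # trajectory: (B, T, 2) x/y hand path over the gesture
        s = self.static(snapshot).flatten(1)  # (B, 16) static descriptor
        _, h = self.dynamic(trajectory)       # (1, B, 32) motion descriptor
        return self.head(torch.cat([s, h[-1]], dim=-1))
```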
arXiv Detail & Related papers (2022-05-28T11:12:38Z)
- Domain Adaptive Robotic Gesture Recognition with Unsupervised Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by using temporal cues in videos and inherent correlations across modalities for gesture recognition.
Results show that our approach recovers performance with substantial gains, up to 12.91% in accuracy and 20.16% in F1 score, without using any annotations from the real robot.
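One common mechanism for label-free sim-to-real feature alignment is a maximum mean discrepancy penalty between feature batches; the sketch below uses that as a stand-in, though the paper's actual alignment mechanism may differ.

```python
# Illustrative unsupervised alignment term between simulator and real-robot
# features. MMD is a stand-in here, not necessarily the paper's method.
import torch

def mmd(source, target):
    """Maximum mean discrepancy with a linear kernel between two feature
    batches, e.g., simulated vs. real-robot gesture features."""
    return (source.mean(dim=0) - target.mean(dim=0)).pow(2).sum()

sim_feats = torch.randn(64, 128)   # features from simulated kinematics/video
real_feats = torch.randn(64, 128)  # unlabeled features from the real robot
align_loss = mmd(sim_feats, real_feats)  # added to the task loss on sim labels
```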
arXiv Detail & Related papers (2021-03-06T09:10:03Z)
- Learning Obstacle Representations for Neural Motion Planning [70.80176920087136]
We address sensor-based motion planning from a learning perspective.
Motivated by recent advances in visual recognition, we argue for the importance of learning appropriate representations for motion planning.
We propose a new obstacle representation based on the PointNet architecture and train it jointly with policies for obstacle avoidance.
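A simplified stand-in for the described design: a PointNet-style per-point MLP with permutation-invariant pooling encodes the obstacle cloud, and the resulting feature feeds a policy head so that representation and policy are trained jointly. Dimensions and names are assumptions.

```python
# Hypothetical PointNet-style obstacle encoder + policy head, trained jointly.
import torch
import torch.nn as nn

class PointNetPolicy(nn.Module):
    def __init__(self, state_dim=7, action_dim=7, feat=64):
        super().__init__()
        self.point_mlp = nn.Sequential(  # shared MLP applied to every point
            nn.Linear(3, feat), nn.ReLU(), nn.Linear(feat, feat))
        self.policy = nn.Sequential(
            nn.Linear(feat + state_dim, 128), nn.ReLU(), nn.Linear(128, action_dim))

    def forward(self, points, state):
        # points: (B, N, 3) obstacle point cloud; state: (B, state_dim) robot state
        obstacle_feat = self.point_mlp(points).max(dim=1).values  # invariant pool
        return self.policy(torch.cat([obstacle_feat, state], dim=-1))
```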
arXiv Detail & Related papers (2020-08-25T17:12:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.