Related papers: Dynamic Gesture Recognition in Ultra-Range Distance for Effective Human-Robot Interaction

Related papers

Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis [51.95817740348585]
Human-X is a novel framework designed to enable immersive and physically plausible human interactions across diverse entities.<n>Our method jointly predicts actions and reactions in real-time using an auto-regressive reaction diffusion planner.<n>Our framework is validated in real-world applications, including virtual reality interface for human-robot interaction.
arXiv Detail & Related papers (2025-08-04T06:35:48Z)
Recognizing Actions from Robotic View for Natural Human-Robot Interaction [52.00935005918032]
Natural Human-Robot Interaction (N-HRI) requires robots to recognize human actions at varying distances and states, regardless of whether the robot itself is in motion or stationary.<n>Existing benchmarks for N-HRI fail to address the unique complexities in N-HRI due to limited data, modalities, task categories, and diversity of subjects and environments.<n>We introduce (Action from Robotic View) a large-scale dataset for perception-centric robotic views prevalent in mobile service robots.
arXiv Detail & Related papers (2025-07-30T09:48:34Z)
DiG-Net: Enhancing Quality of Life through Hyper-Range Dynamic Gesture Recognition in Assistive Robotics [2.625826951636656]
We introduce a novel approach designed specifically for assistive robotics, enabling dynamic gesture recognition at extended distances of up to 30 meters.<n>Our proposed Distance-aware Gesture Network (DiG-Net) effectively combines Depth-Conditioned Deformable Alignment (DADA) blocks with Spatio-Temporal Graph modules.<n>By effectively interpreting gestures from considerable distances, DiG-Net significantly enhances the usability of assistive robots in home healthcare, industrial safety, and remote assistance scenarios.
arXiv Detail & Related papers (2025-05-30T16:47:44Z)
Simultaneous Multi-Robot Motion Planning with Projected Diffusion Models [57.45019514036948]
Simultaneous MRMP Diffusion (SMD) is a novel approach integrating constrained optimization into the diffusion sampling process to produce kinematically feasible trajectories. The paper introduces a comprehensive MRMP benchmark to evaluate trajectory planning algorithms across scenarios with varying robot densities, obstacle complexities, and motion constraints.
arXiv Detail & Related papers (2025-02-05T20:51:28Z)
Recognition of Dynamic Hand Gestures in Long Distance using a Web-Camera for Robot Guidance [2.625826951636656]
We propose a model for recognizing dynamic gestures from a long distance of up to 20 meters. The model integrates the SlowFast and Transformer architectures (SFT) to effectively process and classify complex gesture sequences captured in video frames.
arXiv Detail & Related papers (2024-06-18T09:17:28Z)
NatSGD: A Dataset with Speech, Gestures, and Demonstrations for Robot Learning in Natural Human-Robot Interaction [19.65778558341053]
Speech-gesture HRI datasets often focus on elementary tasks, like object pointing and pushing. We introduce NatSGD, a multimodal HRI dataset encompassing human commands through speech and gestures. We demonstrate its effectiveness in training robots to understand tasks through multimodal human commands.
arXiv Detail & Related papers (2024-03-04T18:02:41Z)
Ultra-Range Gesture Recognition using a Web-Camera in Human-Robot Interaction [2.240453048130742]
Vision-based methods for gesture recognition have been shown to be effective only up to a user-camera distance of seven meters. We propose a novel URGR termed Graph Vision Transformer (GViT) which takes the enhanced image as input. Evaluation of the proposed framework over diverse test data yields a high recognition rate of 98.1%.
arXiv Detail & Related papers (2023-11-26T17:27:26Z)
Spatio-Temporal Branching for Motion Prediction using Motion Increments [55.68088298632865]
Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications. Traditional methods rely on hand-crafted features and machine learning techniques. We propose a noveltemporal-temporal branching network using incremental information for HMP.
arXiv Detail & Related papers (2023-08-02T12:04:28Z)
MILD: Multimodal Interactive Latent Dynamics for Learning Human-Robot Interaction [34.978017200500005]
We propose Multimodal Interactive Latent Dynamics (MILD) to address the problem of two-party physical Human-Robot Interactions (HRIs) We learn the interaction dynamics from demonstrations, using Hidden Semi-Markov Models (HSMMs) to model the joint distribution of the interacting agents in the latent space of a Variational Autoencoder (VAE) MILD generates more accurate trajectories for the controlled agent (robot) when conditioned on the observed agent's (human) trajectory.
arXiv Detail & Related papers (2022-10-22T11:25:11Z)
Next Steps: Learning a Disentangled Gait Representation for Versatile Quadruped Locomotion [69.87112582900363]
Current planners are unable to vary key gait parameters continuously while the robot is in motion. In this work we address this limitation by learning a latent space capturing the key stance phases constituting a particular gait. We demonstrate that specific properties of the drive signal map directly to gait parameters such as cadence, foot step height and full stance duration.
arXiv Detail & Related papers (2021-12-09T10:02:02Z)
Show Me What You Can Do: Capability Calibration on Reachable Workspace for Human-Robot Collaboration [83.4081612443128]
We show that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground-truth. We show that this calibration procedure not only results in better user perception, but also promotes more efficient human-robot collaborations.
arXiv Detail & Related papers (2021-03-06T09:14:30Z)
Domain Adaptive Robotic Gesture Recognition with Unsupervised Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot. It remedies the domain gap with enhanced transferable features by using temporal cues in videos, and inherent correlations in multi-modal towards recognizing gesture. Results show that our approach recovers the performance with great improvement gains, up to 12.91% in ACC and 20.16% in F1score without using any annotations in real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z)
Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online approach of multi-modal graph network (i.e., MRG-Net) to dynamically integrate visual and kinematics information. The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
Attention-Oriented Action Recognition for Real-Time Human-Robot Interaction [11.285529781751984]
We propose an attention-oriented multi-level network framework to meet the need for real-time interaction. Specifically, a Pre-Attention network is employed to roughly focus on the interactor in the scene at low resolution. The other compact CNN receives the extracted skeleton sequence as input for action recognition.
arXiv Detail & Related papers (2020-07-02T12:41:28Z)
Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor [70.2226417364135]
It is crucial that the machine should be able to recognize the emotional state of the user with high accuracy. Deep neural networks have been used with great success in recognizing emotions. We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.