A Distributed Multi-Modal Sensing Approach for Human Activity Recognition in Real-Time Human-Robot Collaboration
- URL: http://arxiv.org/abs/2602.07024v1
- Date: Mon, 02 Feb 2026 10:14:19 GMT
- Title: A Distributed Multi-Modal Sensing Approach for Human Activity Recognition in Real-Time Human-Robot Collaboration
- Authors: Valerio Belcamino, Nhat Minh Dinh Le, Quan Khanh Luu, Alessandro Carfì, Van Anh Ho, Fulvio Mastrogiovanni
- Abstract summary: This paper introduces a HAR system combining a modular data glove equipped with Inertial Measurement Units and a vision-based tactile sensor to capture hand activities in contact with a robot. We tested our activity recognition approach under different conditions, including offline classification of segmented sequences, real-time classification under static conditions, and a realistic HRC scenario. The experimental results show high accuracy on all tasks, suggesting that multiple collaborative settings could benefit from this multi-modal approach.
- Score: 41.43425233041408
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human activity recognition (HAR) is fundamental in human-robot collaboration (HRC), enabling robots to respond to and dynamically adapt to human intentions. This paper introduces a HAR system combining a modular data glove equipped with Inertial Measurement Units and a vision-based tactile sensor to capture hand activities in contact with a robot. We tested our activity recognition approach under different conditions, including offline classification of segmented sequences, real-time classification under static conditions, and a realistic HRC scenario. The experimental results show high accuracy on all tasks, suggesting that multiple collaborative settings could benefit from this multi-modal approach.
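The abstract describes the sensing pipeline only at a high level. As an illustration of how such a system could be wired together, the sketch below fuses IMU sequences from a data glove with images from a vision-based tactile sensor in a two-branch PyTorch classifier; the branch architectures, tensor shapes, and class count are illustrative assumptions, not the authors' actual model.

```python
# Minimal sketch of a two-branch multi-modal HAR classifier, assuming
# IMU sequences from a data glove and images from a vision-based tactile
# sensor. Shapes, layer sizes, and the late-fusion design are illustrative
# assumptions, not the architecture from the paper.
import torch
import torch.nn as nn

class MultiModalHAR(nn.Module):
    def __init__(self, num_imus=6, imu_dim=6, tactile_channels=3, num_classes=15):
        super().__init__()
        # Motion branch: GRU over concatenated per-IMU readings
        # (e.g., 6 IMUs x 6 values: accelerometer + gyroscope).
        self.motion = nn.GRU(input_size=num_imus * imu_dim,
                             hidden_size=128, batch_first=True)
        # Tactile branch: small CNN over the tactile sensor image.
        self.tactile = nn.Sequential(
            nn.Conv2d(tactile_channels, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (B, 32)
        )
        # Late fusion: concatenate branch embeddings, then classify.
        self.head = nn.Linear(128 + 32, num_classes)

    def forward(self, imu_seq, tactile_img):
        # imu_seq: (B, T, num_imus * imu_dim); tactile_img: (B, C, H, W)
        _, h = self.motion(imu_seq)            # h: (1, B, 128)
        fused = torch.cat([h[-1], self.tactile(tactile_img)], dim=1)
        return self.head(fused)                # (B, num_classes) logits

model = MultiModalHAR()
logits = model(torch.randn(2, 50, 36), torch.randn(2, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 15])
```

Late fusion by concatenation keeps the two sensing streams separate until the final classifier, which makes it straightforward to ablate either modality, as the related comparative study below does.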
Related papers
- Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis [51.95817740348585]
Human-X is a novel framework designed to enable immersive and physically plausible human interactions across diverse entities. Our method jointly predicts actions and reactions in real-time using an auto-regressive reaction diffusion planner. Our framework is validated in real-world applications, including a virtual reality interface for human-robot interaction.
arXiv Detail & Related papers (2025-08-04T06:35:48Z)
- Recognizing Actions from Robotic View for Natural Human-Robot Interaction [52.00935005918032]
Natural Human-Robot Interaction (N-HRI) requires robots to recognize human actions at varying distances and states, regardless of whether the robot itself is in motion or stationary. Existing benchmarks fail to address the unique complexities of N-HRI due to limited data, modalities, task categories, and diversity of subjects and environments. We introduce a large-scale dataset (Action from Robotic View) for the perception-centric robotic views prevalent in mobile service robots.
arXiv Detail & Related papers (2025-07-30T09:48:34Z)
- A Comparative Study of Human Activity Recognition: Motion, Tactile, and multi-modal Approaches [43.97520291340696]
This study evaluates the ability of a vision-based tactile sensor to classify 15 activities. We propose a multi-modal framework combining tactile and motion data to leverage their complementary strengths.
arXiv Detail & Related papers (2025-05-13T15:20:21Z)
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues.
Our method outperforms representative models in both objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
- MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations [19.184155232662995]
We propose a novel approach for learning a shared latent space representation for Human-Robot Interaction (HRI).
We train a Variational Autoencoder (VAE) to learn robot motions regularized using an informative latent space prior.
We find that our approach of using an informative Mixture Density Network (MDN) prior learned from human observations for a VAE generates more accurate robot motions; a sketch of this mechanism follows the entry.
arXiv Detail & Related papers (2024-07-10T13:16:12Z)
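The MoVEInt summary names the mechanism without implementation detail. As a rough illustration of the general idea, the sketch below builds a Mixture Density Network that maps a human observation to a Gaussian-mixture prior over a VAE's latent space, with a single-sample KL estimate as the regularizer. All dimensions, names, and the Monte-Carlo KL approximation are assumptions, not MoVEInt's published implementation.

```python
# Illustrative sketch: a Mixture Density Network (MDN) maps a human
# observation to a Gaussian mixture used as an informative prior over a
# VAE's latent space. Dimensions and the sampled KL estimate are
# assumptions; see the MoVEInt paper for the actual formulation.
import torch
import torch.nn as nn

class MDNPrior(nn.Module):
    def __init__(self, obs_dim=12, latent_dim=8, n_components=3):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.logits = nn.Linear(64, n_components)             # mixture weights
        self.means = nn.Linear(64, n_components * latent_dim)
        self.log_std = nn.Linear(64, n_components * latent_dim)
        self.k, self.d = n_components, latent_dim

    def forward(self, obs):
        h = self.backbone(obs)
        mix = torch.distributions.Categorical(logits=self.logits(h))
        comp = torch.distributions.Independent(
            torch.distributions.Normal(
                self.means(h).view(-1, self.k, self.d),
                self.log_std(h).view(-1, self.k, self.d).exp()), 1)
        return torch.distributions.MixtureSameFamily(mix, comp)

# Inside a VAE training step, the usual standard-normal KL term would be
# replaced by a divergence from this observation-conditioned prior.
def kl_to_mdn_prior(q_mu, q_logvar, prior):
    q = torch.distributions.Independent(
        torch.distributions.Normal(q_mu, (0.5 * q_logvar).exp()), 1)
    z = q.rsample()  # reparameterized sample keeps the estimate differentiable
    return (q.log_prob(z) - prior.log_prob(z)).mean()

prior = MDNPrior()(torch.randn(4, 12))
loss_kl = kl_to_mdn_prior(torch.zeros(4, 8), torch.zeros(4, 8), prior)
```

Gaussian mixtures admit no closed-form KL divergence, which is why a single-sample estimate is used here.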
- Learning Multimodal Latent Dynamics for Human-Robot Interaction [18.68936554172693]
This article presents a method for learning well-coordinated Human-Robot Interaction (HRI) from Human-Human Interactions (HHI). We devise a hybrid approach using Hidden Markov Models (HMMs) as the latent space priors for a Variational Autoencoder to model a joint distribution over the interacting agents; a sketch of the HMM-prior idea follows this entry. We find that users perceive our method as more human-like, timely, and accurate, and rank it with a higher degree of preference over other baselines.
arXiv Detail & Related papers (2023-11-27T23:56:59Z)
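The HMM-as-prior mechanism can be made concrete with a short sketch of the forward recursion that tracks which HMM state governs the latent at each timestep; the per-step prior is then the state-weighted mixture of Gaussians. Parameters and shapes below are invented for illustration and are not the paper's implementation.

```python
# Illustrative sketch: an HMM over a VAE's latent space. The forward
# recursion yields filtered state probabilities per timestep, and the
# active state's Gaussian acts as the latent prior at that step.
import torch

def hmm_forward(z, pi, A, means, stds):
    """z: (T, d) latent sequence; pi: (K,) initial probs; A: (K, K)
    transition matrix; means/stds: (K, d) Gaussian emission parameters.
    Returns per-step filtered state probabilities alpha: (T, K)."""
    T, K = z.shape[0], pi.shape[0]
    emit = torch.distributions.Independent(
        torch.distributions.Normal(means, stds), 1)
    log_b = emit.log_prob(z.unsqueeze(1))      # (T, K) emission log-likelihoods
    alpha = torch.empty(T, K)
    a = pi * log_b[0].exp()
    alpha[0] = a / a.sum()
    for t in range(1, T):
        a = (alpha[t - 1] @ A) * log_b[t].exp()
        alpha[t] = a / a.sum()                 # normalize to avoid underflow
    return alpha

# The per-step prior over the latent is the alpha-weighted mixture of
# state Gaussians; conditioning on the observed human segment lets the
# robot's decoder generate the matching reaction.
K, d, T = 4, 8, 25
alpha = hmm_forward(torch.randn(T, d), torch.full((K,), 1 / K),
                    torch.softmax(torch.randn(K, K), dim=1),
                    torch.randn(K, d), torch.ones(K, d))
print(alpha.shape)  # torch.Size([25, 4])
```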
- Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models [56.257840490146]
ConCue is a novel approach for improving visual feature extraction in HOI detection.
We develop a transformer-based feature extraction module with a multi-tower architecture that integrates contextual cues into both instance and interaction detectors; a generic sketch of this integration follows the entry.
arXiv Detail & Related papers (2023-11-26T09:11:32Z)
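The summary names a multi-tower design but not how cues enter the detectors. One generic way to integrate contextual-cue embeddings (e.g., features obtained from a vision-language model) into detector queries is cross-attention, sketched below; this is a plausible reading, not ConCue's published architecture.

```python
# Illustrative sketch: injecting contextual-cue embeddings into detector
# queries via cross-attention. A generic reading of "integrating contextual
# cues into instance and interaction detectors," not ConCue's actual design.
import torch
import torch.nn as nn

class CueCrossAttention(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, queries, cues):
        # queries: (B, Nq, dim) detector queries (instance or interaction);
        # cues: (B, Nc, dim) contextual-cue embeddings, e.g. from a VLM.
        attended, _ = self.attn(queries, cues, cues)
        return self.norm(queries + attended)   # residual update of queries

# Two "towers" could share the cues while keeping separate query sets:
cues = torch.randn(2, 10, 256)
instance_q = CueCrossAttention()(torch.randn(2, 100, 256), cues)
interaction_q = CueCrossAttention()(torch.randn(2, 100, 256), cues)
```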
- Robust Activity Recognition for Adaptive Worker-Robot Interaction using Transfer Learning [0.0]
This paper proposes a transfer learning methodology for activity recognition of construction workers.
The developed algorithm transfers features from a model pre-trained by the original authors and fine-tunes them for the downstream task of activity recognition.
Results indicate that the fine-tuned model can recognize distinct manual material handling (MMH) tasks in a robust and adaptive manner; a minimal fine-tuning sketch follows this entry.
arXiv Detail & Related papers (2023-08-28T19:03:46Z)
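Feature transfer followed by fine-tuning, as summarized above, follows a standard recipe: freeze the pre-trained backbone and retrain a new classification head on the downstream activity labels. The torchvision backbone and class count below are assumptions for illustration, not the paper's setup.

```python
# Minimal transfer-learning sketch: reuse a pre-trained feature extractor,
# freeze it, and fine-tune a new head for the downstream activity-recognition
# task. The ResNet-18 backbone and class count are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                      # freeze transferred features
model.fc = nn.Linear(model.fc.in_features, 4)    # e.g., 4 MMH activity classes

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One fine-tuning step on a (batch, label) pair from the downstream dataset:
x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 4, (8,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```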
- MILD: Multimodal Interactive Latent Dynamics for Learning Human-Robot Interaction [34.978017200500005]
We propose Multimodal Interactive Latent Dynamics (MILD) to address the problem of two-party physical Human-Robot Interactions (HRIs). We learn the interaction dynamics from demonstrations, using Hidden Semi-Markov Models (HSMMs) to model the joint distribution of the interacting agents in the latent space of a Variational Autoencoder (VAE).
MILD generates more accurate trajectories for the controlled agent (robot) when conditioned on the observed agent's (human) trajectory.
arXiv Detail & Related papers (2022-10-22T11:25:11Z)
- Show Me What You Can Do: Capability Calibration on Reachable Workspace for Human-Robot Collaboration [83.4081612443128]
We show that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground truth.
We show that this calibration procedure not only results in better user perception, but also promotes more efficient human-robot collaborations.
arXiv Detail & Related papers (2021-03-06T09:14:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.