Multi-Class Human/Object Detection on Robot Manipulators using Proprioceptive Sensing
- URL: http://arxiv.org/abs/2508.02425v1
- Date: Mon, 04 Aug 2025 13:45:37 GMT
- Title: Multi-Class Human/Object Detection on Robot Manipulators using Proprioceptive Sensing
- Authors: Justin Hehli, Marco Heiniger, Maryam Rezayati, Hans Wernher van de Venn,
- Abstract summary: The aim of this study is to evaluate three-class human/object detection models, offering more detailed contact analysis.<n>A dataset was collected using the Franka Emika Panda robot manipulator, exploring preprocessing strategies for time-series analysis.<n>The best-performing model achieved 91.11% accuracy during real-time testing, demonstrating the feasibility of multi-class detection models.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In physical human-robot collaboration (pHRC) settings, humans and robots collaborate directly in shared environments. Robots must analyze interactions with objects to ensure safety and facilitate meaningful workflows. One critical aspect is human/object detection, where the contacted object is identified. Past research introduced binary machine learning classifiers to distinguish between soft and hard objects. This study improves upon those results by evaluating three-class human/object detection models, offering more detailed contact analysis. A dataset was collected using the Franka Emika Panda robot manipulator, exploring preprocessing strategies for time-series analysis. Models including LSTM, GRU, and Transformers were trained on these datasets. The best-performing model achieved 91.11\% accuracy during real-time testing, demonstrating the feasibility of multi-class detection models. Additionally, a comparison of preprocessing strategies suggests a sliding window approach is optimal for this task.
Related papers
- A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning [67.72413262980272]
Pre-trained vision models (PVMs) are fundamental to modern robotics, yet their optimal configuration remains unclear.<n>We develop SlotMIM, a method that induces object-centric representations by introducing a semantic bottleneck.<n>Our approach achieves significant improvements over prior work in image recognition, scene understanding, and robot learning evaluations.
arXiv Detail & Related papers (2025-03-10T06:18:31Z) - Testing Human-Hand Segmentation on In-Distribution and Out-of-Distribution Data in Human-Robot Interactions Using a Deep Ensemble Model [40.815678328617686]
We present a novel approach by evaluating the performance of pre-trained deep learning models under both ID data and more challenging OOD scenarios.<n>We incorporated unique and rare conditions such as finger-crossing gestures and motion blur from fast-moving hands.<n>Results revealed that models trained on industrial datasets outperformed those trained on non-industrial datasets.
arXiv Detail & Related papers (2025-01-13T21:52:46Z) - Sample Efficient Robot Learning in Supervised Effect Prediction Tasks [0.0]
MUSEL (Model Uncertainty for Sample-Efficient Learning) is a novel AL framework tailored for regression tasks in robotics.<n>We show that MUSEL improves both learning accuracy and sample efficiency, validating its effectiveness in learning action effects selecting informative samples.
arXiv Detail & Related papers (2024-12-03T09:48:28Z) - Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation [16.809190349155525]
We propose a novel adaptation paradigm that leverages readily available paired human-robot video data to bridge the domain gap.<n>Our method employs a human-robot contrastive alignment loss to align the semantics of human and robot videos, adapting pre-trained models to the robot domain in a parameter-efficient manner.
arXiv Detail & Related papers (2024-06-20T11:57:46Z) - Learning Manipulation by Predicting Interaction [85.57297574510507]
We propose a general pre-training pipeline that learns Manipulation by Predicting the Interaction.
The experimental results demonstrate that MPI exhibits remarkable improvement by 10% to 64% compared with previous state-of-the-art in real-world robot platforms.
arXiv Detail & Related papers (2024-06-01T13:28:31Z) - Active Exploration in Bayesian Model-based Reinforcement Learning for Robot Manipulation [8.940998315746684]
We propose a model-based reinforcement learning (RL) approach for robotic arm end-tasks.
We employ Bayesian neural network models to represent, in a probabilistic way, both the belief and information encoded in the dynamic model during exploration.
Our experiments show the advantages of our Bayesian model-based RL approach, with similar quality in the results than relevant alternatives.
arXiv Detail & Related papers (2024-04-02T11:44:37Z) - Automatically Prepare Training Data for YOLO Using Robotic In-Hand
Observation and Synthesis [14.034128227585143]
We propose combining robotic in-hand observation and data synthesis to enlarge the limited data set collected by the robot.
The collected and synthetic images are combined to train a deep detection neural network.
The results showed that combined observation and synthetic images led to comparable performance to manual data preparation.
arXiv Detail & Related papers (2023-01-04T04:20:08Z) - BeCAPTCHA-Type: Biometric Keystroke Data Generation for Improved Bot
Detection [63.447493500066045]
This work proposes a data driven learning model for the synthesis of keystroke biometric data.
The proposed method is compared with two statistical approaches based on Universal and User-dependent models.
Our experimental framework considers a dataset with 136 million keystroke events from 168 thousand subjects.
arXiv Detail & Related papers (2022-07-27T09:26:15Z) - Model Predictive Control for Fluid Human-to-Robot Handovers [50.72520769938633]
Planning motions that take human comfort into account is not a part of the human-robot handover process.
We propose to generate smooth motions via an efficient model-predictive control framework.
We conduct human-to-robot handover experiments on a diverse set of objects with several users.
arXiv Detail & Related papers (2022-03-31T23:08:20Z) - Few-Shot Visual Grounding for Natural Human-Robot Interaction [0.0]
We propose a software architecture that segments a target object from a crowded scene, indicated verbally by a human user.
At the core of our system, we employ a multi-modal deep neural network for visual grounding.
We evaluate the performance of the proposed model on real RGB-D data collected from public scene datasets.
arXiv Detail & Related papers (2021-03-17T15:24:02Z) - Where is my hand? Deep hand segmentation for visual self-recognition in
humanoid robots [129.46920552019247]
We propose the use of a Convolution Neural Network (CNN) to segment the robot hand from an image in an egocentric view.
We fine-tuned the Mask-RCNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
arXiv Detail & Related papers (2021-02-09T10:34:32Z) - Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.