Grasp Pre-shape Selection by Synthetic Training: Eye-in-hand Shared
Control on the Hannes Prosthesis
- URL: http://arxiv.org/abs/2203.09812v1
- Date: Fri, 18 Mar 2022 09:16:48 GMT
- Title: Grasp Pre-shape Selection by Synthetic Training: Eye-in-hand Shared
Control on the Hannes Prosthesis
- Authors: Federico Vasile, Elisa Maiettini, Giulia Pasquale, Astrid Florio,
Nicolò Boccardo, Lorenzo Natale
- Abstract summary: We present an eye-in-hand learning-based approach for hand pre-shape classification from RGB sequences.
We tackle the peculiarity of the eye-in-hand setting by means of a model of human arm trajectories.
- Score: 6.517935794312337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the task of object grasping with a prosthetic hand capable of
multiple grasp types. In this setting, communicating the intended grasp type
often imposes a high cognitive load on the user, which can be reduced by
adopting shared-autonomy frameworks. Among these, so-called eye-in-hand systems automatically
control the hand aperture and pre-shaping before the grasp, based on visual
input coming from a camera on the wrist. In this work, we present an
eye-in-hand learning-based approach for hand pre-shape classification from RGB
sequences. To reduce the need for tedious data-collection sessions for
training the system, we devise a pipeline for rendering synthetic visual
sequences of hand trajectories. We tackle the peculiarity of the eye-in-hand
setting by means of a model of human arm trajectories, combined with domain
randomization over the relevant visual elements. We develop a
sensorized setup to acquire real human grasping sequences for benchmarking and
show that, when compared on practical use cases, models trained on our
synthetic dataset achieve better generalization performance than models
trained on real data. Finally, we integrate our model into the Hannes
prosthetic hand and show its practical effectiveness. Our code and our real
and synthetic datasets will be released upon acceptance.
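As an illustration of the kind of classifier the abstract describes, the sketch below (PyTorch) pairs a small per-frame encoder with a recurrent head that maps a wrist-camera RGB sequence to grasp pre-shape logits. The architecture, layer sizes, and four-class output are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (assumed architecture, not the paper's code): a per-frame
# CNN encoder followed by an LSTM that classifies the grasp pre-shape from
# an RGB sequence captured by the wrist camera. All sizes are illustrative.
import torch
import torch.nn as nn

class PreShapeClassifier(nn.Module):
    def __init__(self, num_preshapes: int = 4, feat_dim: int = 128):
        super().__init__()
        # Small per-frame encoder; a real system would likely use a
        # pretrained backbone instead.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.temporal = nn.LSTM(feat_dim, feat_dim, batch_first=True)
        self.head = nn.Linear(feat_dim, num_preshapes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, H, W) RGB sequence from the wrist camera.
        b, t = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.temporal(feats)
        return self.head(out[:, -1])  # pre-shape logits at the final frame

logits = PreShapeClassifier()(torch.randn(2, 8, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 4])
```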
Related papers
- What Makes Pre-Trained Visual Representations Successful for Robust Manipulation? [57.92924256181857]
We find that visual representations designed for manipulation and control tasks do not necessarily generalize under subtle changes in lighting and scene texture.
We find that emergent segmentation ability is a strong predictor of out-of-distribution generalization among ViT models.
arXiv Detail & Related papers (2023-11-03T18:09:08Z)
- Fast and Expressive Gesture Recognition using a Combination-Homomorphic Electromyogram Encoder [21.25126610043744]
We study the task of gesture recognition from electromyography (EMG).
We define combination gestures consisting of a direction component and a modifier component.
New subjects only demonstrate the single component gestures.
We extrapolate to unseen combination gestures by combining the feature vectors of real single gestures to produce synthetic training data.
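The extrapolation step above can be pictured with a short sketch: assuming an encoder that maps single-component gestures to feature vectors, a synthetic feature for an unseen combination gesture is formed by combining the component features (here by simple addition). All names and values below are hypothetical.

```python
# Minimal sketch of the combination idea summarized above (assumed details,
# not the paper's code): synthetic features for unseen combination gestures
# are built by combining direction and modifier features in feature space.
import numpy as np

rng = np.random.default_rng(0)
feat_dim = 16

# Features of demonstrated single-component gestures (hypothetical data).
direction_feats = {d: rng.normal(size=feat_dim) for d in ["up", "down"]}
modifier_feats = {m: rng.normal(size=feat_dim) for m in ["pinch", "fist"]}

def synthesize_combination(direction: str, modifier: str) -> np.ndarray:
    """Additive combination in feature space: one simple homomorphism."""
    return direction_feats[direction] + modifier_feats[modifier]

# Synthetic training example for an unseen combination gesture.
x = synthesize_combination("up", "pinch")
print(x.shape)  # (16,)
```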
arXiv Detail & Related papers (2023-10-30T20:03:34Z)
- Self-supervised Optimization of Hand Pose Estimation using Anatomical Features and Iterative Learning [4.698846136465861]
This paper presents a self-supervised pipeline for adapting hand pose estimation to specific use cases with minimal human interaction.
The pipeline consists of a general machine learning model for hand pose estimation trained on a generalized dataset.
The effectiveness of the pipeline is demonstrated by training an activity recognition model as a downstream task in a manual assembly scenario.
arXiv Detail & Related papers (2023-07-06T14:13:11Z)
- Procedural Humans for Computer Vision [1.9550079119934403]
We build a parametric model of the face and body, including articulated hands, and use it to generate realistic images of humans.
We show that this can be extended to the full body by building on the pipeline of Wood et al. to generate synthetic images of humans in their entirety.
arXiv Detail & Related papers (2023-01-03T15:44:48Z)
- SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data [78.21197488065177]
The recent success of fine-tuning large models, pretrained on broad data at scale, on downstream tasks has led to a significant paradigm shift in deep learning.
This paper proposes SynBench, a new task-agnostic framework to measure the quality of pretrained representations using synthetic data.
arXiv Detail & Related papers (2022-10-06T15:25:00Z)
- Tracking and Reconstructing Hand Object Interactions from Point Cloud Sequences in the Wild [35.55753131098285]
We propose a point cloud based hand joint tracking network, HandTrackNet, to estimate the inter-frame hand joint motion.
Our pipeline then reconstructs the full hand by converting the predicted hand joints into the template-based parametric hand model MANO.
For object tracking, we devise a simple yet effective module that estimates the object SDF from the first frame and performs optimization-based tracking.
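A minimal sketch of what optimization-based SDF tracking can look like, under simplifying assumptions (a known analytic SDF and translation-only motion; the paper estimates the object SDF from the first frame, and a full method would also track rotation):

```python
# Illustrative sketch of SDF-based tracking (assumed details, not the
# paper's implementation): given a signed distance function of the object
# and an observed point cloud, find the translation that moves the points
# onto the object's zero level set.
import numpy as np
from scipy.optimize import minimize

def sphere_sdf(points: np.ndarray, radius: float = 1.0) -> np.ndarray:
    # Stand-in analytic SDF; the paper estimates it from the first frame.
    return np.linalg.norm(points, axis=1) - radius

def track_translation(points: np.ndarray, init: np.ndarray) -> np.ndarray:
    # Minimize squared SDF residuals of the points in the object frame.
    def cost(t):
        return np.sum(sphere_sdf(points - t) ** 2)
    return minimize(cost, init, method="Nelder-Mead").x

# Points sampled on a unit sphere shifted by the true translation.
rng = np.random.default_rng(0)
d = rng.normal(size=(200, 3))
points = d / np.linalg.norm(d, axis=1, keepdims=True) + np.array([0.3, -0.1, 0.2])
print(track_translation(points, init=np.zeros(3)))  # near [0.3, -0.1, 0.2]
```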
arXiv Detail & Related papers (2022-09-24T13:40:09Z)
- Dynamic Modeling of Hand-Object Interactions via Tactile Sensing [133.52375730875696]
In this work, we employ a high-resolution tactile glove to perform four different interactive activities on a diversified set of objects.
We build our model on a cross-modal learning framework and generate the labels using a visual processing pipeline to supervise the tactile model.
This work takes a step toward dynamics modeling of hand-object interactions from dense tactile sensing.
arXiv Detail & Related papers (2021-09-09T16:04:14Z)
- Self-supervised Audiovisual Representation Learning for Remote Sensing Data [96.23611272637943]
We propose a self-supervised approach for pre-training deep neural networks in remote sensing.
This is done in a completely label-free manner by exploiting the correspondence between geo-tagged audio recordings and remote sensing imagery.
We show that our approach outperforms existing pre-training strategies for remote sensing imagery.
arXiv Detail & Related papers (2021-08-02T07:50:50Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attention networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work, including state-of-the-art methods designed specifically for the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- From Hand-Perspective Visual Information to Grasp Type Probabilities: Deep Learning via Ranking Labels [6.772076545800592]
We build a novel probabilistic classifier according to the Plackett-Luce model to predict the probability distribution over grasps.
We show that the proposed model is applicable to the most popular convolutional neural network frameworks.
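For reference, the Plackett-Luce model mentioned above assigns a probability to a full ranking as a product of successive choice probabilities over the remaining candidates; the sketch below computes it from positive per-grasp scores. Names and values are illustrative, not the paper's.

```python
# Minimal sketch of the Plackett-Luce ranking probability (our notation,
# not the paper's code): P(ranking) = prod_k s[r_k] / sum_{j >= k} s[r_j].
import numpy as np

def plackett_luce_prob(scores: np.ndarray, ranking: list[int]) -> float:
    """Probability of observing `ranking` given positive worth scores."""
    remaining = list(ranking)
    prob = 1.0
    for item in ranking:
        prob *= scores[item] / sum(scores[j] for j in remaining)
        remaining.remove(item)
    return prob

# Positive "worth" scores for three grasp types (hypothetical values).
scores = np.array([3.0, 1.0, 2.0])
print(plackett_luce_prob(scores, [0, 2, 1]))  # (3/6)*(2/3)*(1/1) = 1/3
```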
arXiv Detail & Related papers (2021-03-08T16:12:38Z)
- Learning to Segment Human Body Parts with Synthetically Trained Deep Convolutional Networks [58.0240970093372]
This paper presents a new framework for human body part segmentation based on Deep Convolutional Neural Networks trained using only synthetic data.
The proposed approach achieves cutting-edge results without needing to train the models on real annotated data of human body parts.
arXiv Detail & Related papers (2021-02-02T12:26:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.