Intention estimation from gaze and motion features for human-robot
shared-control object manipulation
- URL: http://arxiv.org/abs/2208.08688v1
- Date: Thu, 18 Aug 2022 07:53:19 GMT
- Title: Intention estimation from gaze and motion features for human-robot
shared-control object manipulation
- Authors: Anna Belardinelli, Anirudh Reddy Kondapally, Dirk Ruiken, Daniel
Tanneberg, Tomoki Watabe
- Abstract summary: Shared control can help in teleoperated object manipulation by assisting with the execution of the user's intention.
An intention estimation framework is presented, which uses natural gaze and motion features to predict the current action and the target object.
- Score: 1.128708201885454
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Shared control can help in teleoperated object manipulation by assisting with
the execution of the user's intention. To this end, robust and prompt intention
estimation is needed, which relies on behavioral observations. Here, an
intention estimation framework is presented, which uses natural gaze and motion
features to predict the current action and the target object. The system is
trained and tested in a simulated environment with pick and place sequences
produced in a relatively cluttered scene and with both hands, with possible
hand-over to the other hand. Validation is conducted across different users and
hands, achieving good accuracy and earliness of prediction. An analysis of the
predictive power of single features shows the predominance of the grasping
trigger and the gaze features in the early identification of the current
action. In the current framework, the same probabilistic model can be used for
the two hands working in parallel and independently, while a rule-based model
is proposed to identify the resulting bimanual action. Finally, limitations and
perspectives of this approach to more complex, full-bimanual manipulations are
discussed.
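The abstract gives no implementation details, but a minimal sketch of the kind of per-hand probabilistic estimator it describes, fusing gaze and motion (grasp-trigger) features and adding a toy rule for detecting a bimanual hand-over, could look as follows. The feature names, action classes, and probability values are all hypothetical, not taken from the paper.

```python
import numpy as np

# Hypothetical action classes; the paper's actual feature set (gaze fixations,
# grasping trigger, hand kinematics) and model structure are not specified here.
ACTIONS = ["pick", "place", "idle"]

def action_posterior(feature_likelihoods, prior):
    """Naive-Bayes-style fusion: multiply per-feature likelihoods
    P(feature | action) with the prior over actions and normalize."""
    post = np.array(prior, dtype=float)
    for lik in feature_likelihoods:          # one likelihood vector per feature
        post *= np.asarray(lik, dtype=float)
    return post / post.sum()

def bimanual_rule(left_action, left_target, right_action, right_target):
    """Toy rule-based combination of the two independent per-hand estimates,
    in the spirit of the rule-based bimanual model mentioned in the abstract."""
    if left_action == "place" and right_action == "pick" and left_target == right_target:
        return "hand-over (left -> right)"
    if right_action == "place" and left_action == "pick" and left_target == right_target:
        return "hand-over (right -> left)"
    return f"independent: left={left_action}, right={right_action}"

# Example: gaze mostly on the target object and grasp trigger pressed -> "pick"
gaze_lik  = [0.7, 0.2, 0.1]   # P(gaze feature | action) for pick/place/idle
grasp_lik = [0.8, 0.1, 0.1]   # P(grasp trigger | action)
print(ACTIONS[int(np.argmax(action_posterior([gaze_lik, grasp_lik], [1/3] * 3)))])
print(bimanual_rule("place", "cup", "pick", "cup"))
```

In this toy version each hand runs the same estimator independently, mirroring the paper's use of one probabilistic model for both hands working in parallel, while the rule-based function only combines the two per-hand outputs afterwards.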
Related papers
- Towards Unifying Interpretability and Control: Evaluation via Intervention [25.4582941170387]
We propose intervention as a fundamental goal of interpretability and introduce success criteria to evaluate how well methods are able to control model behavior through interventions.
We extend four popular interpretability methods--sparse autoencoders, logit lens, tuned lens, and probing--into an abstract encoder-decoder framework.
We introduce two new evaluation metrics: intervention success rate and the coherence-intervention tradeoff, designed to measure the accuracy of explanations and their utility in controlling model behavior.
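Neither metric is defined in this summary; purely as an illustration of what an intervention success rate could measure, the toy function below counts how often an intervention flips a model's output to the intended behavior. The inputs and the success criterion are assumptions, not the paper's definitions.

```python
def intervention_success_rate(pre_outputs, post_outputs, intended):
    """Hypothetical metric: among cases where the model did not already show the
    intended behavior, the fraction where it does so after the intervention."""
    hits = sum(1 for pre, post, goal in zip(pre_outputs, post_outputs, intended)
               if pre != goal and post == goal)
    eligible = sum(1 for pre, goal in zip(pre_outputs, intended) if pre != goal)
    return hits / eligible if eligible else 0.0

# Toy usage with label strings standing in for model behaviors
print(intervention_success_rate(["cat", "dog", "cat"], ["dog", "dog", "cat"], ["dog", "dog", "dog"]))
```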
arXiv Detail & Related papers (2024-11-07T04:52:18Z)
- PEAR: Phrase-Based Hand-Object Interaction Anticipation [20.53329698350243]

First-person hand-object interaction anticipation aims to predict the interaction process based on current scenes and prompts.
Existing research typically anticipates only interaction intention while neglecting manipulation.
We propose a novel model, PEAR, which jointly anticipates interaction intention and manipulation.
arXiv Detail & Related papers (2024-07-31T10:28:49Z)
- Value of Assistance for Grasping [6.452975320319021]
We provide a measure for assessing the expected effect a specific observation will have on the robot's ability to complete its task.
We evaluate our suggested measure in simulated and real-world collaborative grasping settings.
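The measure itself is not defined in this summary; one illustrative, purely hypothetical reading is the expected gain in grasp success from receiving an observation, computed by comparing the expected success under the current belief with the expectation after a Bayesian update:

```python
def expected_success(belief, success_prob):
    """Expected grasp success when the robot commits to the most likely state.
    success_prob[a][s] = P(success | robot acts for hypothesis a, true state s)."""
    act = max(belief, key=belief.get)
    return sum(p * success_prob[act][s] for s, p in belief.items())

def value_of_assistance(prior, likelihood, success_prob):
    """Illustrative sketch (not the paper's exact definition): expected improvement
    in grasp success from an observation, averaged over its possible outcomes."""
    baseline = expected_success(prior, success_prob)
    voa = 0.0
    for o, lik in likelihood.items():              # lik[s] = P(observation o | state s)
        p_o = sum(lik[s] * prior[s] for s in prior)
        if p_o > 0:
            posterior = {s: lik[s] * prior[s] / p_o for s in prior}
            voa += p_o * expected_success(posterior, success_prob)
    return voa - baseline

# Two candidate object poses; the observation is informative about which is true
prior = {"poseA": 0.5, "poseB": 0.5}
likelihood = {"obsA": {"poseA": 0.9, "poseB": 0.1}, "obsB": {"poseA": 0.1, "poseB": 0.9}}
success = {"poseA": {"poseA": 0.9, "poseB": 0.2}, "poseB": {"poseA": 0.2, "poseB": 0.9}}
print(value_of_assistance(prior, likelihood, success))   # positive: the observation helps
```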
arXiv Detail & Related papers (2023-10-22T20:25:08Z)
- H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions [62.510951695174604]
"Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR) is a probabilistic generative framework that generates hypotheses about how objects articulate given input observations.
We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework.
We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
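Only the five stage names are given here; a skeletal, hypothetical rendering of such a loop, with every stage left as a user-supplied placeholder, might look like this:

```python
def h_saur_loop(observations, propose, simulate, act, update, steps=10):
    """Skeleton of a hypothesize-simulate-act-update-repeat loop for inferring how
    an object articulates. All callables are placeholders: `propose` samples
    articulation hypotheses (e.g. revolute vs. prismatic joints), `simulate` scores
    a hypothesis against the observations in a physics simulator, `act` probes the
    currently best-scoring hypothesis and returns a new observation, and `update`
    folds that observation back into the hypothesis weights."""
    hypotheses = propose(observations)
    weights = {h: 1.0 / len(hypotheses) for h in hypotheses}
    for _ in range(steps):
        scores = {h: simulate(h, observations) for h in hypotheses}
        new_obs = act(max(scores, key=scores.get))   # interact with the object
        observations = observations + [new_obs]
        weights = update(weights, scores, new_obs)
    return max(weights, key=weights.get)             # most plausible articulation
```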
arXiv Detail & Related papers (2022-10-22T18:39:33Z)
- Control-Aware Prediction Objectives for Autonomous Driving [78.19515972466063]
We present control-aware prediction objectives (CAPOs) to evaluate the downstream effect of predictions on control without requiring the planner to be differentiable.
We propose two types of importance weights that weight the predictive likelihood: one using an attention model between agents, and another based on control variation when exchanging predicted trajectories for ground truth trajectories.
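As a rough, hypothetical illustration of such an objective, the sketch below weights each agent's predictive log-likelihood by an importance term; only a simple control-variation weight is shown, as a stand-in rather than the paper's formulation, and the attention-based variant is omitted.

```python
import numpy as np

def control_aware_nll(log_likelihoods, importance_weights):
    """Importance-weighted negative log-likelihood over predicted agent trajectories.
    log_likelihoods[i] is the model's log-likelihood of agent i's ground-truth
    trajectory; importance_weights[i] reflects how much that agent's prediction
    matters for the ego plan (a hypothetical stand-in here)."""
    w = np.asarray(importance_weights, dtype=float)
    w = w / w.sum()                                   # normalize the weights
    return -float(np.dot(w, np.asarray(log_likelihoods, dtype=float)))

def control_variation_weight(plan_with_pred, plan_with_gt):
    """One possible weight: how much the ego controls change when an agent's
    predicted trajectory is replaced by the ground-truth trajectory."""
    return float(np.linalg.norm(np.asarray(plan_with_pred, dtype=float)
                                - np.asarray(plan_with_gt, dtype=float)))

# Example: agent 1 barely affects the ego plan, agent 2 changes it a lot
w = [control_variation_weight([0.1, 0.0], [0.1, 0.05]),
     control_variation_weight([0.1, 0.0], [0.6, 0.4])]
print(control_aware_nll(log_likelihoods=[-1.2, -3.5], importance_weights=w))
```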
arXiv Detail & Related papers (2022-04-28T07:37:21Z)
- Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities [80.37857025201036]
A key challenge for robotic systems is to figure out the behavior of another agent.
Drawing correct inferences is especially challenging when (confounding) factors are not controlled experimentally.
We propose equipping robots with the necessary tools to conduct observational studies on people.
arXiv Detail & Related papers (2022-01-27T22:15:56Z)
- RAIN: Reinforced Hybrid Attention Inference Network for Motion Forecasting [34.54878390622877]
We propose a generic motion forecasting framework with dynamic key information selection and ranking based on a hybrid attention mechanism.
The framework is instantiated to handle multi-agent trajectory prediction and human motion forecasting tasks.
We validate the framework on both synthetic simulations and motion forecasting benchmarks in different domains.
arXiv Detail & Related papers (2021-08-03T06:30:30Z)
- A System for Traded Control Teleoperation of Manipulation Tasks using Intent Prediction from Hand Gestures [20.120263332724438]
This paper presents a teleoperation system that includes robot perception and intent prediction from hand gestures.
The perception module identifies the objects present in the robot workspace, and the intent prediction module infers which object the user likely wants to grasp.
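How the intent is computed is not specified in this summary; a minimal, hypothetical stand-in is to score each detected object by how well it matches the hand's position and pointing direction and return the most likely grasp target:

```python
import numpy as np

def predict_grasp_target(objects, hand_pos, hand_dir):
    """Toy intent predictor (not the paper's method): score each detected object by
    proximity and alignment with the hand's pointing direction, return the best."""
    scores = {}
    hand_pos = np.asarray(hand_pos, dtype=float)
    hand_dir = np.asarray(hand_dir, dtype=float)
    hand_dir = hand_dir / np.linalg.norm(hand_dir)
    for name, pos in objects.items():
        to_obj = np.asarray(pos, dtype=float) - hand_pos
        dist = np.linalg.norm(to_obj)
        alignment = float(np.dot(to_obj / dist, hand_dir))   # cosine with pointing direction
        scores[name] = alignment / (1.0 + dist)              # closer and better aligned -> higher
    return max(scores, key=scores.get), scores

# Example workspace with two candidate objects
target, _ = predict_grasp_target({"mug": [0.4, 0.1, 0.0], "box": [0.1, 0.5, 0.0]},
                                 hand_pos=[0.0, 0.0, 0.0], hand_dir=[1.0, 0.2, 0.0])
print(target)
```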
arXiv Detail & Related papers (2021-07-05T07:37:17Z)
- Adversarial Motion Modelling helps Semi-supervised Hand Pose Estimation [116.07661813869196]
We propose to combine ideas from adversarial training and motion modelling to tap into unlabeled videos.
We show that adversarial motion modelling leads to better properties of the hand pose estimator via semi-supervised training on unlabeled video sequences.
The main advantage of our approach is that we can make use of unpaired videos and joint sequence data both of which are much easier to attain than paired training data.
arXiv Detail & Related papers (2021-06-10T17:50:19Z)
- Self-Supervision by Prediction for Object Discovery in Videos [62.87145010885044]
In this paper, we use the prediction task as self-supervision and build a novel object-centric model for image sequence representation.
Our framework can be trained without the help of any manual annotation or pretrained network.
Initial experiments confirm that the proposed pipeline is a promising step towards object-centric video prediction.
arXiv Detail & Related papers (2021-03-09T19:14:33Z)
- Goal-Conditioned End-to-End Visuomotor Control for Versatile Skill Primitives [89.34229413345541]
We propose a conditioning scheme which avoids pitfalls by learning the controller and its conditioning in an end-to-end manner.
Our model predicts complex action sequences based directly on a dynamic image representation of the robot motion.
We report significant improvements in task success over representative MPC and IL baselines.
arXiv Detail & Related papers (2020-03-19T15:04:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (or any of the information it contains) and is not responsible for any consequences of its use.