Learning from Physical Human Feedback: An Object-Centric One-Shot
Adaptation Method
- URL: http://arxiv.org/abs/2203.04951v2
- Date: Fri, 2 Jun 2023 09:37:20 GMT
- Authors: Alvin Shek, Bo Ying Su, Rui Chen and Changliu Liu
- Abstract summary: Object Preference Adaptation (OPA) is composed of two key stages: 1) pre-training a base policy to produce a wide variety of behaviors, and 2) online-updating according to human feedback.
Our adaptation occurs online, requires only one human intervention (one-shot), and produces new behaviors never seen during training.
Trained on cheap synthetic data instead of expensive human demonstrations, our policy correctly adapts to human perturbations on realistic tasks on a physical 7DOF robot.
- Score: 5.906020149230538
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For robots to be effectively deployed in novel environments and tasks, they
must be able to understand the feedback expressed by humans during
intervention. This can either correct undesirable behavior or indicate
additional preferences. Existing methods either require repeated episodes of
interactions or assume prior known reward features, which is data-inefficient
and can hardly transfer to new tasks. We relax these assumptions by describing
human tasks in terms of object-centric sub-tasks and interpreting physical
interventions in relation to specific objects. Our method, Object Preference
Adaptation (OPA), is composed of two key stages: 1) pre-training a base policy
to produce a wide variety of behaviors, and 2) online-updating according to
human feedback. The key to our fast, yet simple adaptation is that general
interaction dynamics between agents and objects are fixed, and only
object-specific preferences are updated. Our adaptation occurs online, requires
only one human intervention (one-shot), and produces new behaviors never seen
during training. Trained on cheap synthetic data instead of expensive human
demonstrations, our policy correctly adapts to human perturbations on realistic
tasks on a physical 7DOF robot. Videos, code, and supplementary material are
provided.
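The idea of keeping the interaction rule fixed and updating only object-specific preferences can be illustrated with a toy sketch. This is not the OPA architecture, which the abstract does not describe in detail; it is a minimal 2D example under assumed conditions. The names `policy`, `adapt`, and `prefs`, the inverse-distance interaction rule, and the squared-error loss are all hypothetical choices made for illustration.

```python
def policy(agent, goal, objects, prefs):
    """Fixed interaction rule (assumed for illustration): head toward the
    goal, plus one term per object scaled by that object's preference.
    A positive preference pushes the agent away from the object."""
    vx, vy = goal[0] - agent[0], goal[1] - agent[1]
    for (ox, oy), w in zip(objects, prefs):
        dx, dy = agent[0] - ox, agent[1] - oy
        d2 = dx * dx + dy * dy + 1e-6  # avoid division by zero
        vx += w * dx / d2
        vy += w * dy / d2
    return vx, vy


def adapt(agent, goal, objects, prefs, human_dir, lr=0.1, steps=200):
    """Update only the per-object preferences by gradient descent so that
    the policy output matches the direction of a single human correction.
    The interaction rule inside `policy` is never modified."""
    prefs = list(prefs)
    for _ in range(steps):
        vx, vy = policy(agent, goal, objects, prefs)
        ex, ey = vx - human_dir[0], vy - human_dir[1]  # output error
        for i, (ox, oy) in enumerate(objects):
            dx, dy = agent[0] - ox, agent[1] - oy
            d2 = dx * dx + dy * dy + 1e-6
            # Gradient of the squared error with respect to prefs[i]
            grad = 2.0 * (ex * dx / d2 + ey * dy / d2)
            prefs[i] -= lr * grad
    return prefs


if __name__ == "__main__":
    agent, goal = (0.0, 0.0), (1.0, 0.0)
    objects = [(0.0, 1.0), (-1.0, 0.0)]  # one object above, one behind
    # One correction: the human pushes the agent down, away from object 0.
    new_prefs = adapt(agent, goal, objects, [0.0, 0.0], human_dir=(1.0, -0.5))
    print(new_prefs)
```

In this setup a single correction changes only the preference of the object the push relates to, while the preference of the second object stays at zero. The real method learns its interaction dynamics during pre-training rather than using a hand-written rule like the one assumed here.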
Related papers
- Promptable Behaviors: Personalizing Multi-Objective Rewards from Human
Preferences [53.353022588751585]
We present Promptable Behaviors, a novel framework that facilitates efficient personalization of robotic agents to diverse human preferences.
We introduce three distinct methods to infer human preferences by leveraging different types of interactions.
We evaluate the proposed method in personalized object-goal navigation and flee navigation tasks in ProcTHOR and RoboTHOR.
arXiv Detail & Related papers (2023-12-14T21:00:56Z)
- ThinkBot: Embodied Instruction Following with Thought Chain Reasoning [66.09880459084901]
Embodied Instruction Following (EIF) requires agents to complete human instructions by interacting with objects in complicated surrounding environments.
We propose ThinkBot that reasons the thought chain in human instruction to recover the missing action descriptions.
Our ThinkBot outperforms the state-of-the-art EIF methods by a sizable margin in both success rate and execution efficiency.
arXiv Detail & Related papers (2023-12-12T08:30:09Z)
- Real-time Addressee Estimation: Deployment of a Deep-Learning Model on
the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
arXiv Detail & Related papers (2023-11-09T13:01:21Z)
- Physically Plausible Full-Body Hand-Object Interaction Synthesis [32.83908152822006]
We propose a physics-based method for synthesizing dexterous hand-object interactions in a full-body setting.
Existing methods often focus on isolated segments of the interaction process and rely on data-driven techniques that may result in artifacts.
arXiv Detail & Related papers (2023-09-14T17:55:18Z)
- InterDiff: Generating 3D Human-Object Interactions with Physics-Informed
Diffusion [29.25063155767897]
This paper addresses a novel task of anticipating 3D human-object interactions (HOIs).
Our task is significantly more challenging, as it requires modeling dynamic objects with various shapes, capturing whole-body motion, and ensuring physically valid interactions.
Experiments on multiple human-object interaction datasets demonstrate the effectiveness of our method for this task, capable of producing realistic, vivid, and remarkably long-term 3D HOI predictions.
arXiv Detail & Related papers (2023-08-31T17:59:08Z)
- HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly.
Considering that human features are more contributive to interaction, we propose a Human-Guide Linking method to make sure the interaction decoder focuses on the human-centric regions.
Our proposed method achieves competitive performance on both the V-COCO and the HICO-Det datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z)
- Task-Oriented Human-Object Interactions Generation with Implicit Neural
Representations [61.659439423703155]
TOHO: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations.
Our method generates continuous motions that are parameterized only by the temporal coordinate.
This work takes a step further toward general human-scene interaction simulation.
arXiv Detail & Related papers (2023-03-23T09:31:56Z)
- Synthesizing Physical Character-Scene Interactions [64.26035523518846]
It is necessary to synthesize such interactions between virtual characters and their surroundings.
We present a system that uses adversarial imitation learning and reinforcement learning to train physically-simulated characters.
Our approach takes physics-based character motion generation a step closer to broad applicability.
arXiv Detail & Related papers (2023-02-02T05:21:32Z)
- Improving Personality Consistency in Conversation by Persona Extending [22.124187337032946]
We propose a novel retrieval-to-prediction paradigm consisting of two subcomponents, namely, Persona Retrieval Model (PRM) and Posterior-scored Transformer (PS-Transformer).
Our proposed model yields considerable improvements in both automatic metrics and human evaluations.
arXiv Detail & Related papers (2022-08-23T09:00:58Z)