TidyBot: Personalized Robot Assistance with Large Language Models
- URL: http://arxiv.org/abs/2305.05658v2
- Date: Wed, 11 Oct 2023 17:59:44 GMT
- Title: TidyBot: Personalized Robot Assistance with Large Language Models
- Authors: Jimmy Wu, Rika Antonova, Adam Kan, Marion Lepert, Andy Zeng, Shuran
Song, Jeannette Bohg, Szymon Rusinkiewicz, Thomas Funkhouser
- Abstract summary: A key challenge is determining the proper place to put each object.
One person may prefer storing shirts in the drawer, while another may prefer them on the shelf.
We show that robots can combine language-based planning and perception with the few-shot summarization capabilities of large language models.
- Score: 46.629932362863386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For a robot to personalize physical assistance effectively, it must learn
user preferences that can be generally reapplied to future scenarios. In this
work, we investigate personalization of household cleanup with robots that can
tidy up rooms by picking up objects and putting them away. A key challenge is
determining the proper place to put each object, as people's preferences can
vary greatly depending on personal taste or cultural background. For instance,
one person may prefer storing shirts in the drawer, while another may prefer
them on the shelf. We aim to build systems that can learn such preferences from
just a handful of examples via prior interactions with a particular person. We
show that robots can combine language-based planning and perception with the
few-shot summarization capabilities of large language models (LLMs) to infer
generalized user preferences that are broadly applicable to future
interactions. This approach enables fast adaptation and achieves 91.2% accuracy
on unseen objects in our benchmark dataset. We also demonstrate our approach on
a real-world mobile manipulator called TidyBot, which successfully puts away
85.0% of objects in real-world test scenarios.
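The approach described above can be sketched in a few lines: prompt an LLM to summarize a handful of observed placements into general rules, then apply those rules to unseen objects. The sketch below is a minimal illustration, not the paper's released code; `llm` is a placeholder for any text-completion endpoint, and the prompt wording, objects, and receptacles are illustrative assumptions.

```python
# Minimal sketch of TidyBot-style preference generalization.
# `llm` is a stand-in for any text-completion endpoint (an assumption,
# not the paper's API); prompts, objects, and receptacles are illustrative.

def llm(prompt: str) -> str:
    """Placeholder: call your LLM of choice and return its completion."""
    raise NotImplementedError

EXAMPLE_PLACEMENTS = [
    ("yellow shirt", "drawer"),
    ("dark green shirt", "drawer"),
    ("white socks", "drawer"),
    ("black sneakers", "shoe rack"),
]
RECEPTACLES = ["drawer", "shelf", "shoe rack", "recycling bin"]

def summarize_preferences(placements) -> str:
    """Compress a few observed placements into general rules (few-shot summarization)."""
    examples = "\n".join(f"Put the {obj} in the {rec}." for obj, rec in placements)
    return llm(
        "A person tidied up their home as follows:\n"
        f"{examples}\n"
        "Summarize this person's preferences as general rules:"
    )

def choose_receptacle(rules: str, new_object: str) -> str:
    """Apply the summarized rules to decide where an unseen object belongs."""
    return llm(
        f"Rules: {rules}\n"
        f"Receptacles: {', '.join(RECEPTACLES)}\n"
        f"The robot picked up a {new_object}. "
        "Name the single receptacle it should go in:"
    ).strip()

# Example usage (expected behavior under sensible summarized rules):
#   rules = summarize_preferences(EXAMPLE_PLACEMENTS)
#   choose_receptacle(rules, "blue jeans")  # -> "drawer"
```

In the paper, the same few-shot pattern also selects a manipulation primitive for each object (e.g., pick-and-place vs. pick-and-toss).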
Related papers
- DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement [53.86523017756224]
We present DegustaBot, an algorithm for visual preference learning that solves household multi-object rearrangement tasks according to personal preference.
We collect a large dataset of naturalistic personal preferences in a simulated table-setting task.
We find that 50% of our model's predictions are likely to be found acceptable by at least 20% of people.
arXiv Detail & Related papers (2024-07-11T21:28:02Z)
- Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans [58.27029676638521]
We show how passive human videos can serve as a rich source of data for learning generalist robot manipulation.
We learn a human plan predictor that, given a current image of a scene and a goal image, predicts the future hand and object configurations.
We show that our learned system can perform over 16 manipulation skills that generalize to 40 objects.
arXiv Detail & Related papers (2023-12-01T18:54:12Z)
- Open-World Object Manipulation using Pre-trained Vision-Language Models [72.87306011500084]
For robots to follow instructions from people, they must be able to connect the rich semantic information in human vocabulary to their sensory observations and actions.
We develop a simple approach, MOO, that leverages a pre-trained vision-language model to extract object-identifying information from the language instruction and image.
In a variety of experiments on a real mobile manipulator, we find that MOO generalizes zero-shot to a wide range of novel object categories and environments (see the open-vocabulary detection sketch after this list).
arXiv Detail & Related papers (2023-03-02T01:55:10Z)
- Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation [80.29069988090912]
We study the problem of learning a range of vision-based manipulation tasks from a large offline dataset of robot interaction.
We propose to leverage offline robot datasets with crowd-sourced natural language labels.
We find that our approach outperforms both goal-image specifications and language conditioned imitation techniques by more than 25%.
arXiv Detail & Related papers (2021-09-02T17:42:13Z)
- Composing Pick-and-Place Tasks By Grounding Language [41.075844857146805]
We present a robot system that follows unconstrained language instructions to pick and place arbitrary objects.
Our approach infers objects and their relationships from input images and language expressions.
Results obtained using a real-world PR2 robot demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2021-02-16T11:29:09Z)
- Learning User-Preferred Mappings for Intuitive Robot Control [28.183430654834307]
We propose a method for learning the human's preferred or preconceived mapping from a few robot queries.
We make this approach data-efficient by recognizing that human mappings have strong priors.
Our simulated and experimental results suggest that learning the mapping between inputs and robot actions improves objective and subjective performance.
arXiv Detail & Related papers (2020-07-22T18:54:35Z)
- Human Grasp Classification for Reactive Human-to-Robot Handovers [50.91803283297065]
We propose an approach for human-to-robot handovers in which the robot meets the human halfway.
We collect a human grasp dataset which covers typical ways of holding objects with various hand shapes and poses.
We present a planning and execution approach that takes the object from the human hand according to the detected grasp and hand position.
arXiv Detail & Related papers (2020-03-12T19:58:03Z)
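For the Open-World Object Manipulation (MOO) entry above, the object-identifying information comes from an open-vocabulary detector. Below is a minimal sketch of that localization step, assuming the Hugging Face transformers OWL-ViT interface; the model checkpoint and score threshold are illustrative choices, and conditioning a robot policy on the result is out of scope here.

```python
# Minimal open-vocabulary localization sketch in the spirit of MOO, using
# OWL-ViT via Hugging Face transformers. Model choice and threshold are
# illustrative assumptions, not values from the paper.
import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

def locate(image: Image.Image, description: str):
    """Return the highest-scoring box for a natural-language object description."""
    inputs = processor(text=[[description]], images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
    detections = processor.post_process_object_detection(
        outputs=outputs, target_sizes=target_sizes, threshold=0.1
    )[0]
    if detections["scores"].numel() == 0:
        return None  # nothing matched the description above the threshold
    best = detections["scores"].argmax()
    return detections["boxes"][best].tolist()  # [xmin, ymin, xmax, ymax]

# Example usage:
#   box = locate(Image.open("scene.jpg"), "a pink stuffed whale")
```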
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and accepts no responsibility for any consequences of its use.