Towards Plug'n Play Task-Level Autonomy for Robotics Using POMDPs and
Generative Models
- URL: http://arxiv.org/abs/2207.09713v1
- Date: Wed, 20 Jul 2022 07:27:47 GMT
- Title: Towards Plug'n Play Task-Level Autonomy for Robotics Using POMDPs and
Generative Models
- Authors: Or Wertheim (Ben-Gurion University of the Negev), Dan R. Suissa
(Ben-Gurion University of the Negev), Ronen I. Brafman (Ben-Gurion University
of the Negev)
- Abstract summary: We describe an approach for integrating robot skills into a working autonomous robot controller that schedules its skills to achieve a specified task.
Our Generative Skill Documentation Language (GSDL) makes code documentation compact and more expressive.
An abstraction mapping (AM) bridges the gap between low-level robot code and the abstract AI planning model.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To enable robots to achieve high-level objectives, engineers typically
write scripts that apply existing specialized skills, such as navigation, object
detection, and manipulation. Writing good scripts is challenging: they must
intelligently balance the inherent stochasticity of a physical robot's actions
and sensors against the limited information the robot has. In principle, AI
planning can address this challenge and generate good behavior policies
automatically, but this requires passing three hurdles.
First, the AI must understand each skill's impact on the world. Second, we must
bridge the gap between the more abstract level at which we understand what a
skill does and the low-level state variables used within its code. Third, much
integration effort is required to tie together all components. We describe an
approach for integrating robot skills into a working autonomous robot
controller that schedules its skills to achieve a specified task; this approach
carries four key advantages. 1) Our Generative Skill Documentation Language (GSDL)
makes code documentation simpler, more compact, and more expressive using ideas from
probabilistic programming languages. 2) An expressive abstraction mapping (AM)
bridges the gap between low-level robot code and the abstract AI planning
model. 3) Any properly documented skill can be used by the controller without
any additional programming effort, providing a Plug'n Play experience. 4) A
POMDP solver schedules skill execution while properly balancing partial
observability, stochastic behavior, and noisy sensing.
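The listing contains no code, but the four advantages describe a concrete pipeline. Below is a minimal, self-contained Python sketch of that pipeline under stated assumptions: a skill effect documented as a generative model (the GSDL idea), an abstraction mapping (AM) from a continuous pose to a planner-level fluent, and the Bayesian belief update a POMDP solver performs under noisy sensing. All function names, state variables, and probabilities here are hypothetical illustrations, not the authors' GSDL syntax, AM format, or API.

```python
import random

# 1) GSDL-style documentation (hypothetical): the skill's effect on the
#    abstract state is written as a sampler, in the spirit of probabilistic
#    programming. The 0.9 success rate is illustrative, not from the paper.
def navigate_effect(state):
    """Generative model of a 'navigate' skill that succeeds with p = 0.9."""
    next_state = dict(state)
    if random.random() < 0.9:
        next_state["at_goal"] = True
    return next_state

# 2) Abstraction mapping (AM, hypothetical): bridges a low-level robot
#    variable (a continuous pose) and the boolean fluent the planner uses.
def abstraction_map(raw_pose, goal_pose, tol=0.1):
    dist = sum((a - b) ** 2 for a, b in zip(raw_pose, goal_pose)) ** 0.5
    return {"at_goal": dist < tol}

# 3) Partial observability: a noisy sensor model and the Bayes belief update
#    a POMDP solver maintains over the hidden fact 'object_present'.
P_DETECT_IF_PRESENT = 0.90  # true-positive rate (illustrative)
P_DETECT_IF_ABSENT = 0.05   # false-positive rate (illustrative)

def update_belief(belief, detected):
    p_present = P_DETECT_IF_PRESENT if detected else 1 - P_DETECT_IF_PRESENT
    p_absent = P_DETECT_IF_ABSENT if detected else 1 - P_DETECT_IF_ABSENT
    num = p_present * belief
    return num / (num + p_absent * (1.0 - belief))

if __name__ == "__main__":
    # AM turns the raw pose into the planner-level state ...
    state = abstraction_map(raw_pose=(1.02, 0.95), goal_pose=(1.0, 1.0))
    # ... and the documented effect model simulates one skill execution.
    state = navigate_effect(state)
    print("planner state after simulated navigate:", state)

    # Belief over the hidden variable after a sequence of noisy detections.
    belief = 0.5  # uniform prior on 'object_present'
    for detected in [True, True, False]:
        belief = update_belief(belief, detected)
        print(f"detected={detected} -> P(object_present) = {belief:.3f}")
```

One plausible reading of how points 1 and 4 fit together: because each skill is documented as a sampler, a sampling-based POMDP solver (for example, a POMCP-style Monte-Carlo planner) can use the documentation directly as a black-box simulator when choosing which skill to execute next.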
Related papers
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate the model on its ability to perform tasks zero-shot after pre-training, to follow language instructions from people, and to acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z)
- Robotic Control via Embodied Chain-of-Thought Reasoning
A key limitation of learned robot control policies is their inability to generalize outside their training data.
Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability.
We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z)
- Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks
Plan-Seq-Learn (PSL) is a modular approach that uses motion planning to bridge the gap between abstract language and learned low-level control.
PSL achieves success rates of over 85%, outperforming language-based, classical, and end-to-end approaches.
arXiv Detail & Related papers (2024-05-02T17:59:31Z)
- RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
We propose a tree-structured multimodal code generation framework for generalized robotic behavior synthesis, termed RoboCodeX.
RoboCodeX decomposes high-level human instructions into multiple object-centric manipulation units that incorporate physical preferences such as affordances and safety constraints.
To further enhance the capability to map conceptual and perceptual understanding into control commands, a specialized multimodal reasoning dataset is collected for pre-training and an iterative self-updating methodology is introduced for supervised fine-tuning.
arXiv Detail & Related papers (2024-02-25T15:31:43Z)
- RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation
This paper presents RobotScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for code generation for robot manipulation tasks specified in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z)
- Using Knowledge Representation and Task Planning for Robot-agnostic Skills on the Example of Contact-Rich Wiping Tasks
We show how a single robot skill that utilizes knowledge representation, task planning, and automatic selection of skill implementations can be executed in different contexts.
We demonstrate how the skill-based control platform enables this with contact-rich wiping tasks on different robot systems.
arXiv Detail & Related papers (2023-08-27T21:17:32Z)
- Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?
Task specification is critical for the engagement of non-expert end-users and the adoption of personalized robots.
A widely studied approach to task specification is through goals, using either compact state vectors or goal images from the same robot scene.
In this work, we explore alternate and more general forms of goal specification that are expected to be easier for humans to specify and use.
arXiv Detail & Related papers (2022-04-23T19:39:49Z)
- Learning and Sequencing of Object-Centric Manipulation Skills for Industrial Tasks
We propose a rapid robot skill-sequencing algorithm, where the skills are encoded by object-centric hidden semi-Markov models.
The learned skill models can encode multimodal (temporal and spatial) trajectory distributions.
We demonstrate this approach on a 7 DoF robot arm for industrial assembly tasks.
arXiv Detail & Related papers (2020-08-24T14:20:05Z)
- Enabling human-like task identification from natural conversation
We provide a non-trivial method for combining an NLP engine and a planner so that a robot can identify a task and all of its relevant parameters, and generate an accurate plan for it.
This work makes a significant stride towards enabling a human-like task understanding capability in a robot.
arXiv Detail & Related papers (2020-08-23T17:19:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.