Towards Plug'n Play Task-Level Autonomy for Robotics Using POMDPs and
Generative Models
- URL: http://arxiv.org/abs/2207.09713v1
- Date: Wed, 20 Jul 2022 07:27:47 GMT
- Title: Towards Plug'n Play Task-Level Autonomy for Robotics Using POMDPs and
Generative Models
- Authors: Or Wertheim (Ben-Gurion University of the Negev), Dan R. Suissa
(Ben-Gurion University of the Negev), Ronen I. Brafman (Ben-Gurion University
of the Negev)
- Abstract summary: We describe an approach for integrating robot skills into a working autonomous robot controller that schedules its skills to achieve a specified task.
Our Generative Skill Documentation Language (GSDL) makes code documentation compact and more expressive.
An abstraction mapping (AM) bridges the gap between low-level robot code and the abstract AI planning model.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To enable robots to achieve high-level objectives, engineers typically
write scripts that apply existing specialized skills, such as navigation, object
detection, and manipulation. Writing good scripts is challenging: they must
intelligently balance the inherent stochasticity of a physical robot's actions
and sensors against the limited information the robot has. In principle, AI
planning can address this challenge and generate good behavior policies
automatically, but this requires passing three hurdles.
First, the AI must understand each skill's impact on the world. Second, we must
bridge the gap between the more abstract level at which we understand what a
skill does and the low-level state variables used within its code. Third, much
integration effort is required to tie together all components. We describe an
approach for integrating robot skills into a working autonomous robot
controller that schedules its skills to achieve a specified task; this approach
carries four key advantages. 1) Our Generative Skill Documentation Language (GSDL)
makes code documentation simpler, more compact, and more expressive using ideas from
probabilistic programming languages. 2) An expressive abstraction mapping (AM)
bridges the gap between low-level robot code and the abstract AI planning
model. 3) Any properly documented skill can be used by the controller without
any additional programming effort, providing a Plug'n Play experience. 4) A
POMDP solver schedules skill execution while properly balancing partial
observability, stochastic behavior, and noisy sensing.
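The listing contains no code, but the four advantages describe a concrete pipeline. Below is a minimal, self-contained Python sketch of that pipeline under stated assumptions: a skill effect documented as a generative model (the GSDL idea), an abstraction mapping (AM) from a continuous pose to a planner-level fluent, and the Bayesian belief update a POMDP solver performs under noisy sensing. All function names, state variables, and probabilities here are hypothetical illustrations, not the authors' GSDL syntax, AM format, or API.

```python
import random

# 1) GSDL-style documentation (hypothetical): the skill's effect on the
#    abstract state is written as a sampler, in the spirit of probabilistic
#    programming. The 0.9 success rate is illustrative, not from the paper.
def navigate_effect(state):
    """Generative model of a 'navigate' skill that succeeds with p = 0.9."""
    next_state = dict(state)
    if random.random() < 0.9:
        next_state["at_goal"] = True
    return next_state

# 2) Abstraction mapping (AM, hypothetical): bridges a low-level robot
#    variable (a continuous pose) and the boolean fluent the planner uses.
def abstraction_map(raw_pose, goal_pose, tol=0.1):
    dist = sum((a - b) ** 2 for a, b in zip(raw_pose, goal_pose)) ** 0.5
    return {"at_goal": dist < tol}

# 3) Partial observability: a noisy sensor model and the Bayes belief update
#    a POMDP solver maintains over the hidden fact 'object_present'.
P_DETECT_IF_PRESENT = 0.90  # true-positive rate (illustrative)
P_DETECT_IF_ABSENT = 0.05   # false-positive rate (illustrative)

def update_belief(belief, detected):
    p_present = P_DETECT_IF_PRESENT if detected else 1 - P_DETECT_IF_PRESENT
    p_absent = P_DETECT_IF_ABSENT if detected else 1 - P_DETECT_IF_ABSENT
    num = p_present * belief
    return num / (num + p_absent * (1.0 - belief))

if __name__ == "__main__":
    # AM turns the raw pose into the planner-level state ...
    state = abstraction_map(raw_pose=(1.02, 0.95), goal_pose=(1.0, 1.0))
    # ... and the documented effect model simulates one skill execution.
    state = navigate_effect(state)
    print("planner state after simulated navigate:", state)

    # Belief over the hidden variable after a sequence of noisy detections.
    belief = 0.5  # uniform prior on 'object_present'
    for detected in [True, True, False]:
        belief = update_belief(belief, detected)
        print(f"detected={detected} -> P(object_present) = {belief:.3f}")
```

One plausible reading of how points 1 and 4 fit together: because each skill is documented as a sampler, a sampling-based POMDP solver (for example, a POMCP-style Monte-Carlo planner) can use the documentation directly as a black-box simulator when choosing which skill to execute next.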
Related papers
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate the model on its ability to perform tasks zero-shot after pre-training, to follow language instructions from people, and to acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z)
- Robotic Control via Embodied Chain-of-Thought Reasoning
A key limitation of learned robot control policies is their inability to generalize outside their training data.
Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability.
We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z)
- Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks
Plan-Seq-Learn (PSL) is a modular approach that uses motion planning to bridge the gap between abstract language and learned low-level control.
PSL achieves success rates of over 85%, outperforming language-based, classical, and end-to-end approaches.
arXiv Detail & Related papers (2024-05-02T17:59:31Z)
- RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
We propose a tree-structured multimodal code generation framework for generalized robotic behavior synthesis, termed RoboCodeX.
RoboCodeX decomposes high-level human instructions into multiple object-centric manipulation units that incorporate physical preferences such as affordances and safety constraints.
To further enhance the capability to map conceptual and perceptual understanding into control commands, a specialized multimodal reasoning dataset is collected for pre-training and an iterative self-updating methodology is introduced for supervised fine-tuning.
arXiv Detail & Related papers (2024-02-25T15:31:43Z)
- RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation
This paper presents RobotScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for code generation for robot manipulation tasks specified in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z)
- Using Knowledge Representation and Task Planning for Robot-agnostic Skills on the Example of Contact-Rich Wiping Tasks
We show how a single robot skill that utilizes knowledge representation, task planning, and automatic selection of skill implementations can be executed in different contexts.
We demonstrate how the skill-based control platform enables this with contact-rich wiping tasks on different robot systems.
arXiv Detail & Related papers (2023-08-27T21:17:32Z)
- Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?
Task specification is critical for the engagement of non-expert end-users and the adoption of personalized robots.
A widely studied approach to task specification is through goals, using either compact state vectors or goal images from the same robot scene.
In this work, we explore alternate and more general forms of goal specification that are expected to be easier for humans to specify and use.
arXiv Detail & Related papers (2022-04-23T19:39:49Z)
- Learning and Sequencing of Object-Centric Manipulation Skills for Industrial Tasks
We propose a rapid robot skill-sequencing algorithm, where the skills are encoded by object-centric hidden semi-Markov models.
The learned skill models can encode multimodal (temporal and spatial) trajectory distributions.
We demonstrate this approach on a 7 DoF robot arm for industrial assembly tasks.
arXiv Detail & Related papers (2020-08-24T14:20:05Z)
- Enabling human-like task identification from natural conversation
We provide a non-trivial method for combining an NLP engine and a planner so that a robot can identify a task and all of its relevant parameters, and generate an accurate plan for it.
This work makes a significant stride towards enabling a human-like task understanding capability in a robot.
arXiv Detail & Related papers (2020-08-23T17:19:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.