KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale
- URL: http://arxiv.org/abs/2409.03439v1
- Date: Thu, 5 Sep 2024 11:42:08 GMT
- Title: KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale
- Authors: Wei Gao, Jingqiang Wang, Xinv Zhu, Jun Zhong, Yue Shen, Youshuang Ding,
- Abstract summary: We would like industrial robots to handle unstructured environments with cameras and perception pipelines.
Online behavior planning is required for these perception-guided industrial applications.
Our DSL is mainly used by machine operators without coding experience in traditional programming languages.
- Score: 6.804432396982314
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We would like industrial robots to handle unstructured environments with cameras and perception pipelines. In contrast to traditional industrial robots that replay offline-crafted trajectories, online behavior planning is required for these perception-guided industrial applications. Aside from perception and planning algorithms, deploying perception-guided manipulators also requires substantial effort in integration. One approach is writing scripts in a traditional language (such as Python) to construct the planning problem and perform integration with other algorithmic modules & external devices. While scripting in Python is feasible for a handful of robots and applications, deploying perception-guided manipulation at scale (e.g., more than 10000 robot workstations in over 2000 customer sites) becomes intractable. To resolve this challenge, we propose a Domain-Specific Language (DSL) for perception-guided manipulation applications. To scale up the deployment,our DSL provides: 1) an easily accessible interface to construct & solve a sub-class of Task and Motion Planning (TAMP) problems that are important in practical applications; and 2) a mechanism to implement flexible control flow to perform integration and address customized requirements of distinct industrial application. Combined with an intuitive graphical programming frontend, our DSL is mainly used by machine operators without coding experience in traditional programming languages. Within hours of training, operators are capable of orchestrating interesting sophisticated manipulation behaviors with our DSL. Extensive practical deployments demonstrate the efficacy of our method.
Related papers
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate our model in terms of its ability to perform tasks in zero shot after pre-training, follow language instructions from people, and its ability to acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z) - Grounding Robot Policies with Visuomotor Language Guidance [15.774237279917594]
We propose an agent-based framework for grounding robot policies to the current context.
The proposed framework is composed of a set of conversational agents designed for specific roles.
We demonstrate that our approach can effectively guide manipulation policies to achieve significantly higher success rates.
arXiv Detail & Related papers (2024-10-09T02:00:37Z) - Octo: An Open-Source Generalist Robot Policy [88.14295917143188]
We introduce Octo, a large transformer-based policy trained on 800k trajectories from the Open X-Embodiment dataset.
It can be effectively finetuned to robot setups with new sensory inputs and action spaces within a few hours on standard consumer GPU.
We also perform detailed ablations of design decisions for the Octo model, from architecture to training data, to guide future research on building generalist robot models.
arXiv Detail & Related papers (2024-05-20T17:57:01Z) - RoboScript: Code Generation for Free-Form Manipulation Tasks across Real
and Simulation [77.41969287400977]
This paper presents textbfRobotScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for a code generation benchmark for robot manipulation tasks in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z) - LPAC: Learnable Perception-Action-Communication Loops with Applications
to Coverage Control [80.86089324742024]
We propose a learnable Perception-Action-Communication (LPAC) architecture for the problem.
CNN processes localized perception; a graph neural network (GNN) facilitates robot communications.
Evaluations show that the LPAC models outperform standard decentralized and centralized coverage control algorithms.
arXiv Detail & Related papers (2024-01-10T00:08:00Z) - Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions
with Large Language Model [63.66204449776262]
Instruct2Act is a framework that maps multi-modal instructions to sequential actions for robotic manipulation tasks.
Our approach is adjustable and flexible in accommodating various instruction modalities and input types.
Our zero-shot method outperformed many state-of-the-art learning-based policies in several tasks.
arXiv Detail & Related papers (2023-05-18T17:59:49Z) - ProgPrompt: Generating Situated Robot Task Plans using Large Language
Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning.
We present a programmatic LLM prompt structure that enables plan generation functional across situated environments.
arXiv Detail & Related papers (2022-09-22T20:29:49Z) - Towards Plug'n Play Task-Level Autonomy for Robotics Using POMDPs and
Generative Models [0.0]
We describe an approach for integrating robot skills into a working autonomous robot controller that schedules its skills to achieve a specified task.
Our Generative Skill Documentation Language (GSDL) makes code documentation compact and more expressive.
An abstraction mapping (AM) bridges the gap between low-level robot code and the abstract AI planning model.
arXiv Detail & Related papers (2022-07-20T07:27:47Z) - Manipulation of Articulated Objects using Dual-arm Robots via Answer Set
Programming [10.316694915810947]
The manipulation of articulated objects is of primary importance in Robotics, and can be considered as one of the most complex manipulation tasks.
Traditionally, this problem has been tackled by developing ad-hoc approaches, which lack flexibility and portability.
We present a framework based on Answer Set Programming (ASP) for the automated manipulation of articulated objects in a robot control architecture.
arXiv Detail & Related papers (2020-10-02T18:50:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.