Related papers: KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale

URL: http://arxiv.org/abs/2409.03439v1
Date: Thu, 5 Sep 2024 11:42:08 GMT
Title: KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale
Authors: Wei Gao, Jingqiang Wang, Xinv Zhu, Jun Zhong, Yue Shen, Youshuang Ding
Abstract summary: We would like industrial robots to handle unstructured environments with cameras and perception pipelines. Online behavior planning is required for these perception-guided industrial applications. Our DSL is mainly used by machine operators without coding experience in traditional programming languages.
Score: 6.804432396982314
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We would like industrial robots to handle unstructured environments with cameras and perception pipelines. In contrast to traditional industrial robots that replay offline-crafted trajectories, online behavior planning is required for these perception-guided industrial applications. Aside from perception and planning algorithms, deploying perception-guided manipulators also requires substantial effort in integration. One approach is writing scripts in a traditional language (such as Python) to construct the planning problem and perform integration with other algorithmic modules & external devices. While scripting in Python is feasible for a handful of robots and applications, deploying perception-guided manipulation at scale (e.g., more than 10000 robot workstations in over 2000 customer sites) becomes intractable. To resolve this challenge, we propose a Domain-Specific Language (DSL) for perception-guided manipulation applications. To scale up the deployment, our DSL provides: 1) an easily accessible interface to construct & solve a sub-class of Task and Motion Planning (TAMP) problems that are important in practical applications; and 2) a mechanism to implement flexible control flow to perform integration and address customized requirements of distinct industrial applications. Combined with an intuitive graphical programming frontend, our DSL is mainly used by machine operators without coding experience in traditional programming languages. Within hours of training, operators are capable of orchestrating interesting sophisticated manipulation behaviors with our DSL. Extensive practical deployments demonstrate the efficacy of our method.

Related papers

Trajectory Adaptation using Large Language Models [0.8704964543257245]
Adapting robot trajectories based on human instructions as per new situations is essential for achieving more intuitive and scalable human-robot interactions. This work proposes a flexible language-based framework to adapt generic robotic trajectories produced by off-the-shelf motion planners. We utilize pre-trained LLMs to adapt trajectory waypoints by generating code as a policy for dense robot manipulation.
arXiv Detail & Related papers (2025-04-17T08:48:23Z)
AI-based Framework for Robust Model-Based Connector Mating in Robotic Wire Harness Installation [1.543743835720528]
We design a novel AI-based framework that automates cable connector mating by integrating force control with deep visuotactile learning. Our system optimizes search-and-insertion strategies using first-order optimization over a multimodal transformer architecture trained on visual, tactile, and proprioceptive data.
arXiv Detail & Related papers (2025-03-12T13:59:26Z)
$π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge. We evaluate our model in terms of its ability to perform tasks in zero shot after pre-training, follow language instructions from people, and its ability to acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z)
RAMPA: Robotic Augmented Reality for Machine Programming and Automation [4.963604518596734]
This paper introduces Robotic Augmented Reality for Machine Programming (RAMPA), a system that utilizes the capabilities of state-of-the-art and commercially available AR headsets, e.g., Meta Quest 3. Our approach enables in-situ data recording, visualization, and fine-tuning of skill demonstrations directly within the user's physical environment.
arXiv Detail & Related papers (2024-10-17T10:21:28Z)
Octo: An Open-Source Generalist Robot Policy [88.14295917143188]
We introduce Octo, a large transformer-based policy trained on 800k trajectories from the Open X-Embodiment dataset. It can be effectively finetuned to robot setups with new sensory inputs and action spaces within a few hours on standard consumer GPUs. We also perform detailed ablations of design decisions for the Octo model, from architecture to training data, to guide future research on building generalist robot models.
arXiv Detail & Related papers (2024-05-20T17:57:01Z)
RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation [77.41969287400977]
This paper presents RoboScript, a platform for a deployable robot manipulation pipeline powered by code generation. We also present a code generation benchmark for robot manipulation tasks in free-form natural language. We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z)
LPAC: Learnable Perception-Action-Communication Loops with Applications to Coverage Control [80.86089324742024]
We propose a learnable Perception-Action-Communication (LPAC) architecture for the coverage control problem. A convolutional neural network (CNN) processes localized perception; a graph neural network (GNN) facilitates robot communications. Evaluations show that the LPAC models outperform standard decentralized and centralized coverage control algorithms.
arXiv Detail & Related papers (2024-01-10T00:08:00Z)
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model [63.66204449776262]
Instruct2Act is a framework that maps multi-modal instructions to sequential actions for robotic manipulation tasks. Our approach is adjustable and flexible in accommodating various instruction modalities and input types. Our zero-shot method outperformed many state-of-the-art learning-based policies in several tasks.
arXiv Detail & Related papers (2023-05-18T17:59:49Z)
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning. We present a programmatic LLM prompt structure that enables plan generation that is functional across situated environments.
arXiv Detail & Related papers (2022-09-22T20:29:49Z)
Towards Plug'n Play Task-Level Autonomy for Robotics Using POMDPs and Generative Models [0.0]
We describe an approach for integrating robot skills into a working autonomous robot controller that schedules its skills to achieve a specified task. Our Generative Skill Documentation Language (GSDL) makes code documentation compact and more expressive. An abstraction mapping (AM) bridges the gap between low-level robot code and the abstract AI planning model.
arXiv Detail & Related papers (2022-07-20T07:27:47Z)
Manipulation of Articulated Objects using Dual-arm Robots via Answer Set Programming [10.316694915810947]
The manipulation of articulated objects is of primary importance in Robotics, and can be considered as one of the most complex manipulation tasks. Traditionally, this problem has been tackled by developing ad-hoc approaches, which lack flexibility and portability. We present a framework based on Answer Set Programming (ASP) for the automated manipulation of articulated objects in a robot control architecture.
arXiv Detail & Related papers (2020-10-02T18:50:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.