MuTT: A Multimodal Trajectory Transformer for Robot Skills
- URL: http://arxiv.org/abs/2407.15660v2
- Date: Thu, 22 Aug 2024 09:12:19 GMT
- Title: MuTT: A Multimodal Trajectory Transformer for Robot Skills
- Authors: Claudius Kienle, Benjamin Alt, Onur Celik, Philipp Becker, Darko Katic, Rainer Jäkel, Gerhard Neumann
- Abstract summary: MuTT is a novel encoder-decoder transformer architecture designed to predict environment-aware executions of robot skills.
We pioneer the fusion of vision and trajectory, introducing a novel trajectory projection.
This approach facilitates the optimization of robot skill parameters for the current environment, without the need for real-world executions.
- Score: 14.84252843639553
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: High-level robot skills represent an increasingly popular paradigm in robot programming. However, configuring the skills' parameters for a specific task remains a manual and time-consuming endeavor. Existing approaches for learning or optimizing these parameters often require numerous real-world executions or do not work in dynamic environments. To address these challenges, we propose MuTT, a novel encoder-decoder transformer architecture designed to predict environment-aware executions of robot skills by integrating vision, trajectory, and robot skill parameters. Notably, we pioneer the fusion of vision and trajectory, introducing a novel trajectory projection. Furthermore, we illustrate MuTT's efficacy as a predictor when combined with a model-based robot skill optimizer. This approach facilitates the optimization of robot skill parameters for the current environment, without the need for real-world executions during optimization. Designed for compatibility with any representation of robot skills, MuTT demonstrates its versatility across three comprehensive experiments, showcasing superior performance across two different skill representations.
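To make the fusion concrete, here is a minimal, illustrative PyTorch sketch, not the authors' implementation: the token layout, layer sizes, and input shapes are all assumptions. It only shows the general pattern of projecting vision patches, trajectory points, and a skill-parameter vector into one token sequence for an encoder-decoder transformer that predicts an environment-aware trajectory.

```python
# Illustrative sketch only -- layer sizes, token layout, and shapes are
# assumptions, not the MuTT paper's implementation.
import torch
import torch.nn as nn

class MultimodalTrajectoryModel(nn.Module):
    def __init__(self, d_model=256, img_patch_dim=768, traj_dim=7, param_dim=16):
        super().__init__()
        # One linear projection per modality into a shared token space.
        self.img_proj = nn.Linear(img_patch_dim, d_model)   # vision patches
        self.traj_proj = nn.Linear(traj_dim, d_model)       # trajectory points
        self.param_proj = nn.Linear(param_dim, d_model)     # skill parameters
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=8, num_encoder_layers=4,
            num_decoder_layers=4, batch_first=True)
        self.head = nn.Linear(d_model, traj_dim)            # predicted execution

    def forward(self, img_patches, planned_traj, skill_params, query_traj):
        # Fuse the three modalities into one encoder token sequence.
        tokens = torch.cat([
            self.img_proj(img_patches),
            self.traj_proj(planned_traj),
            self.param_proj(skill_params).unsqueeze(1),
        ], dim=1)
        out = self.transformer(tokens, self.traj_proj(query_traj))
        return self.head(out)

model = MultimodalTrajectoryModel()
pred = model(torch.randn(2, 196, 768),  # e.g. ViT patch embeddings
             torch.randn(2, 100, 7),    # planned trajectory
             torch.randn(2, 16),        # skill parameter vector
             torch.randn(2, 100, 7))    # decoder queries
print(pred.shape)                       # torch.Size([2, 100, 7])
```

With a predictor of this shape, a model-based optimizer can score candidate skill parameters against predicted executions instead of real ones, which is the role MuTT plays in the paper's optimization loop.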
Related papers
- Multi-Task Interactive Robot Fleet Learning with Visual World Models [25.001148860168477]
Sirius-Fleet is a multi-task interactive robot fleet learning framework.
It monitors robot performance during deployment and involves humans to correct the robot's actions when necessary.
As the robot autonomy improves, anomaly predictors automatically adapt their prediction criteria.
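The idea of anomaly predictors that adapt their criteria can be pictured with a small, entirely hypothetical sketch (class and method names are invented, not Sirius-Fleet's API): the intervention threshold loosens as the recent success rate rises.

```python
# Hypothetical sketch: an anomaly predictor that requires a stronger anomaly
# signal before calling a human as the robot's recent success rate improves.
from collections import deque

class AdaptiveAnomalyPredictor:
    def __init__(self, base_threshold=0.5, window=100):
        self.base_threshold = base_threshold
        self.outcomes = deque(maxlen=window)  # 1 = success, 0 = human correction

    def record(self, success: bool):
        self.outcomes.append(1 if success else 0)

    def threshold(self) -> float:
        # Higher recent success rate -> higher bar for interrupting the robot.
        if not self.outcomes:
            return self.base_threshold
        success_rate = sum(self.outcomes) / len(self.outcomes)
        return self.base_threshold + 0.4 * success_rate

    def needs_human(self, anomaly_score: float) -> bool:
        return anomaly_score > self.threshold()
```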
arXiv Detail & Related papers (2024-10-30T04:49:39Z)
- DiffGen: Robot Demonstration Generation via Differentiable Physics Simulation, Differentiable Rendering, and Vision-Language Model [72.66465487508556]
DiffGen is a novel framework that integrates differentiable physics simulation, differentiable rendering, and a vision-language model.
It can generate realistic robot demonstrations by minimizing the distance between the embedding of the language instruction and the embedding of the simulated observation.
Experiments demonstrate that with DiffGen, we could efficiently and effectively generate robot data with minimal human effort or training time.
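A toy sketch of that objective, with trivial differentiable stand-ins for the physics simulator, renderer, and vision-language encoders (none of this is DiffGen's code): actions are optimized by gradient descent so the embedding of the rendered observation approaches the embedding of the instruction.

```python
# Toy sketch of DiffGen's core objective. The simulator, renderer, and
# encoders below are trivial differentiable stand-ins, not real components.
import torch
import torch.nn.functional as F

def sim_and_render(actions):   # stand-in for differentiable physics + rendering
    return torch.tanh(actions.cumsum(0))

def image_encoder(obs):        # stand-in for a vision-language image encoder
    return obs.mean(0)

text_emb = torch.randn(7)      # stand-in embedding of the language instruction
actions = torch.zeros(50, 7, requires_grad=True)
opt = torch.optim.Adam([actions], lr=1e-2)

for _ in range(200):
    obs = sim_and_render(actions)
    # Minimize cosine distance between observation and instruction embeddings.
    loss = 1 - F.cosine_similarity(image_encoder(obs), text_emb, dim=0)
    opt.zero_grad(); loss.backward(); opt.step()
```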
arXiv Detail & Related papers (2024-05-12T15:38:17Z)
- RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation [77.41969287400977]
This paper presents RobotScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for code generation for robot manipulation tasks described in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
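A heavily simplified, hypothetical picture of such a pipeline (query_llm and the RobotAPI methods are invented stand-ins, not RobotScript's interfaces): a language model emits code against a small robot API, and the same generated code can be executed by different robot backends.

```python
# Hypothetical sketch of a code-generation manipulation pipeline.
class RobotAPI:
    def move_to(self, name): print(f"moving to {name}")
    def grasp(self):         print("closing gripper")
    def release(self):       print("opening gripper")

def query_llm(instruction: str) -> str:
    # Stand-in for a model call; returns code written against RobotAPI.
    return ("robot.move_to('red_block')\nrobot.grasp()\n"
            "robot.move_to('bin')\nrobot.release()")

robot = RobotAPI()  # could equally wrap a Franka or UR5 backend
code = query_llm("put the red block in the bin")
exec(code, {"robot": robot})
```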
arXiv Detail & Related papers (2024-02-22T15:12:00Z)
- RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation [68.70755196744533]
RoboGen is a generative robotic agent that automatically learns diverse robotic skills at scale via generative simulation.
Our work attempts to extract the extensive and versatile knowledge embedded in large-scale models and transfer it to the field of robotics.
arXiv Detail & Related papers (2023-11-02T17:59:21Z)
- Using Knowledge Representation and Task Planning for Robot-agnostic Skills on the Example of Contact-Rich Wiping Tasks [44.99833362998488]
We show how a single robot skill that utilizes knowledge representation, task planning, and automatic selection of skill implementations can be executed in different contexts.
We demonstrate how the skill-based control platform enables this with contact-rich wiping tasks on different robot systems.
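One way to picture automatic selection of skill implementations, as a rough sketch with invented names rather than the paper's platform: an abstract "wipe" skill dispatches to whichever registered implementation matches the current robot's declared capabilities.

```python
# Illustrative sketch: one abstract skill, several implementations, selection
# by capability matching. All names are hypothetical.
SKILL_IMPLEMENTATIONS = {}

def register(skill, required_capabilities):
    def deco(fn):
        SKILL_IMPLEMENTATIONS.setdefault(skill, []).append(
            (set(required_capabilities), fn))
        return fn
    return deco

@register("wipe", {"force_control"})
def wipe_compliant(surface):
    return f"force-controlled wiping of {surface}"

@register("wipe", {"position_control"})
def wipe_position(surface):
    return f"position-controlled wiping of {surface}"

def execute(skill, robot_capabilities, *args):
    # Pick the first implementation whose requirements this robot satisfies.
    for required, fn in SKILL_IMPLEMENTATIONS[skill]:
        if required <= robot_capabilities:
            return fn(*args)
    raise RuntimeError(f"no implementation of {skill} for this robot")

print(execute("wipe", {"force_control", "vision"}, "table"))
```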
arXiv Detail & Related papers (2023-08-27T21:17:32Z)
- Human-Robot Skill Transfer with Enhanced Compliance via Dynamic Movement Primitives [1.7901837062462316]
We introduce a systematic method to extract the dynamic features from human demonstration to auto-tune the parameters in the Dynamic Movement Primitives framework.
Our method was implemented in an actual human-robot setup to extract human dynamic features, which were then used to regenerate robot trajectories following both LfD and RL.
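For context, the standard discrete dynamic movement primitive fits in a few lines; gains such as alpha and beta below are exactly the kind of parameters the paper auto-tunes from demonstration (this is the textbook formulation, not the paper's implementation).

```python
# Standard discrete DMP: a critically damped attractor toward goal g, plus a
# learned forcing term driven by a decaying phase variable x.
import numpy as np

def dmp_rollout(y0, g, weights, centers, widths, alpha=25.0, beta=6.25,
                alpha_x=1.0, dt=0.01, steps=1000):
    y, dy, x = y0, 0.0, 1.0
    path = []
    for _ in range(steps):
        psi = np.exp(-widths * (x - centers) ** 2)                 # RBF basis on phase
        f = (psi @ weights) / (psi.sum() + 1e-10) * x * (g - y0)   # forcing term
        ddy = alpha * (beta * (g - y) - dy) + f                    # transformation system
        dy += ddy * dt
        y += dy * dt
        x += -alpha_x * x * dt                                     # canonical system
        path.append(y)
    return np.array(path)

centers = np.linspace(0, 1, 20)
traj = dmp_rollout(y0=0.0, g=1.0, weights=np.zeros(20),
                   centers=centers, widths=np.full(20, 50.0))
print(traj[-1])  # converges toward the goal g = 1.0
```

Fitting `weights` to reproduce a demonstrated trajectory, and tuning the gains for compliance, is what "auto-tune the parameters in the DMP framework" refers to here.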
arXiv Detail & Related papers (2023-04-12T08:48:28Z)
- GLSO: Grammar-guided Latent Space Optimization for Sample-efficient Robot Design Automation [16.96128900256427]
We present Grammar-guided Latent Space Optimization (GLSO), a framework that transforms design automation into a low-dimensional continuous optimization problem by training a graph variational autoencoder (VAE) to learn a mapping between the graph-structured design space and a continuous latent space.
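Why a continuous latent space helps can be shown with a toy sketch (untrained stand-in networks, not GLSO's code; GLSO itself pairs the latent space with sample-efficient search rather than the plain gradient ascent shown here): once designs decode from vectors z, design search becomes ordinary continuous optimization.

```python
# Toy sketch: search over designs by optimizing their latent representation.
import torch
import torch.nn as nn

decoder = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 32))    # z -> design features
surrogate = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))  # design -> predicted score

z = torch.zeros(1, 8, requires_grad=True)  # a candidate design as a latent vector
opt = torch.optim.Adam([z], lr=0.05)
for _ in range(100):
    score = surrogate(decoder(z))          # predicted performance of decoded design
    loss = -score.mean()                   # ascend on the predicted score
    opt.zero_grad(); loss.backward(); opt.step()
```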
arXiv Detail & Related papers (2022-09-23T17:48:24Z)
- A Capability and Skill Model for Heterogeneous Autonomous Robots [69.50862982117127]
Capability modeling is considered a promising approach to semantically model functions provided by different machines.
This contribution investigates how to apply and extend capability models from manufacturing to the field of autonomous robots.
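A capability model can be pictured as structured data; the sketch below uses invented names and plain dataclasses, whereas the paper works with semantic, ontology-based models.

```python
# Hypothetical sketch of a capability model as plain data.
from dataclasses import dataclass, field

@dataclass
class Capability:
    name: str
    inputs: dict       # e.g. {"object_pose": "Pose"}
    outputs: dict      # e.g. {"grasped": "bool"}
    constraints: dict = field(default_factory=dict)  # e.g. {"max_payload_kg": 3.0}

@dataclass
class Robot:
    name: str
    capabilities: list

def robots_providing(robots, capability_name):
    # Match a required function against the capabilities each robot offers.
    return [r.name for r in robots
            if any(c.name == capability_name for c in r.capabilities)]

arm = Robot("ur5", [Capability("pick", {"object_pose": "Pose"}, {"grasped": "bool"})])
amr = Robot("mir100", [Capability("transport", {"goal": "Pose"}, {"at_goal": "bool"})])
print(robots_providing([arm, amr], "pick"))  # ['ur5']
```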
arXiv Detail & Related papers (2022-09-22T10:13:55Z)
- REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer [57.045140028275036]
We consider the problem of transferring a policy across two different robots with significantly different parameters such as kinematics and morphology.
Existing approaches that train a new policy by matching the action or state transition distribution, including imitation learning methods, fail because the optimal action and/or state distributions are mismatched across robots.
We propose REvolveR, a novel method that uses continuous evolutionary models, implemented in a physics simulator, for robot-to-robot policy transfer.
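The core mechanism can be sketched as parameter interpolation (a simplified illustration under stated assumptions, not the authors' code): a coefficient alpha defines a continuum of intermediate robots, and the policy is fine-tuned a little at each step from source (alpha = 0) to target (alpha = 1).

```python
# Sketch: a continuum of intermediate robots between source and target.
import numpy as np

def interpolate_robot(params_source, params_target, alpha):
    # Each intermediate robot blends kinematics/morphology of the endpoints.
    return {k: (1 - alpha) * params_source[k] + alpha * params_target[k]
            for k in params_source}

source = {"link_length": np.array([0.3, 0.25]), "mass": np.array([2.0, 1.5])}
target = {"link_length": np.array([0.4, 0.35]), "mass": np.array([3.0, 2.2])}

policy = None  # pretrained on the source robot
for alpha in np.linspace(0.0, 1.0, 11):
    robot = interpolate_robot(source, target, alpha)
    # fine_tune(policy, make_sim_env(robot))  # a few RL steps per intermediate robot
    print(alpha, robot["link_length"])
```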
arXiv Detail & Related papers (2022-02-10T18:50:25Z)
- Robot Skill Adaptation via Soft Actor-Critic Gaussian Mixture Models [29.34375999491465]
A core challenge for an autonomous agent acting in the real world is to adapt its repertoire of skills to cope with its noisy perception and dynamics.
To scale learning of skills to long-horizon tasks, robots should be able to learn and later refine their skills in a structured manner.
We propose SAC-GMM, a novel hybrid approach that learns robot skills through a dynamical system and adapts the learned skills in their own trajectory distribution space.
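Conceptually, the hybrid can be sketched as follows (invented names, not the authors' code): a Gaussian-mixture dynamical system generates velocities, and the RL agent's actions are small shifts of the mixture parameters rather than raw motor commands.

```python
# Sketch: a GMM-style dynamical system as the skill, adapted in parameter space.
import numpy as np

class GMMSkill:
    def __init__(self, means, velocities):
        self.means = means            # Gaussian centers in state space
        self.velocities = velocities  # velocity attractor per component

    def step(self, x):
        # Soft-assign the state to components, then blend their velocities.
        w = np.exp(-np.linalg.norm(self.means - x, axis=1) ** 2)
        w /= w.sum()
        return w @ self.velocities

    def adapt(self, delta_means):
        # The RL agent's action space: bounded shifts of the Gaussian means.
        self.means = self.means + delta_means

skill = GMMSkill(means=np.array([[0.0, 0.0], [0.5, 0.5]]),
                 velocities=np.array([[0.1, 0.1], [0.05, -0.02]]))
x = np.zeros(2)
for _ in range(10):
    x = x + skill.step(x) * 0.1        # follow the dynamical system
    # skill.adapt(sac_agent.act(obs))  # refinement step; agent omitted here
print(x)
```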
arXiv Detail & Related papers (2021-11-25T15:36:11Z)
- Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms [60.59764170868101]
Reinforcement learning methods can achieve strong performance but require large amounts of training data collected on the same robotic platform.
We formulate adaptation to a new platform as a few-shot meta-learning problem where the goal is to find a model that captures the common structure shared across different robotic platforms.
We experimentally evaluate our framework on a simulated reaching and a real-robot picking task using 400 simulated robots.
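The shared-structure idea can be illustrated with a toy sketch: a dynamics network shared across platforms, conditioned on a small per-platform latent. The paper treats this latent probabilistically via Bayesian meta-learning; for simplicity the sketch below fits a point estimate on a handful of transitions (all names and sizes are assumptions).

```python
# Toy sketch: adaptation to a new robot = fitting a tiny platform latent,
# while the cross-platform dynamics model stays frozen.
import torch
import torch.nn as nn

# Shared dynamics model: (state, action, platform latent) -> next state.
shared = nn.Sequential(nn.Linear(4 + 2 + 3, 64), nn.ReLU(), nn.Linear(64, 4))
for p in shared.parameters():
    p.requires_grad_(False)  # freeze the shared structure

z = torch.zeros(3, requires_grad=True)  # per-platform latent to be inferred
opt = torch.optim.Adam([z], lr=0.05)
states, actions = torch.randn(20, 4), torch.randn(20, 2)  # few-shot transitions
next_states = torch.randn(20, 4)                          # (random toy data here)
for _ in range(100):
    pred = shared(torch.cat([states, actions, z.expand(20, 3)], dim=-1))
    loss = ((pred - next_states) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```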
arXiv Detail & Related papers (2021-03-05T14:16:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.