Related papers: Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion

URL: http://arxiv.org/abs/2407.10973v1
Date: Mon, 15 Jul 2024 17:59:57 GMT
Title: Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion
Authors: Yongyuan Liang, Tingqiang Xu, Kaizhe Hu, Guangqi Jiang, Furong Huang, Huazhe Xu
Abstract summary: We present Make-An-Agent, a novel policy parameter generator for behavior-to-policy generation. Our generation model demonstrates remarkable versatility and scalability on multiple tasks. We showcase its efficacy and efficiency on various domains and tasks, including varying objectives, behaviors, and even across different robot manipulators.
Score: 41.52811286996212
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Can we generate a control policy for an agent using just one demonstration of desired behaviors as a prompt, as effortlessly as creating an image from a textual description? In this paper, we present Make-An-Agent, a novel policy parameter generator that leverages the power of conditional diffusion models for behavior-to-policy generation. Guided by behavior embeddings that encode trajectory information, our policy generator synthesizes latent parameter representations, which can then be decoded into policy networks. Trained on policy network checkpoints and their corresponding trajectories, our generation model demonstrates remarkable versatility and scalability on multiple tasks and has a strong generalization ability on unseen tasks to output well-performed policies with only few-shot demonstrations as inputs. We showcase its efficacy and efficiency on various domains and tasks, including varying objectives, behaviors, and even across different robot manipulators. Beyond simulation, we directly deploy policies generated by Make-An-Agent onto real-world robots on locomotion tasks.

Related papers

Magentic-One: A Generalist Multi-Agent System for Solving Complex Tasks [39.084974125007165]
We introduce Magentic-One, a high-performing open-source agentic system for solving complex tasks. Magentic-One uses a multi-agent architecture where a lead agent, the Orchestrator, tracks progress, and re-plans to recover from errors. We show that Magentic-One achieves statistically competitive performance to the state-of-the-art on three diverse and challenging agentic benchmarks.
arXiv Detail & Related papers (2024-11-07T06:36:19Z)
GRAPPA: Generalizing and Adapting Robot Policies via Online Agentic Guidance [15.774237279917594]
We propose an agentic framework for robot self-guidance and self-improvement. Our framework iteratively grounds a base robot policy to relevant objects in the environment. We demonstrate that our approach can effectively guide manipulation policies to achieve significantly higher success rates.
arXiv Detail & Related papers (2024-10-09T02:00:37Z)
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation [65.01527698201956]
Non-autoregressive Transformers (NATs) are able to synthesize images with decent quality in a small number of steps. We propose AdaNAT, a learnable approach that automatically configures a suitable policy tailored for every sample to be generated.
arXiv Detail & Related papers (2024-08-31T03:53:57Z)
Towards Interpretable Foundation Models of Robot Behavior: A Task Specific Policy Generation Approach [1.7205106391379026]
Foundation models are a promising path toward general-purpose and user-friendly robots. In particular, the lack of modularity between tasks means that when model weights are updated, the behavior in other, unrelated tasks may be affected. We present an alternative approach to the design of robot foundation models, which generates stand-alone, task-specific policies.
arXiv Detail & Related papers (2024-07-10T21:55:44Z)
Dreamitate: Real-World Visuomotor Policy Learning via Video Generation [49.03287909942888]
We propose a visuomotor policy learning framework that fine-tunes a video diffusion model on human demonstrations of a given task. We generate an example of an execution of the task conditioned on images of a novel scene, and use this synthesized execution directly to control the robot.
arXiv Detail & Related papers (2024-06-24T17:59:45Z)
Imagination Policy: Using Generative Point Cloud Models for Learning Manipulation Policies [25.760946763103483]
We propose Imagination Policy, a novel multi-task key-frame policy network for solving high-precision pick and place tasks. Instead of learning actions directly, Imagination Policy generates point clouds to imagine desired states which are then translated to actions using rigid action estimation.
arXiv Detail & Related papers (2024-06-17T17:00:41Z)
An Interactive Agent Foundation Model [49.77861810045509]
We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents. Our training paradigm unifies diverse pre-training strategies, including visual masked auto-encoders, language modeling, and next-action prediction. We demonstrate the performance of our framework across three separate domains -- Robotics, Gaming AI, and Healthcare.
arXiv Detail & Related papers (2024-02-08T18:58:02Z)
Learning to Act from Actionless Videos through Dense Correspondences [87.1243107115642]
We present an approach to construct a video-based robot policy capable of reliably executing diverse tasks across different robots and environments. Our method leverages images as a task-agnostic representation, encoding both the state and action information, and text as a general representation for specifying robot goals. We demonstrate the efficacy of our approach in learning policies on table-top manipulation and navigation tasks.
arXiv Detail & Related papers (2023-10-12T17:59:23Z)
RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking [54.776890150458385]
We develop an efficient system for training universal agents capable of multi-task manipulation skills. We are able to train a single agent capable of 12 unique skills, and demonstrate its generalization over 38 tasks. On average, RoboAgent outperforms prior methods by over 40% in unseen situations.
arXiv Detail & Related papers (2023-09-05T03:14:39Z)
AGPNet -- Autonomous Grading Policy Network [0.5232537118394002]
We formalize the problem as a Markov Decision Process and design a simulation which demonstrates agent-environment interactions. We use methods from reinforcement learning, behavior cloning and contrastive learning to train a hybrid policy. Our trained agent, AGPNet, reaches human-level performance and outperforms current state-of-the-art machine learning methods for the autonomous grading task.
arXiv Detail & Related papers (2021-12-20T21:44:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.