REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments
- URL: http://arxiv.org/abs/2412.04759v1
- Date: Fri, 06 Dec 2024 03:54:55 GMT
- Title: REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments
- Authors: Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman, Insup Lee,
- Abstract summary: Building generalist agents that can rapidly adapt to new environments is a key challenge for deploying AI in the digital and real worlds.
We propose a novel approach to pre-train relatively small policies on relatively small datasets and adapt them to unseen environments via in-context learning.
Our key idea is that retrieval offers a powerful bias for fast adaptation.
- Score: 20.826907313227323
- Abstract: Building generalist agents that can rapidly adapt to new environments is a key challenge for deploying AI in the digital and real worlds. Is scaling current agent architectures the most effective way to build generalist agents? We propose a novel approach to pre-train relatively small policies on relatively small datasets and adapt them to unseen environments via in-context learning, without any finetuning. Our key idea is that retrieval offers a powerful bias for fast adaptation. Indeed, we demonstrate that even a simple retrieval-based 1-nearest neighbor agent offers a surprisingly strong baseline for today's state-of-the-art generalist agents. From this starting point, we construct a semi-parametric agent, REGENT, that trains a transformer-based policy on sequences of queries and retrieved neighbors. REGENT can generalize to unseen robotics and game-playing environments via retrieval augmentation and in-context learning, achieving this with up to 3x fewer parameters and up to an order-of-magnitude fewer pre-training datapoints, significantly outperforming today's state-of-the-art generalist agents. Website: https://kaustubhsridhar.github.io/regent-research
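The retrieval-based 1-nearest-neighbor baseline that the abstract takes as its starting point is simple enough to sketch directly. The snippet below is a minimal illustration under assumed state/action arrays and an L2 distance; it is not the authors' code, and REGENT itself additionally trains a transformer policy on the query together with its retrieved neighbors.

```python
import numpy as np

class OneNearestNeighborAgent:
    """Retrieval-only baseline: act by copying the action attached to the
    closest stored demonstration state. Illustrative sketch; REGENT feeds
    the query plus retrieved neighbors into a transformer policy instead."""

    def __init__(self, demo_states, demo_actions):
        self.demo_states = np.asarray(demo_states)    # (N, state_dim)
        self.demo_actions = np.asarray(demo_actions)  # (N, action_dim)

    def retrieve(self, state, k=4):
        # Indices of the k nearest demonstration states under L2 distance
        # (the distance function is a placeholder assumption here).
        dists = np.linalg.norm(self.demo_states - state, axis=1)
        return np.argsort(dists)[:k]

    def act(self, state):
        # 1-nearest-neighbor action: the retrieved context collapses to a lookup.
        return self.demo_actions[self.retrieve(state, k=1)[0]]

# Toy usage: a small buffer of (state, action) pairs from a new environment.
rng = np.random.default_rng(0)
agent = OneNearestNeighborAgent(rng.normal(size=(100, 8)), rng.normal(size=(100, 2)))
print(agent.act(rng.normal(size=8)))
```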
Related papers
- AgentRefine: Enhancing Agent Generalization through Refinement Tuning [28.24897427451803]
Large Language Model (LLM)-based agents have demonstrated the ability to perform complex, human-like tasks.
There is still a large gap between open-source LLMs and commercial models like the GPT series.
In this paper, we focus on improving the agent generalization capabilities of LLMs via instruction tuning.
arXiv Detail & Related papers (2025-01-03T08:55:19Z)
- Accelerating Hybrid Agent-Based Models and Fuzzy Cognitive Maps: How to Combine Agents who Think Alike? [0.0]
We present an approximation that combines agents who 'think alike', thus reducing the population size and the compute time.
Our innovation relies on representing agent behaviors as networks of rules and empirically evaluating different measures of distance between these networks.
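A rough sketch of the "combine agents who think alike" idea follows, with the paper's rule networks reduced to plain rule sets and Jaccard distance standing in for the several distance measures the paper actually evaluates; all names and thresholds here are illustrative assumptions.

```python
import random

def jaccard_distance(rules_a, rules_b):
    """Distance between two agents' rule sets (a stand-in for the paper's
    network-based distance measures)."""
    union = rules_a | rules_b
    if not union:
        return 0.0
    return 1.0 - len(rules_a & rules_b) / len(union)

def merge_agents(agent_rule_sets, threshold=0.2):
    """Greedy merge: an agent close enough to an existing representative is
    folded into it, so one weighted representative acts for the whole group."""
    representatives = []  # list of [rule_set, merged_count]
    for rules in agent_rule_sets:
        for rep in representatives:
            if jaccard_distance(rules, rep[0]) <= threshold:
                rep[1] += 1
                break
        else:
            representatives.append([rules, 1])
    return representatives

# Toy usage: 1000 agents drawn from three behavioral archetypes.
random.seed(0)
archetypes = [frozenset({"a", "b", "c"}), frozenset({"a", "d"}), frozenset({"e"})]
population = [random.choice(archetypes) for _ in range(1000)]
print(len(merge_agents(population)), "representatives instead of", len(population), "agents")
```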
arXiv Detail & Related papers (2024-09-01T19:45:15Z)
- No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery [53.08822154199948]
Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula promise to enable agents to be robust to in- and out-of-distribution tasks.
This work investigates how existing UED methods select training environments, focusing on task prioritisation metrics.
We develop a method that directly trains on scenarios with high learnability.
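The snippet below sketches one common way to operationalize "scenarios with high learnability": score each level by p·(1−p), where p is the agent's current success rate, so levels that are always solved or never solved are down-weighted. The exact scoring and sampling rule in the paper may differ; the function names here are illustrative.

```python
import numpy as np

def learnability(success_rate):
    # p*(1-p): zero for levels the agent always or never solves,
    # maximal around p = 0.5 where there is the most to learn.
    return success_rate * (1.0 - success_rate)

def sample_training_levels(success_rates, num_levels, rng):
    """Sample the next training batch in proportion to learnability scores."""
    ids = list(success_rates)
    scores = learnability(np.array([success_rates[i] for i in ids]))
    probs = np.full(len(ids), 1.0 / len(ids)) if scores.sum() == 0 else scores / scores.sum()
    return list(rng.choice(ids, size=num_levels, p=probs))

# Toy usage: success rates estimated from recent rollouts on each level.
rates = {"level_0": 0.0, "level_1": 0.45, "level_2": 0.9, "level_3": 1.0}
print(sample_training_levels(rates, num_levels=8, rng=np.random.default_rng(0)))
```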
arXiv Detail & Related papers (2024-08-27T14:31:54Z)
- AgentGym: Evolving Large Language Model-based Agents across Diverse Environments [116.97648507802926]
Large language models (LLMs) are considered a promising foundation for building generally-capable agents.
We take the first step towards building generally-capable LLM-based agents with self-evolution ability.
We propose AgentGym, a new framework featuring a variety of environments and tasks for broad, real-time, uni-format, and concurrent agent exploration.
arXiv Detail & Related papers (2024-06-06T15:15:41Z)
- AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
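To make "planning in a learned abstract search space" concrete, here is a tiny greedy lookahead over placeholder encoder, dynamics, and value functions. PiZero's actual planner and learned models are far more elaborate, so treat every function below as an assumed stand-in rather than the paper's method.

```python
import numpy as np

def plan_in_latent_space(state, encode, dynamics, value, actions, depth=3):
    """Greedy lookahead in a learned abstract space: encode the raw state,
    roll candidate action sequences forward with the learned dynamics model,
    and return the first action of the best-valued rollout."""
    def rollout_value(z, remaining):
        if remaining == 0:
            return value(z)
        return max(rollout_value(dynamics(z, a), remaining - 1) for a in actions)

    z0 = encode(state)
    scores = [rollout_value(dynamics(z0, a), depth - 1) for a in actions]
    return actions[int(np.argmax(scores))]

# Toy usage with made-up "learned" models (random linear maps stand in for
# the encoder, latent dynamics, and value network the agent would learn).
rng = np.random.default_rng(0)
W_enc = rng.normal(size=(4, 8))
W_dyn = {a: 0.5 * rng.normal(size=(4, 4)) for a in range(3)}
encode = lambda s: W_enc @ s
dynamics = lambda z, a: np.tanh(W_dyn[a] @ z)
value = lambda z: float(z.sum())
print(plan_in_latent_space(rng.normal(size=8), encode, dynamics, value, actions=[0, 1, 2]))
```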
arXiv Detail & Related papers (2023-08-16T22:47:16Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method to solve this benchmark that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
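As a generic illustration of reward-free, model-based pre-training (in the spirit of disagreement-driven exploration methods, not necessarily this paper's exact algorithm), the sketch below learns an ensemble of toy dynamics models and uses their disagreement as an intrinsic reward; the linear models and update rule are placeholder assumptions.

```python
import numpy as np

class EnsembleDisagreementReward:
    """Reward-free exploration signal: an ensemble of learned dynamics models
    disagrees most in poorly-explored regions, so disagreement serves as an
    intrinsic reward during pre-training (illustrative sketch only)."""

    def __init__(self, state_dim, action_dim, n_models=5, lr=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        self.models = [rng.normal(scale=0.1, size=(state_dim, state_dim + action_dim))
                       for _ in range(n_models)]
        self.lr = lr

    def _predict(self, W, s, a):
        return W @ np.concatenate([s, a])

    def intrinsic_reward(self, s, a):
        preds = np.stack([self._predict(W, s, a) for W in self.models])
        return float(preds.std(axis=0).mean())   # high where the models disagree

    def update(self, s, a, s_next):
        # One SGD step per ensemble member on the squared prediction error.
        x = np.concatenate([s, a])
        for i, W in enumerate(self.models):
            err = self._predict(W, s, a) - s_next
            self.models[i] = W - self.lr * np.outer(err, x)

# Toy usage: pre-training would maximize intrinsic_reward; no task reward is needed.
rng = np.random.default_rng(1)
bonus = EnsembleDisagreementReward(state_dim=3, action_dim=2)
print(bonus.intrinsic_reward(rng.normal(size=3), rng.normal(size=2)))
```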
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning [61.406020873047794]
A major hurdle to real-world application is that most algorithms are developed in an episodic setting.
We propose a new method, MEDAL, that trains the backward policy to match the state distribution in the provided demonstrations.
Our experiments show that MEDAL matches or outperforms prior methods on three sparse-reward continuous control tasks.
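A minimal sketch of the state-distribution-matching ingredient: train a classifier to tell demonstration states from states the backward policy visits, and reward the backward policy with the classifier's logit so it is drawn toward demo-like states. The logistic-regression model, update rule, and names below are simplifying assumptions; MEDAL itself uses neural networks inside a full non-episodic RL loop.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class StateMatchingReward:
    """Classifier separating demonstration states (label 1) from states the
    backward policy visits (label 0); its logit is the backward policy's reward."""

    def __init__(self, state_dim, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(scale=0.01, size=state_dim)
        self.b = 0.0
        self.lr = lr

    def train_step(self, demo_states, policy_states):
        xs = np.vstack([demo_states, policy_states])
        ys = np.concatenate([np.ones(len(demo_states)), np.zeros(len(policy_states))])
        preds = sigmoid(xs @ self.w + self.b)
        self.w -= self.lr * (xs.T @ (preds - ys)) / len(ys)
        self.b -= self.lr * float(np.mean(preds - ys))

    def reward(self, state):
        # High when the classifier believes the state looks like a demonstration state.
        return float(state @ self.w + self.b)

# Toy usage: demo states cluster around +1, early backward-policy states around -1.
rng = np.random.default_rng(0)
demos = rng.normal(loc=1.0, size=(256, 4))
visited = rng.normal(loc=-1.0, size=(256, 4))
r = StateMatchingReward(state_dim=4)
for _ in range(200):
    r.train_step(demos, visited)
print(r.reward(np.ones(4)), r.reward(-np.ones(4)))
```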
arXiv Detail & Related papers (2022-05-11T00:06:29Z)
- Learning Synthetic Environments and Reward Networks for Reinforcement Learning [34.01695320809796]
We introduce Synthetic Environments (SEs) and Reward Networks (RNs) as proxy environment models for training Reinforcement Learning (RL) agents.
We show that an agent, after being trained exclusively on the SE, is able to solve the corresponding real environment.
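The SE idea is a bi-level optimization: an outer loop searches over the parameters of the synthetic environment, an inner loop trains a fresh agent purely on that synthetic environment, and the outer objective is the trained agent's return on the real environment. Below is a hedged sketch with an evolution-strategies-style outer step; `train_agent_on` and `evaluate_on_real` are assumed user-supplied callables, not the paper's API.

```python
import numpy as np

def optimize_synthetic_env(init_params, train_agent_on, evaluate_on_real,
                           iters=200, pop_size=8, sigma=0.1, seed=0):
    """Outer loop of the bi-level sketch: perturb SE parameters, train an
    agent on each candidate SE, and move toward perturbations whose agents
    score well on the *real* environment."""
    rng = np.random.default_rng(seed)
    params = np.array(init_params, dtype=float)
    best_fitness = -np.inf
    for _ in range(iters):
        noise = rng.normal(scale=sigma, size=(pop_size, params.size))
        candidates = params + noise
        fitness = np.array([evaluate_on_real(train_agent_on(c)) for c in candidates])
        # ES-style update: weight perturbations by their normalized fitness.
        advantage = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
        params = params + sigma * (advantage @ noise) / pop_size
        best_fitness = max(best_fitness, float(fitness.max()))
    return params, best_fitness

# Toy usage: the "SE" is a 1-D reward offset, the "agent" just memorizes it,
# and the real environment prefers an offset near 2.0.
train_agent_on = lambda p: p
evaluate_on_real = lambda agent: -float((agent[0] - 2.0) ** 2)
print(optimize_synthetic_env([0.0], train_agent_on, evaluate_on_real))
```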
arXiv Detail & Related papers (2022-02-06T14:55:59Z)
- Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation [44.019674347733506]
We investigate the popular Room-to-Room (R2R) VLN benchmark and discover that what is important is not only the amount of data you synthesize, but also how you do it.
We find that shortest path sampling, which is used by both the R2R benchmark and existing augmentation methods, encodes biases in the action space of the agent, which we dub action priors.
We then show that these action priors offer one explanation toward the poor generalization of existing works.
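The action-prior point can be illustrated on a toy grid world: trajectories sampled as shortest paths essentially never reverse their previous move, whereas random walks do, so an agent trained only on shortest paths inherits a hidden bias against backtracking. Everything below (the grid, the action set, the statistic) is an illustrative assumption, not the R2R setup.

```python
import random
from collections import deque

MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}

def shortest_path_actions(size, start, goal):
    """BFS shortest path on a 4-connected grid, returned as an action list."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        (x, y), actions = queue.popleft()
        if (x, y) == goal:
            return actions
        for name, (dx, dy) in MOVES.items():
            nxt = (x + dx, y + dy)
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [name]))
    return []

def random_walk_actions(size, start, length, rng):
    actions, (x, y) = [], start
    for _ in range(length):
        name, (dx, dy) = rng.choice(list(MOVES.items()))
        if 0 <= x + dx < size and 0 <= y + dy < size:
            x, y = x + dx, y + dy
            actions.append(name)
    return actions

def reversal_rate(actions):
    """Fraction of consecutive action pairs that undo each other."""
    pairs = list(zip(actions, actions[1:]))
    return sum(b == OPPOSITE[a] for a, b in pairs) / len(pairs) if pairs else 0.0

rng = random.Random(0)
sp_rates, rw_rates = [], []
for _ in range(200):
    start = (rng.randrange(10), rng.randrange(10))
    goal = (rng.randrange(10), rng.randrange(10))
    sp_rates.append(reversal_rate(shortest_path_actions(10, start, goal)))
    rw_rates.append(reversal_rate(random_walk_actions(10, start, 20, rng)))
print("mean reversal rate, shortest paths:", sum(sp_rates) / len(sp_rates))
print("mean reversal rate, random walks:  ", sum(rw_rates) / len(rw_rates))
```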
arXiv Detail & Related papers (2020-03-31T14:52:42Z)
- Hierarchically Decoupled Imitation for Morphological Transfer [95.19299356298876]
We show that transferring learned information from a morphologically simpler agent can massively improve the sample efficiency of a more complex one.
First, we show that incentivizing a complex agent's low-level policy to imitate a simpler agent's low-level policy significantly improves zero-shot high-level transfer.
Second, we show that KL-regularized training of the high-level policy stabilizes learning and prevents mode collapse.
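A compact sketch of these two ingredients as described: a behavioral-cloning term that makes the complex agent's low-level policy imitate the simple agent's low-level actions, and a KL penalty that keeps the high-level policy close to a prior. The loss forms, the uniform prior, and all names below are assumptions for illustration; the paper's formulation differs in detail.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-8):
    p, q = np.asarray(p, dtype=float) + eps, np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def hierarchical_transfer_losses(complex_low_actions, simple_low_actions,
                                 high_policy_probs, prior_probs, beta=0.1):
    """(1) Behavioral cloning: the complex agent's low level imitates the
    simple agent's low-level actions (plain mean-squared error here).
    (2) KL regularization: the high-level policy is kept close to a prior,
    which stabilizes training and discourages mode collapse."""
    imitation_loss = float(np.mean((np.asarray(complex_low_actions)
                                    - np.asarray(simple_low_actions)) ** 2))
    kl_loss = kl_divergence(high_policy_probs, prior_probs)
    return imitation_loss, imitation_loss + beta * kl_loss

# Toy usage with made-up batches of low-level actions and a 3-way high-level policy.
rng = np.random.default_rng(0)
low_complex = rng.normal(size=(32, 6))
low_simple = low_complex + 0.1 * rng.normal(size=(32, 6))
print(hierarchical_transfer_losses(low_complex, low_simple,
                                   [0.7, 0.2, 0.1], [1 / 3, 1 / 3, 1 / 3]))
```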
arXiv Detail & Related papers (2020-03-03T18:56:49Z)