WorldCoder, a Model-Based LLM Agent: Building World Models by Writing Code and Interacting with the Environment
- URL: http://arxiv.org/abs/2402.12275v2
- Date: Sun, 26 May 2024 04:24:04 GMT
- Title: WorldCoder, a Model-Based LLM Agent: Building World Models by Writing Code and Interacting with the Environment
- Authors: Hao Tang, Darren Key, Kevin Ellis
- Abstract summary: We give a model-based agent that builds a Python program representing its knowledge of the world based on its interactions with the environment.
We study our agent on gridworlds, and on task planning, finding our approach is more sample-efficient compared to deep RL, more compute-efficient compared to ReAct-style agents, and that it can transfer its knowledge across environments by editing its code.
- Score: 11.81398773711566
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We give a model-based agent that builds a Python program representing its knowledge of the world based on its interactions with the environment. The world model tries to explain its interactions, while also being optimistic about what reward it can achieve. We define this optimism as a logical constraint between a program and a planner. We study our agent on gridworlds, and on task planning, finding our approach is more sample-efficient compared to deep RL, more compute-efficient compared to ReAct-style agents, and that it can transfer its knowledge across environments by editing its code.
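To make the abstract concrete, here is a minimal, hypothetical sketch (not the authors' actual code) of the core idea: the world model is an ordinary Python function mapping (state, action) to (next state, reward), and a simple planner searches that function for a rewarding action sequence. "Optimism" is then the constraint that the synthesized model must admit some plan the planner believes achieves reward. The gridworld, state encoding, and function names below are illustrative assumptions.

```python
from collections import deque

# Hypothetical world model the agent might synthesize as code:
# a deterministic 4x4 gridworld where stepping onto the goal yields reward 1.
GOAL = (2, 2)
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def transition(state, action):
    """World model: maps (state, action) -> (next_state, reward)."""
    dx, dy = MOVES[action]
    x, y = state
    nxt = (min(max(x + dx, 0), 3), min(max(y + dy, 0), 3))  # clip to the grid
    reward = 1 if nxt == GOAL else 0
    return nxt, reward

def plan(model, start, max_depth=20):
    """Breadth-first planner: search the model for an action
    sequence whose final transition yields positive reward."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if len(actions) >= max_depth:
            continue
        for a in MOVES:
            nxt, r = model(state, a)
            if r > 0:
                return actions + [a]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [a]))
    return None  # model admits no rewarding plan

# Optimism as a program-planner constraint: accept the synthesized model
# only if the planner can find SOME plan that achieves reward under it.
plan_found = plan(transition, start=(0, 0))
assert plan_found is not None, "model is not optimistic: no rewarding plan"
```

In the paper's setting the model must additionally explain the agent's observed interactions; this sketch shows only the planner-side optimism check.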
Related papers
- AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents [19.249596397679856]
We introduce AriGraph, a method wherein the agent constructs a memory graph that integrates semantic and episodic memories while exploring the environment.
This graph structure facilitates efficient associative retrieval of interconnected concepts, relevant to the agent's current state and goals.
We demonstrate that our Ariadne LLM agent, equipped with this proposed memory architecture augmented with planning and decision-making, effectively handles complex tasks on a zero-shot basis in the TextWorld environment.
arXiv Detail & Related papers (2024-07-05T09:06:47Z)
- Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies, and a planning module that uses these inferences to plan the agent's response.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z)
- Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search [5.913758275518443]
We consider Code World Models, world models generated by a Large Language Model (LLM) in the form of Python code for model-based Reinforcement Learning (RL).
Calling code instead of LLMs for planning has the advantages of being precise, reliable, interpretable, and extremely efficient.
We propose Generate, Improve and Fix with Monte Carlo Tree Search (GIF-MCTS), a new code generation strategy for LLMs.
arXiv Detail & Related papers (2024-05-24T09:31:26Z)
- WorldGPT: Empowering LLM as Multimodal World Model [51.243464216500975]
We introduce WorldGPT, a generalist world model built upon a Multimodal Large Language Model (MLLM).
WorldGPT acquires an understanding of world dynamics through analyzing millions of videos across various domains.
We conduct evaluations on WorldNet, a multimodal state transition prediction benchmark.
arXiv Detail & Related papers (2024-04-28T14:42:02Z)
- MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [102.41118020705876]
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing.
As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework.
This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z)
- Octopus: Embodied Vision-Language Programmer from Environmental Feedback [59.772904419928054]
Large vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning.
In this paper, we introduce Octopus, a novel VLM designed to proficiently decipher an agent's vision and textual task objectives.
Our design allows the agent to adeptly handle a wide spectrum of tasks, ranging from mundane daily chores in simulators to sophisticated interactions in complex video games.
arXiv Detail & Related papers (2023-10-12T17:59:58Z)
- AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z)
- Thinker: Learning to Plan and Act [18.425843346728648]
The Thinker algorithm wraps the environment with a world model and introduces new actions designed for interacting with the world model.
We demonstrate the algorithm's effectiveness through experimental results in the game of Sokoban and the Atari 2600 benchmark.
arXiv Detail & Related papers (2023-07-27T16:40:14Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Relational-Grid-World: A Novel Relational Reasoning Environment and An Agent Model for Relational Information Extraction [0.0]
Reinforcement learning (RL) agents are often designed for one particular problem, and their internal workings are generally uninterpretable.
RL algorithms based on statistical methods can be made more generalizable and interpretable using symbolic Artificial Intelligence (AI) tools such as logic programming.
We present a model-free RL architecture that is supported with explicit relational representations of the environmental objects.
arXiv Detail & Related papers (2020-07-12T11:30:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.