WorldCoder, a Model-Based LLM Agent: Building World Models by Writing Code and Interacting with the Environment
- URL: http://arxiv.org/abs/2402.12275v2
- Date: Sun, 26 May 2024 04:24:04 GMT
- Title: WorldCoder, a Model-Based LLM Agent: Building World Models by Writing Code and Interacting with the Environment
- Authors: Hao Tang, Darren Key, Kevin Ellis,
- Abstract summary: We give a model-based agent that builds a Python program representing its knowledge of the world based on its interactions with the environment.
We study our agent on gridworlds, and on task planning, finding our approach is more sample-efficient compared to deep RL, more compute-efficient compared to ReAct-style agents, and that it can transfer its knowledge across environments by editing its code.
- Score: 11.81398773711566
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We give a model-based agent that builds a Python program representing its knowledge of the world based on its interactions with the environment. The world model tries to explain its interactions, while also being optimistic about what reward it can achieve. We define this optimism as a logical constraint between a program and a planner. We study our agent on gridworlds, and on task planning, finding our approach is more sample-efficient compared to deep RL, more compute-efficient compared to ReAct-style agents, and that it can transfer its knowledge across environments by editing its code.
Related papers
- Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents [23.1522773245956]
We introduce a novel paradigm that augments language agents with model-based planning.
Our method, WebDreamer, builds on the key insight that LLMs inherently encode comprehensive knowledge about website structures and functionalities.
arXiv Detail & Related papers (2024-11-10T18:50:51Z) - CLIMB: Language-Guided Continual Learning for Task Planning with Iterative Model Building [30.274897468701592]
We present CLIMB, a continual learning framework for robot task planning.
CLIMB builds a model from a natural language description, learn non-obvious predicates while solving tasks, and store that information for future problems.
We also develop the BlocksWorld++ domain, a simulated environment with an easily usable real counterpart, together with a curriculum of tasks with progressing difficulty for evaluating continual learning.
arXiv Detail & Related papers (2024-10-17T16:53:43Z) - Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search [5.913758275518443]
We consider Code World Models, world models generated by a Large Language Model (LLM) in the form of Python code for model-based Reinforcement Learning (RL)
Calling code instead of LLMs for planning has potential to be more precise, reliable, interpretable, and extremely efficient.
We show that the Code World Models synthesized with it can be successfully used for planning, resulting in model-based RL agents with greatly improved sample efficiency and inference speed.
arXiv Detail & Related papers (2024-05-24T09:31:26Z) - WorldGPT: Empowering LLM as Multimodal World Model [51.243464216500975]
We introduce WorldGPT, a generalist world model built upon Multimodal Large Language Model (MLLM)
WorldGPT acquires an understanding of world dynamics through analyzing millions of videos across various domains.
We conduct evaluations on WorldNet, a multimodal state transition prediction benchmark.
arXiv Detail & Related papers (2024-04-28T14:42:02Z) - MAgIC: Investigation of Large Language Model Powered Multi-Agent in
Cognition, Adaptability, Rationality and Collaboration [102.41118020705876]
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing.
As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework.
This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z) - Octopus: Embodied Vision-Language Programmer from Environmental Feedback [58.04529328728999]
Embodied vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning.
To bridge this gap, we introduce Octopus, an embodied vision-language programmer that uses executable code generation as a medium to connect planning and manipulation.
Octopus is designed to 1) proficiently comprehend an agent's visual and textual task objectives, 2) formulate intricate action sequences, and 3) generate executable code.
arXiv Detail & Related papers (2023-10-12T17:59:58Z) - AI planning in the imagination: High-level planning on learned abstract
search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z) - Thinker: Learning to Plan and Act [18.425843346728648]
The Thinker algorithm wraps the environment with a world model and introduces new actions designed for interacting with the world model.
We demonstrate the algorithm's effectiveness through experimental results in the game of Sokoban and the Atari 2600 benchmark.
arXiv Detail & Related papers (2023-07-27T16:40:14Z) - Multitask Adaptation by Retrospective Exploration with Learned World
Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z) - Relational-Grid-World: A Novel Relational Reasoning Environment and An
Agent Model for Relational Information Extraction [0.0]
Reinforcement learning (RL) agents are often designed specifically for a particular problem and they generally have uninterpretable working processes.
Statistical methods-based RL algorithms can be improved in terms of generalizability and interpretability using symbolic Artificial Intelligence (AI) tools such as logic programming.
We present a model-free RL architecture that is supported with explicit relational representations of the environmental objects.
arXiv Detail & Related papers (2020-07-12T11:30:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.