Large Language Models as Minecraft Agents
- URL: http://arxiv.org/abs/2402.08392v1
- Date: Tue, 13 Feb 2024 11:37:30 GMT
- Title: Large Language Models as Minecraft Agents
- Authors: Chris Madge and Massimo Poesio
- Abstract summary: We examine the use of Large Language Models (LLMs) in the challenging setting of acting as a Minecraft agent.
We introduce clarification questions and examine the challenges and opportunities for improvement.
- Score: 6.563602649100242
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work we examine the use of Large Language Models (LLMs) in the
challenging setting of acting as a Minecraft agent. We apply and evaluate LLMs
in the builder and architect settings, introduce clarification questions and
examining the challenges and opportunities for improvement. In addition, we
present a platform for online interaction with the agents and an evaluation
against previous works.
Related papers
- Large Language Model Agent: A Survey on Methodology, Applications and Challenges [88.3032929492409]
Large Language Model (LLM) agents, with goal-driven behaviors and dynamic adaptation capabilities, potentially represent a critical pathway toward artificial general intelligence.
This survey systematically deconstructs LLM agent systems through a methodology-centered taxonomy.
Our work provides a unified architectural perspective, examining how agents are constructed, how they collaborate, and how they evolve over time.
arXiv Detail & Related papers (2025-03-27T12:50:17Z) - From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons [85.99268361356832]
We introduce a process of adapting an MLLM to a Generalist Embodied Agent (GEA)
GEA is a single unified model capable of grounding itself across varied domains through a multi-embodiment action tokenizer.
Our findings reveal the importance of training with cross-domain data and online RL for building generalist agents.
arXiv Detail & Related papers (2024-12-11T15:06:25Z) - APT: Architectural Planning and Text-to-Blueprint Construction Using Large Language Models for Open-World Agents [8.479128275067742]
We present an advanced Large Language Model (LLM)-driven framework that enables autonomous agents to construct complex structures in Minecraft.
By employing chain-of-thought decomposition along with multimodal inputs, the framework generates detailed architectural layouts and blueprints.
Our agent incorporates both memory and reflection modules to facilitate lifelong learning, adaptive refinement, and error correction throughout the building process.
arXiv Detail & Related papers (2024-11-26T09:31:28Z) - GUI Agents with Foundation Models: A Comprehensive Survey [52.991688542729385]
This survey consolidates recent research on (M)LLM-based GUI agents.
We highlight key innovations in data, frameworks, and applications.
We hope this paper will inspire further developments in the field of (M)LLM-based GUI agents.
arXiv Detail & Related papers (2024-11-07T17:28:10Z) - Large Language Model-Based Agents for Software Engineering: A Survey [20.258244647363544]
The recent advance in Large Language Models (LLMs) has shaped a new paradigm of AI agents, i.e., LLM-based agents.
We collect 106 papers and categorize them from two perspectives, i.e., the SE and agent perspectives.
In addition, we discuss open challenges and future directions in this critical domain.
arXiv Detail & Related papers (2024-09-04T15:59:41Z) - VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents [50.12414817737912]
Large Multimodal Models (LMMs) have ushered in a new era in artificial intelligence, merging capabilities in both language and vision to form highly capable Visual Foundation Agents.
Existing benchmarks fail to sufficiently challenge or showcase the full potential of LMMs in complex, real-world environments.
VisualAgentBench (VAB) is a pioneering benchmark specifically designed to train and evaluate LMMs as visual foundation agents.
arXiv Detail & Related papers (2024-08-12T17:44:17Z) - A LLM Benchmark based on the Minecraft Builder Dialog Agent Task [5.555936227537389]
This work proposes adapting the Minecraft builder task into an LLM benchmark suitable for evaluating LLM ability in spatially orientated tasks.
We believe this approach allows us to probe specific strengths and weaknesses of different agents, and test the ability of LLMs in the challenging area of spatial reasoning and vector based math.
arXiv Detail & Related papers (2024-07-17T16:52:23Z) - Deciphering Digital Detectives: Understanding LLM Behaviors and
Capabilities in Multi-Agent Mystery Games [26.07074182316433]
We introduce the first dataset specifically for Jubensha, including character scripts and game rules.
Our work also presents a unique multi-agent interaction framework using LLMs, allowing AI agents to autonomously engage in this game.
To evaluate the gaming performance of these AI agents, we developed novel methods measuring their mastery of case information and reasoning skills.
arXiv Detail & Related papers (2023-12-01T17:33:57Z) - MAgIC: Investigation of Large Language Model Powered Multi-Agent in
Cognition, Adaptability, Rationality and Collaboration [102.41118020705876]
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing.
As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework.
This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z) - Online Advertisements with LLMs: Opportunities and Challenges [51.96140910798771]
This paper explores the potential for leveraging Large Language Models (LLM) in the realm of online advertising systems.
We introduce a general framework for LLM advertisement, consisting of modification, bidding, prediction, and auction modules.
arXiv Detail & Related papers (2023-11-11T02:13:32Z) - TPTU: Large Language Model-based AI Agents for Task Planning and Tool
Usage [28.554981886052953]
Large Language Models (LLMs) have emerged as powerful tools for various real-world applications.
Despite their prowess, intrinsic generative abilities of LLMs may prove insufficient for handling complex tasks.
This paper proposes a structured framework tailored for LLM-based AI Agents.
arXiv Detail & Related papers (2023-08-07T09:22:03Z) - Unlocking the Potential of User Feedback: Leveraging Large Language
Model as User Simulator to Enhance Dialogue System [65.93577256431125]
We propose an alternative approach called User-Guided Response Optimization (UGRO) to combine it with a smaller task-oriented dialogue model.
This approach uses LLM as annotation-free user simulator to assess dialogue responses, combining them with smaller fine-tuned end-to-end TOD models.
Our approach outperforms previous state-of-the-art (SOTA) results.
arXiv Detail & Related papers (2023-06-16T13:04:56Z) - Improving Factuality and Reasoning in Language Models through Multiagent
Debate [95.10641301155232]
We present a complementary approach to improve language responses where multiple language model instances propose and debate their individual responses and reasoning processes over multiple rounds to arrive at a common final answer.
Our findings indicate that this approach significantly enhances mathematical and strategic reasoning across a number of tasks.
Our approach may be directly applied to existing black-box models and uses identical procedure and prompts for all tasks we investigate.
arXiv Detail & Related papers (2023-05-23T17:55:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.