Polycraft World AI Lab (PAL): An Extensible Platform for Evaluating
Artificial Intelligence Agents
- URL: http://arxiv.org/abs/2301.11891v1
- Date: Fri, 27 Jan 2023 18:08:04 GMT
- Title: Polycraft World AI Lab (PAL): An Extensible Platform for Evaluating
Artificial Intelligence Agents
- Authors: Stephen A. Goss, Robert J. Steininger, Dhruv Narayanan, Daniel V.
Oliven\c{c}a, Yutong Sun, Peng Qiu, Jim Amato, Eberhard O. Voit, Walter E.
Voit, Eric J. Kildebeck
- Abstract summary: We present the Polycraft World AI Lab (PAL), a task simulator with an API based on the Minecraft mod Polycraft World.
PAL enables the creation of tasks in a flexible manner as well as having the capability to manipulate any aspect of the task during an evaluation.
In summary, we report a versatile and AI evaluation platform with a low barrier to entry for AI researchers to utilize.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As artificial intelligence research advances, the platforms used to evaluate
AI agents need to adapt and grow to continue to challenge them. We present the
Polycraft World AI Lab (PAL), a task simulator with an API based on the
Minecraft mod Polycraft World. Our platform is built to allow AI agents with
different architectures to easily interact with the Minecraft world, train and
be evaluated in multiple tasks. PAL enables the creation of tasks in a flexible
manner as well as having the capability to manipulate any aspect of the task
during an evaluation. All actions taken by AI agents and external actors
(non-player-characters, NPCs) in the open-world environment are logged to
streamline evaluation. Here we present two custom tasks on the PAL platform,
one focused on multi-step planning and one focused on navigation, and
evaluations of agents solving them. In summary, we report a versatile and
extensible AI evaluation platform with a low barrier to entry for AI
researchers to utilize.
Related papers
- EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment [38.14321677323052]
Embodied artificial intelligence emphasizes the role of an agent's body in generating human-like behaviors.
In this paper, we construct a benchmark platform for embodied intelligence evaluation in real-world city environments.
arXiv Detail & Related papers (2024-10-12T17:49:26Z) - OpenHands: An Open Platform for AI Software Developers as Generalist Agents [109.8507367518992]
We introduce OpenHands, a platform for the development of AI agents that interact with the world in similar ways to a human developer.
We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, and incorporation of evaluation benchmarks.
arXiv Detail & Related papers (2024-07-23T17:50:43Z) - Scaling Instructable Agents Across Many Simulated Worlds [70.97268311053328]
Our goal is to develop an agent that can accomplish anything a human can do in any simulated 3D environment.
Our approach focuses on language-driven generality while imposing minimal assumptions.
Our agents interact with environments in real-time using a generic, human-like interface.
arXiv Detail & Related papers (2024-03-13T17:50:32Z) - Toward Human-AI Alignment in Large-Scale Multi-Player Games [24.784173202415687]
We analyze extensive human gameplay data from Xbox's Bleeding Edge (100K+ games)
We find that while human players exhibit variability in fight-flight and explore-exploit behavior, AI players tend towards uniformity.
These stark differences underscore the need for interpretable evaluation, design, and integration of AI in human-aligned applications.
arXiv Detail & Related papers (2024-02-05T22:55:33Z) - Agent AI: Surveying the Horizons of Multimodal Interaction [83.18367129924997]
"Agent AI" is a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data.
We envision a future where people can easily create any virtual reality or simulated scene and interact with agents embodied within the virtual environment.
arXiv Detail & Related papers (2024-01-07T19:11:18Z) - Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning [50.47568731994238]
Key method for creating Artificial Intelligence (AI) agents is Reinforcement Learning (RL)
This paper presents a general framework model for integrating and learning structured reasoning into AI agents' policies.
arXiv Detail & Related papers (2023-12-22T17:57:57Z) - The Rise and Potential of Large Language Model Based Agents: A Survey [91.71061158000953]
Large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI)
We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents.
We explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation.
arXiv Detail & Related papers (2023-09-14T17:12:03Z) - The MineRL BASALT Competition on Learning from Human Feedback [58.17897225617566]
The MineRL BASALT competition aims to spur forward research on this important class of techniques.
We design a suite of four tasks in Minecraft for which we expect it will be hard to write down hardcoded reward functions.
We provide a dataset of human demonstrations on each of the four tasks, as well as an imitation learning baseline.
arXiv Detail & Related papers (2021-07-05T12:18:17Z) - Explainability via Responsibility [0.9645196221785693]
We present an approach to explainable artificial intelligence in which certain training instances are offered to human users.
We evaluate this approach by approximating its ability to provide human users with the explanations of AI agent's actions.
arXiv Detail & Related papers (2020-10-04T20:41:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.