PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models
- URL: http://arxiv.org/abs/2402.01118v3
- Date: Tue, 2 Apr 2024 15:46:35 GMT
- Title: PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models
- Authors: Sihao Hu, Tiansheng Huang, Ling Liu
- Abstract summary: We introduce PokeLLMon, the first LLM-embodied agent that achieves human-parity performance in tactical battle games.
We show that online battles against humans demonstrate PokeLLMon's human-like battle strategies and just-in-time decision making.
- Score: 7.653580388741887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce PokeLLMon, the first LLM-embodied agent that achieves human-parity performance in tactical battle games, as demonstrated in Pokemon battles. The design of PokeLLMon incorporates three key strategies: (i) in-context reinforcement learning, which instantly consumes text-based feedback derived from battles to iteratively refine the policy; (ii) knowledge-augmented generation, which retrieves external knowledge to counteract hallucination and enables the agent to act in a timely and appropriate manner; and (iii) consistent action generation, which mitigates the panic switching phenomenon that arises when the agent faces a powerful opponent and tries to evade the battle. Online battles against humans demonstrate PokeLLMon's human-like battle strategies and just-in-time decision making: it achieves a 49% win rate in the Ladder competitions and a 56% win rate in invited battles. Our implementation and playable battle logs are available at: https://github.com/git-disl/PokeLLMon.
Related papers
- NitroGen: An Open Foundation Model for Generalist Gaming Agents [101.41866522979548]
NitroGen is a vision-action foundation model for generalist gaming agents.
It is trained on 40,000 hours of gameplay videos across more than 1,000 games.
arXiv Detail & Related papers (2026-01-04T16:24:50Z)
- Large Language Models as Pokémon Battle Agents: Strategic Play and Content Generation [4.782714372521615]
Pokémon battles demand reasoning about type matchups, statistical trade-offs, and risk assessment.
This work examines whether Large Language Models (LLMs) can serve as competent battle agents.
We developed a turn-based Pokémon battle system where LLMs select moves based on battle state rather than pre-programmed logic.
arXiv Detail & Related papers (2025-12-19T07:46:29Z)
- Mirror Mode in Fire Emblem: Beating Players at their own Game with Imitation and Reinforcement Learning [0.5061396377569701]
This study introduces Mirror Mode, a new game mode where the enemy AI mimics the personal strategy of a player to challenge them to keep changing their gameplay.
A simplified version of the Nintendo strategy video game Fire Emblem Heroes has been built in Unity, with a Standard Mode and a Mirror Mode.
arXiv Detail & Related papers (2025-12-10T14:20:02Z)
- Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents [56.25101378553328]
We present Game-TARS, a generalist game agent trained with a unified, scalable action space anchored to human-aligned keyboard-mouse inputs.
Game-TARS is pre-trained on over 500B tokens with diverse trajectories and multimodal data.
Experiments show that Game-TARS achieves about 2 times the success rate over the previous SOTA model on open-world Minecraft tasks.
arXiv Detail & Related papers (2025-10-27T17:43:51Z)
- PokéAI: A Goal-Generating, Battle-Optimizing Multi-agent System for Pokemon Red [4.558478169296784]
We introduce PokéAI, the first text-based, multi-agent large language model (LLM) framework designed to autonomously play and progress through Pokémon Red.
Our system consists of three specialized agents (Planning, Execution, and Critique), each with its own memory bank, role, and skill set.
arXiv Detail & Related papers (2025-06-30T10:09:13Z)
- Human-Level Competitive Pokémon via Scalable Offline Reinforcement Learning with Transformers [24.201490513370523]
Competitive Pokémon Singles (CPS) is a popular strategy game where players learn to exploit their opponent based on imperfect information.
We develop a pipeline to reconstruct the first-person perspective of an agent from logs saved from the third-person perspective of a spectator.
This dataset enables a black-box approach where we train large sequence models to adapt to their opponent based solely on their input trajectory.
arXiv Detail & Related papers (2025-04-06T07:35:15Z)
- PokéChamp: an Expert-level Minimax Language Agent [17.007111119414745]
We introduce PokéChamp, a minimax agent powered by Large Language Models (LLMs) for Pokémon battles.
Built on a general framework for two-player competitive games, PokéChamp leverages the generalist capabilities of LLMs to enhance minimax tree search.
This work compiles the largest real-player Pokémon battle dataset, featuring over 3 million games, including more than 500k high-Elo matches.
arXiv Detail & Related papers (2025-03-06T05:06:27Z)
- Pokemon Red via Reinforcement Learning [3.548348926427221]
Pokémon Red, a classic Game Boy JRPG, presents significant challenges as a testbed for agents.
We introduce a simplistic environment and a Deep Reinforcement Learning training methodology, demonstrating a baseline agent that completes an initial segment of the game up to completing Cerulean City.
Our experiments include various ablations that reveal vulnerabilities in reward shaping, where agents exploit specific reward signals.
arXiv Detail & Related papers (2025-02-27T09:42:23Z)
- Reinforcement Learning for High-Level Strategic Control in Tower Defense Games [47.618236610219554]
In strategy games, one of the most important aspects of game design is maintaining a sense of challenge for players.
We propose an automated approach that combines traditional scripted methods with reinforcement learning.
Results show that combining a learned approach, such as reinforcement learning, with a scripted AI produces a higher-performing and more robust agent than using only AI.
arXiv Detail & Related papers (2024-06-12T08:06:31Z)
- All by Myself: Learning Individualized Competitive Behaviour with a Contrastive Reinforcement Learning optimization [57.615269148301515]
In a competitive game scenario, a set of agents have to learn decisions that maximize their goals and minimize their adversaries' goals at the same time.
We propose a novel model composed of three neural layers that learn a representation of a competitive game, learn how to map the strategy of specific opponents, and how to disrupt them.
Our experiments demonstrate that our model achieves better performance when playing against offline, online, and competitive-specific models, in particular when playing against the same opponent multiple times.
arXiv Detail & Related papers (2023-10-02T08:11:07Z)
- Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning [95.78031053296513]
No-press Diplomacy is a complex strategy game involving both cooperation and competition.
We introduce a planning algorithm we call DiL-piKL that regularizes a reward-maximizing policy toward a human imitation-learned policy.
We show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL.
arXiv Detail & Related papers (2022-10-11T14:47:35Z)
- L2E: Learning to Exploit Your Opponent [66.66334543946672]
We propose a novel Learning to Exploit framework for implicit opponent modeling.
L2E acquires the ability to exploit opponents by a few interactions with different opponents during training.
We propose a novel opponent strategy generation algorithm that produces effective opponents for training automatically.
arXiv Detail & Related papers (2021-02-18T14:27:59Z)
- Supervised Learning Achieves Human-Level Performance in MOBA Games: A Case Study of Honor of Kings [37.534249771219926]
We present JueWu-SL, the first supervised-learning-based artificial intelligence (AI) program that achieves human-level performance in multiplayer online battle arena (MOBA) games.
We integrate the macro-strategy and the micromanagement of MOBA-game-playing into neural networks in a supervised and end-to-end manner.
Tested on Honor of Kings, the most popular MOBA at present, our AI performs competitively at the level of High King players in standard 5v5 games.
arXiv Detail & Related papers (2020-11-25T08:45:55Z)
- TotalBotWar: A New Pseudo Real-time Multi-action Game Challenge and Competition for AI [62.997667081978825]
TotalBotWar is a new pseudo real-time multi-action challenge for game AI.
The game is based on the popular TotalWar games series where players manage an army to defeat the opponent's one.
arXiv Detail & Related papers (2020-09-18T09:13:56Z)
- Battlesnake Challenge: A Multi-agent Reinforcement Learning Playground with Human-in-the-loop [2.9691097886836944]
The Battlesnake Challenge is a framework for multi-agent reinforcement learning with Human-In-the-Loop (HILL).
We develop a simulated game environment for the offline multi-agent model training and identify a set of baselines that can be instilled to improve learning.
Our results show that agents with the proposed HILL consistently outperform agents without HILL.
arXiv Detail & Related papers (2020-07-20T21:59:53Z)
- Learning to Play Sequential Games versus Unknown Opponents [93.8672371143881]
We consider a repeated sequential game between a learner, who plays first, and an opponent who responds to the chosen action.
We propose a novel algorithm for the learner when playing against an adversarial sequence of opponents.
Our results include regret guarantees for the algorithm that depend on the regularity of the opponent's response.
arXiv Detail & Related papers (2020-07-10T09:33:05Z)
- Deep RL Agent for a Real-Time Action Strategy Game [0.3867363075280543]
We introduce a reinforcement learning environment based on Heroic - Magic Duel, a 1 v 1 action strategy game.
Our main contribution is a deep reinforcement learning agent playing the game at a competitive level.
Our best self-play agent obtains a win rate of around 65% against the existing AI and over 50% against a top human player.
arXiv Detail & Related papers (2020-02-15T01:09:56Z)