Target Return Optimizer for Multi-Game Decision Transformer
- URL: http://arxiv.org/abs/2503.02311v1
- Date: Tue, 04 Mar 2025 06:13:53 GMT
- Title: Target Return Optimizer for Multi-Game Decision Transformer
- Authors: Kensuke Tatematsu, Akifumi Wachi
- Abstract summary: Multi-Game Target Return Optimizer (MTRO) autonomously determines game-specific target returns within the Multi-Game Decision Transformer framework. MTRO does not require additional training, enabling seamless integration into existing Multi-Game Decision Transformer architectures.
- Score: 5.684409853507594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Achieving autonomous agents with robust generalization capabilities across diverse games and tasks remains one of the ultimate goals in AI research. Recent advancements in transformer-based offline reinforcement learning, exemplified by the Multi-Game Decision Transformer [Lee et al., 2022], have shown remarkable performance across various games and tasks. However, these approaches depend heavily on human expertise, presenting substantial challenges for practical deployment, particularly in scenarios with limited prior game-specific knowledge. In this paper, we propose an algorithm called Multi-Game Target Return Optimizer (MTRO) to autonomously determine game-specific target returns within the Multi-Game Decision Transformer framework using solely offline datasets. MTRO addresses the existing limitations by automating the target return configuration process, leveraging environmental reward information extracted from offline datasets. Notably, MTRO does not require additional training, enabling seamless integration into existing Multi-Game Decision Transformer architectures. Our experimental evaluations on Atari games demonstrate that MTRO enhances the performance of RL policies across a wide array of games, underscoring its potential to advance the field of autonomous agent development.
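To make the general recipe in the abstract concrete, below is a minimal Python sketch: derive a game-specific target return from the empirical return distribution of an offline dataset, then feed it as the initial return-to-go of a return-conditioned policy at inference time, with no additional training. The quantile heuristic, the `policy.act` interface, and the gym-style rollout loop are illustrative assumptions only, not the MTRO algorithm or the Multi-Game Decision Transformer API.

```python
import numpy as np


def estimate_target_return(episode_returns, quantile=0.9):
    """Pick a game-specific target return from offline episode returns.

    Illustrative heuristic only (a high quantile of the observed return
    distribution); the paper derives target returns from reward information
    in the offline dataset, but not necessarily with this rule.
    """
    returns = np.asarray(episode_returns, dtype=np.float64)
    return float(np.quantile(returns, quantile))


def rollout(env, policy, target_return, max_steps=10_000):
    """Roll out a return-conditioned policy seeded with the chosen target return.

    `env` (old-style gym interface) and `policy.act` are placeholders, not the
    Multi-Game Decision Transformer API.
    """
    obs = env.reset()
    return_to_go = target_return
    context = []            # (return-to-go, observation, action) tokens so far
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy.act(context, return_to_go, obs)  # placeholder call
        context.append((return_to_go, obs, action))
        obs, reward, done, _ = env.step(action)
        return_to_go -= reward                           # standard return-to-go update
        total_reward += reward
        if done:
            break
    return total_reward


if __name__ == "__main__":
    # Synthetic offline episode returns for one game, just to exercise the heuristic.
    rng = np.random.default_rng(0)
    offline_returns = rng.gamma(shape=2.0, scale=150.0, size=1000)
    print("target return:", estimate_target_return(offline_returns, quantile=0.9))
```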
Related papers
- Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks.
However, they still struggle with problems requiring multi-step decision-making and environmental feedback.
We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z) - Advances in Transformers for Robotic Applications: A Review [0.9208007322096533]
We go through recent advances and trends in Transformers in Robotics. We examine their integration into robotic perception, planning, and control for autonomous systems. We discuss how different Transformer variants are being adapted in robotics for reliable planning and perception.
arXiv Detail & Related papers (2024-12-13T23:02:15Z) - AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers [28.927809804613215]
We build upon recent advancements in Transformer-based (in-context) meta-RL.
We evaluate a simple yet scalable solution where both an agent's actor and critic objectives are converted to classification terms.
This design unlocks significant progress in online multi-task adaptation and memory problems without explicit task labels.
arXiv Detail & Related papers (2024-11-17T22:25:40Z) - Solving Multi-Goal Robotic Tasks with Decision Transformer [0.0]
We introduce a novel adaptation of the decision transformer architecture for offline multi-goal reinforcement learning in robotics.
Our approach integrates goal-specific information into the decision transformer, allowing it to handle complex tasks in an offline setting.
arXiv Detail & Related papers (2024-10-08T20:35:30Z) - Tiny Multi-Agent DRL for Twins Migration in UAV Metaverses: A Multi-Leader Multi-Follower Stackelberg Game Approach [57.15309977293297]
The synergy between Unmanned Aerial Vehicles (UAVs) and metaverses is giving rise to an emerging paradigm named UAV metaverses.
We propose a tiny machine learning-based Stackelberg game framework based on pruning techniques for efficient UT migration in UAV metaverses.
arXiv Detail & Related papers (2024-01-18T02:14:13Z) - Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show, for the first time, that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z) - Probing Transfer in Deep Reinforcement Learning without Task Engineering [26.637254541454773]
We evaluate the use of original game curricula supported by the Atari 2600 console as a heterogeneous transfer benchmark for deep reinforcement learning agents.
Game designers created curricula using combinations of several discrete modifications to the basic versions of games such as Space Invaders, Breakout and Freeway.
We show that zero-shot transfer from the basic games to their variations is possible, but the variance in performance is also largely explained by interactions between factors.
arXiv Detail & Related papers (2022-10-22T13:40:12Z) - Multi-Game Decision Transformers [49.257185338595434]
We show that a single transformer-based model can play a suite of up to 46 Atari games simultaneously at close-to-human performance.
We compare several approaches in this multi-game setting, such as online and offline RL methods and behavioral cloning.
We find that our Multi-Game Decision Transformer models offer the best scalability and performance.
arXiv Detail & Related papers (2022-05-30T16:55:38Z) - UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers [108.92194081987967]
We make the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing a single architecture to fit different tasks.
Unlike previous RNN-based models, we utilize a transformer-based model to generate a flexible policy.
The proposed model, named Universal Policy Decoupling Transformer (UPDeT), further relaxes the action restriction and makes the multi-agent task's decision process more explainable.
arXiv Detail & Related papers (2021-01-20T07:24:24Z) - Deep Policy Networks for NPC Behaviors that Adapt to Changing Design Parameters in Roguelike Games [137.86426963572214]
Turn-based strategy games such as Roguelikes present unique challenges to Deep Reinforcement Learning (DRL).
We propose two network architectures to better handle complex categorical state spaces and to mitigate the need for retraining forced by design decisions.
arXiv Detail & Related papers (2020-12-07T08:47:25Z)